Sunday, February 9. 2025
how to build kernel .deb from modified git tree
git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
cd linux/
git checkout v6.1.113
cp /boot/config-$(uname -r) .config
make oldconfig
scripts/config --set-str SYSTEM_REVISION "$(git rev-parse --short HEAD)"
scripts/config --set-str SYSTEM_SERIAL "$(date +%s)"
git revert COMMITID
export CONCURRENCY_LEVEL=$(nproc)
fakeroot make -j$(nproc) deb-pkg LOCALVERSION=-revertCOMMITID
git reset HEAD^ --hard
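Once the build completes, the generated packages land one directory up. A minimal sketch of installing the result (the exact filenames depend on the kernel version and the LOCALVERSION string, so adjust to what the build actually produced):
# filenames are a sketch; match them against what deb-pkg actually wrote to ..
sudo dpkg -i ../linux-image-6.1.113-revertCOMMITID*.deb ../linux-headers-6.1.113-revertCOMMITID*.deb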
Sunday, January 12. 2025
Talos Install in Preparation for Kubernetes
On my Proxmox machine, I need a more dynamic way of handling containers, rather than being limited to the LXC flavour offered by Proxmox. There was a question on reddit, to which I supplied an answer, about why to use full virtualization for Kubernetes rather than running it in a container. In summary:
- Proxmox is a virtualization engine which enables LXC containers and QEMU/KVM guest virtual machines
- LXC containers are compartmentalized and share direct access to the kernel
- containers are used so as not to pollute your base Proxmox installation with more packages and runtimes
- typically, you don't want to nest containers inside containers
- virtual machines run their own kernel/operating system, and are more secure (failure/security) and independent
- since k8s manages docker containers (not LXC containers), and you don't want to run containers of one fashion inside containers of another (docker inside LXC), you run a virtual machine with k8s to keep the k8s packages and runtime separate from the core Proxmox runtime environment
- hence the container inside virtual machine on hypervisor platform
- for security and compartmentalization
I built a basic LXC management container to take a look at talosctl, then used these instructions to perform a few tests:
apt update
apt upgrade
apt install --no-install-recommends curl
# based upon https://www.talos.dev/v1.9/talos-guides/install/talosctl/
curl -sL https://talos.dev/install | sh
# based upon https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256) kubectl" | sha256sum --check
install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client
I ultimately removed this container, as I started reading the Getting Started Guide. But the above did generate the information that the latest talosctl version is v1.9.1, which is useful for generating the ISO image required.
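For the record, the ISO can be pulled straight from the Talos releases; a sketch, assuming the usual release asset naming (verify against the releases page for v1.9.1):
# assumed URL pattern for the bare-metal amd64 ISO; check the siderolabs/talos releases page
curl -LO https://github.com/siderolabs/talos/releases/download/v1.9.1/metal-amd64.iso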
.... more to come
Print / Scan - Rant / Rave - HP / Canon
A couple months ago, the printer ink in my HP OfficeJet Pro 8620 ran low on most of the ink cartridges. Replacing all those cartridges would cost a significant fraction of the replacement cost of a new printer/scanner. I went looking for some cartridges which might cost less. I did find some on Amazon, installed them, and was successful in printing.
A month or two later, I needed to print again (Yes, I don't print all the time). At this point, the printer display showed:
The printhead appears to be missing, not detected, or incorrectly installed. Error Code 0xc19a0023
HP has been pretty loud about not wanting non-branded cartridges installed in their printers - it obviously undercuts their recurring revenue model. It may or may not be a conspiracy that the printer enters this mode on purpose, or it could be that the printhead suddenly failed. No one but HP knows, given this kind of opaque error message and the lack of any suitable diagnostics.
After I described this problem to the third-party cartridge supplier, they are sending some new cartridges which they say will make the problem go away. hmm... We'll see.
So, now that I have an over-designed scanner/paper-weight, it is time to go look for alternative solutions.
Given this whole lock-in with cartridges, I'm not in the mood for anyone's cartridges. They seem costly for no reason other than some additional profit. Fortunately, there are various brands and models on the market which utilize reservoirs - probably due to consumer demand, for just the reasons I no longer like/appreciate printer cartridges.
I went shopping for some non-HP brand. The Best All-in-One Printers for 2025 seemed to be a good starting point. I ended up looking at the Canon models. There are GX20xx, GX40xx, and GX60xx all-in-one models. Even in the higher-end models, Canon can't seem to make a decent all-in-one with 'everything' in it. After comparing a bunch of them, I decided on the GX2020, as it seemed to be a newer update, had an ADF for scanning, and a duplexer for printing. Flatbed scanning can only go to about 8.5" x 11", unlike the HP, which could do larger.
The next challenge is working with Debian Linux, and having the printer/scanner on a wifi subnet different from my workstation. Typical printer locator utilities utterly fail in this scenario.
For installing the printer, they do have a package which can be installed with the Debian package manager: cnijfilter2-6.70-1-deb.tar.gz. This can be installed with (remember the service restart):
sudo dpkg -i cnijfilter2_6.80-1_amd64.deb
sudo service cups restart
The CUPS printer utilities could then be used to specify the printer by IP address and set it up with the PASSTHRU printer queue mode.
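For reference, a minimal sketch of that kind of setup with the stock CUPS tools; the queue name and address are placeholders, and whether socket:// (JetDirect, port 9100), lpd://, or ipp:// fits best depends on the printer:
# placeholders: queue name and IP address; omitting a PPD creates a raw
# (pass-through) queue, which matches the PASSTHRU mode described above
sudo lpadmin -p GX2020 -E -v socket://192.168.2.50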
The scanner mode wasn't as 'easy' to configure. After installing the Canon package for the scanner, I had to find the Debian Wiki page for SaneOverNetwork, which says to modify the /etc/sane.d/pixma.conf file and add a line like the following (there are examples in the file):
bjnp://ip-address
However, the following error message occurred:
# scanimage -L
[17:44:27.140613] [bjnp] udp_command: ERROR - no data received (timeout = 10000)
[17:44:27.140719] [bjnp] bjnp_init_device_structure: Cannot read mac address, skipping this scanner
When looking for more assistance, I found references to ThierryHFR / scangearmp2. This is someone who has reverse-engineered Canon's protocols, wrapped their binaries, and built something Canon should have done in the first place.
A word of warning: don't try to build your own from the repository. The instructions aren't so clear, and may be incorrect. Instead, go directly to the linked repositories, download, and install what you need. Do this for both the printer and the scanner drivers.
I took a look at the files in the package:
dpkg -c scangearmp2_4.80-1_amd64.deb
There is a file called /etc/sane.d/canon_pixma.conf which provides the ability to specify the IP address of the printer (which I assign statically in DHCP, based upon the printer's MAC address).
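A hypothetical entry, assuming the file takes one address per line the way pixma.conf does (the address is a placeholder for the printer's static DHCP lease):
# /etc/sane.d/canon_pixma.conf - placeholder address for the GX2020
192.168.2.50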
My scanner programs, like gscan2pdf, worked right away.
Are there printer / scanners with better Linux support, and with the 'ultimate' combination of ADF/Flatbed scanning with Colour printing with duplex capability?
Saturday, December 14. 2024
Lua
CMake
- CMake: the Good, the Bad, the Weird - ... strongly recommend for any developers who are learning or struggling with CMake to purchase Professional CMake -- I have found it very helpful in explaining things where most other resources haven't, and it is consistently updated with new major versions of CMake.
Saturday, November 2. 2024
arXiv
One of the main limitations for the development and deployment of many Green Radio Frequency Identification (RFID) and Internet of Things (IoT) systems is the access to energy sources. In this aspect, batteries are the main option to be used in energy constrained scenarios, but their use is limited to certain cases, either because of the constraints imposed by a reduced form factor, their limited lifespan, or the characteristics of the environment itself (e.g. operating temperature, risk of burning, need for fast response, sudden voltage variations). In this regard, supercapacitors present an interesting alternative for the previously mentioned type of environment, although, due to their short-term capacity, they must be combined with an alternative energy supply mechanism. Energy harvesting mechanisms, in conjunction with ultra-low-power electronics, supercapacitors and various methods to improve the efficiency of communications, have enabled the emergence of battery-less passive electronic devices such as sensors, actuators or transmitters. This paper presents a novel analysis of the performance of an energy harvesting system based on vibrations for Green RFID and IoT applications in the field of maritime transport. The results show that the proposed system allows for charging half of a 1.2 F supercapacitor in about 72 minutes, providing a stable current of around 210 µA and a power output of 0.38 mW.
Friday, August 30. 2024
Machine Learning
Artificial Neural Network and Deep Learning: Fundamentals and Theory
"Artificial Neural Network and Deep Learning: Fundamentals and Theory" offers a comprehensive exploration of the foundational principles and advanced methodologies in neural networks and deep learning. This book begins with essential concepts in descriptive statistics and probability theory, laying a solid groundwork for understanding data and probability distributions. As the reader progresses, they are introduced to matrix calculus and gradient optimization, crucial for training and fine-tuning neural networks. The book delves into multilayer feed-forward neural networks, explaining their architecture, training processes, and the backpropagation algorithm. Key challenges in neural network optimization, such as activation function saturation, vanishing and exploding gradients, and weight initialization, are thoroughly discussed. The text covers various learning rate schedules and adaptive algorithms, providing strategies to optimize the training process. Techniques for generalization and hyperparameter tuning, including Bayesian optimization and Gaussian processes, are also presented to enhance model performance and prevent overfitting. Advanced activation functions are explored in detail, categorized into sigmoid-based, ReLU-based, ELU-based, miscellaneous, non-standard, and combined types. Each activation function is examined for its properties and applications, offering readers a deep understanding of their impact on neural network behavior. The final chapter introduces complex-valued neural networks, discussing complex numbers, functions, and visualizations, as well as complex calculus and backpropagation algorithms. This book equips readers with the knowledge and skills necessary to design, and optimize advanced neural network models, contributing to the ongoing advancements in artificial intelligence.
Sunday, August 18. 2024
Debian Linux Grub initrd recovery
During the early days of Debian Linux, one could get away with a 100 MB boot partition. With the explosion of included firmware files in the initramfs file, the boot directory now needs to be 500 MB or even 1 GB in size. I have a couple of older machines I have not yet rebuilt, which have limited space. I resort to copying various initrd.img files in and out as I upgrade kernels or boot into older versions of the kernel.
Sometimes I have forgotten to properly copy an image back into /boot and run update-grub. When that happens, I have to boot into the grub command line. Some commands that I run are as follows:
Set a variable so when requesting help, you can page through entries:
grub> set pager=1
List the various mount points:
grub> ls
List the files in a particular mount point:
grub> ls (hd0,msdos2)/root
Startup commands, depending upon where linux and initrd files are found:
grub> set root=(hd0,msdos2)
grub> linux (hd0,msdos1)/vmlinuz-6.1.0-15-amd64 root=LABEL=main
grub> initrd (hd0,msdos2)/root/initramfs/initrd.img-6.1.0-15-amd64
grub> boot
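Once the system is up, the fix is the step I forgot in the first place: copy the image back and regenerate the grub config. A sketch, using the paths from the session above:
# restore the initrd the machine just booted from, then rebuild grub.cfg
cp /root/initramfs/initrd.img-6.1.0-15-amd64 /boot/
update-grub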
Tuesday, August 6. 2024
Classical Machine Learning: Seventy Years of Algorithmic Learning Evolution
Machine learning (ML) has transformed numerous fields, but understanding its foundational research is crucial for its continued progress. This paper presents an overview of the significant classical ML algorithms and examines the state-of-the-art publications spanning twelve decades through an extensive bibliometric analysis study. We analyzed a dataset of highly cited papers from prominent ML conferences and journals, employing citation and keyword analyses to uncover critical insights. The study further identifies the most influential papers and authors, reveals the evolving collaborative networks within the ML community, and pinpoints prevailing research themes and emerging focus areas. Additionally, we examine the geographic distribution of highly cited publications, highlighting the leading countries in ML research. This study provides a comprehensive overview of the evolution of traditional learning algorithms and their impacts. It discusses challenges and opportunities for future development, focusing on the Global South. The findings from this paper offer valuable insights for both ML experts and the broader research community, enhancing understanding of the field's trajectory and its significant influence on recent advances in learning algorithms.
Nowadays, new network architectures must manage more data than traditional network topologies, which entails increasingly robust and scalable network structures. As traditional data networks grow, adapt, and change in the face of large volumes of information, it becomes necessary to incorporate the virtualization of network functions into the context of information-centric networks, in such a way that the functions on the network are balanced between user and provider in terms of cost and profit. NFV (Network Functions Virtualization) refers to network structures designed on top of IT virtualization technologies, which allow the functions found in network nodes, connected through routing tables, to be virtualized, offering communication services for various types of customers. Information-centric networks (ICN), unlike traditional data networks, which exchange information between hosts using data packets and TCP/IP communication protocols, use the content of the data itself: data travelling through the network is stored temporarily in a routing table located in the CR (Content Router) so it can be reused later, which reduces operating and capital costs. The purpose of this work is to analyze how the virtualization of network functions is integrated into the field of information-centric networks. The advantages and disadvantages of both architectures are also considered and presented as a critical analysis of the current difficulties and future trends of both network topologies.
OpenLogParser: Unsupervised Parsing with Open-Source Large Language Models
Log parsing is a critical step that transforms unstructured log data into structured formats, facilitating subsequent log-based analysis. Traditional syntax-based log parsers are efficient and effective, but they often experience decreased accuracy when processing logs that deviate from the predefined rules. Recently, large language model (LLM) based log parsers have shown superior parsing accuracy. However, existing LLM-based parsers face three main challenges: 1) time-consuming and labor-intensive manual labeling for fine-tuning or in-context learning, 2) increased parsing costs due to the vast volume of log data and limited context size of LLMs, and 3) privacy risks from using commercial models like ChatGPT with sensitive log information. To overcome these limitations, this paper introduces OpenLogParser, an unsupervised log parsing approach that leverages open-source LLMs (i.e., Llama3-8B) to enhance privacy and reduce operational costs while achieving state-of-the-art parsing accuracy. OpenLogParser first groups logs with similar static text but varying dynamic variables using a fixed-depth grouping tree. It then parses logs within these groups using three components: i) similarity scoring-based retrieval augmented generation, which selects diverse logs within each group based on Jaccard similarity, helping the LLM distinguish between static text and dynamic variables; ii) self-reflection, which iteratively queries LLMs to refine log templates to improve parsing accuracy; and iii) log template memory, which stores parsed templates to reduce LLM queries for improved parsing efficiency. Our evaluation on LogHub-2.0 shows that OpenLogParser achieves 25% higher parsing accuracy and processes logs 2.7 times faster compared to state-of-the-art LLM-based parsers. In short, OpenLogParser addresses privacy and cost concerns of using commercial LLMs while achieving state-of-the-art parsing efficiency and accuracy.
Amman City, Jordan: Toward a Sustainable City from the Ground Up
The idea of smart cities (SCs) has gained substantial attention in recent years. The SC paradigm aims to improve citizens' quality of life and protect the city's environment. As we enter the age of next-generation SCs, it is important to explore all relevant aspects of the SC paradigm. In recent years, the advancement of Information and Communication Technologies (ICT) has produced a trend of supporting daily objects with smartness, aiming to make human life easier and more comfortable. The paradigm of SCs appears as a response to the purpose of building the city of the future with advanced features. SCs still face many challenges in their implementation, but increasingly more studies regarding SCs are being carried out. Nowadays, different cities are employing SC features to enhance services or the residents' quality of life. This work provides readers with useful and important information about Amman Smart City.
Sunday, July 21. 2024
Copy ISO Image to USB
# dd if=Downloads/iso/debian-testing-amd64-netinst.iso of=/dev/sda bs=1M status=progress conv=fdatasync
- 'fdatasync' is equivalent to running 'sync' as a second command
- watch kern.log to determine on which drive the USB is mounted when inserted
- ensure that the USB is not auto-mounted by any other application or service
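A sketch of one way to do that watching, started before plugging the stick in (the sd letter varies, so confirm it before overwriting anything):
# kernel messages name the device (e.g. sdb) as the stick enumerates
sudo tail -f /var/log/kern.log
# or, equivalently, follow the kernel ring buffer
sudo dmesg -w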
Sunday, June 30. 2024
Someone Notes
For example, (to name just a few items) a stock option pricing model is useless without:
- holiday calendars
- ex dividend dates
- interest rate curves
- real-time stock prices
- corporate actions database
Thursday, May 30. 2024
Agriculture, Citizens' Assembly, Truth Seeking
Leveraging Time-Series Foundation Models in Smart Agriculture for Soil Moisture Forecasting
The recent surge in foundation models for natural language processing and computer vision has fueled innovation across various domains. Inspired by this progress, we explore the potential of foundation models for time-series forecasting in smart agriculture, a field often plagued by limited data availability. Specifically, this work presents a novel application of TimeGPT, a state-of-the-art (SOTA) time-series foundation model, to predict soil water potential (ψ_soil), a key indicator of field water status that is typically used for irrigation advice. Traditionally, this task relies on a wide array of input variables. We explore TimeGPT's ability to forecast ψ_soil in: (i) a zero-shot setting, (ii) a fine-tuned setting relying solely on historic ψ_soil measurements, and (iii) a fine-tuned setting where we also add exogenous variables to the model. We compare TimeGPT's performance to established SOTA baseline models for forecasting ψ_soil. Our results demonstrate that TimeGPT achieves competitive forecasting accuracy using only historical ψ_soil data, highlighting its remarkable potential for agricultural applications. This research paves the way for foundation time-series models for sustainable development in agriculture by enabling forecasting tasks that were traditionally reliant on extensive data collection and domain expertise.
A citizens' assembly is a group of people who are randomly selected to represent a larger population in a deliberation. While this approach has successfully strengthened democracy, it has certain limitations that suggest the need for assemblies to form and associate more organically. In response, we propose federated assemblies, where assemblies are interconnected, and each parent assembly is selected from members of its child assemblies. The main technical challenge is to develop random selection algorithms that meet new representation constraints inherent in this hierarchical structure. We design and analyze several algorithms that provide different representation guarantees under various assumptions on the structure of the underlying graph.
ChatGPT as the Marketplace of Ideas: Should Truth-Seeking Be the Goal of AI Content Governance?
As one of the most enduring metaphors within legal discourse, the marketplace of ideas has wielded considerable influence over the jurisprudential landscape for decades. A century after the inception of this theory, ChatGPT emerged as a revolutionary technological advancement in the twenty-first century. This research finds that ChatGPT effectively manifests the marketplace metaphor. It not only instantiates the promises envisaged by generations of legal scholars but also lays bare the perils discerned through sustained academic critique. Specifically, the workings of ChatGPT and the marketplace of ideas theory exhibit at least four common features: arena, means, objectives, and flaws. These shared attributes are sufficient to render ChatGPT historically the most qualified engine for actualizing the marketplace of ideas theory.
The comparison of the marketplace theory and ChatGPT merely marks a starting point. A more meaningful undertaking entails reevaluating and reframing both internal and external AI policies by referring to the accumulated experience, insights, and suggestions researchers have raised to fix the marketplace theory. Here, a pivotal issue is: should truth-seeking be set as the goal of AI content governance? Given the unattainability of the absolute truth-seeking goal, I argue against adopting zero-risk policies. Instead, a more judicious approach would be to embrace a knowledge-based alternative wherein large language models (LLMs) are trained to generate competing and divergent viewpoints based on sufficient justifications. This research also argues that so-called AI content risks are not created by AI companies but are inherent in the entire information ecosystem. Thus, the burden of managing these risks should be distributed among different social actors, rather than being solely shouldered by chatbot companies.
Why Algorithms Remain Unjust: Power Structures Surrounding Algorithmic Activity
Algorithms play an increasingly significant role in our social lives. Unfortunately, they often perpetuate social injustices while doing so. The popular means of addressing these algorithmic injustices has been through algorithmic reformism: fine-tuning the algorithm itself to be more fair, accountable, and transparent. While commendable, the emerging discipline of critical algorithm studies shows that reformist approaches have failed to curtail algorithmic injustice because they ignore the power structure surrounding algorithms. Heeding calls from critical algorithm studies to analyze this power structure, I employ a framework developed by Erik Olin Wright to examine the configuration of power surrounding Algorithmic Activity: the ways in which algorithms are researched, developed, trained, and deployed within society. I argue that the reason Algorithmic Activity is unequal, undemocratic, and unsustainable is that the power structure shaping it is one of economic empowerment rather than social empowerment. For Algorithmic Activity to be socially just, we need to transform this power configuration to empower the people at the other end of an algorithm. To this end, I explore Wright's symbiotic, interstitial, and ruptural transformations in the context of Algorithmic Activity, as well as how they may be applied in a hypothetical research project that uses algorithms to address a social issue. I conclude with my vision for socially just Algorithmic Activity, asking that future work strives to integrate the proposed transformations and develop new mechanisms for social empowerment.
Sunday, May 26. 2024
Useful Debian Packaging Query
$ ucfq /etc/ssh/sshd_config
Configuration file      Package          Exists  Changed
/etc/ssh/sshd_config    openssh-server   Yes     No
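For files that ucf does not manage, a plain dpkg query answers the same "which package owns this file" question; shown here on the same path as a sketch:
# search the dpkg database for the package shipping this path
dpkg -S /etc/ssh/sshd_config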
Tuesday, April 2. 2024
Papers
The State of Lithium-Ion Battery Health Prognostics in the CPS Era
Lithium-ion batteries (Li-ion) have revolutionized energy storage technology, becoming integral to our daily lives by powering a diverse range of devices and applications. Their high energy density, fast power response, recyclability, and mobility advantages have made them the preferred choice for numerous sectors. This paper explores the seamless integration of Prognostics and Health Management (PHM) within batteries, presenting a multidisciplinary approach that enhances the reliability, safety, and performance of these powerhouses. Remaining useful life (RUL), a critical concept in prognostics, is examined in depth, emphasizing its role in predicting component failure before it occurs. The paper reviews various RUL prediction methods, from traditional models to cutting-edge data-driven techniques. Furthermore, it highlights the paradigm shift toward deep learning architectures within the field of Li-ion battery health prognostics, elucidating the pivotal role of deep learning in addressing battery system complexities. Practical applications of PHM across industries are also explored, offering readers insights into real-world implementations. This paper serves as a comprehensive guide, catering to both researchers and practitioners in the field of Li-ion battery PHM.
The New Agronomists: Language Models are Experts in Crop Management, github
Crop management plays a crucial role in determining crop yield, economic profitability, and environmental sustainability. Despite the availability of management guidelines, optimizing these practices remains a complex and multifaceted challenge. In response, previous studies have explored using reinforcement learning with crop simulators, typically employing simple neural-network-based reinforcement learning (RL) agents. Building on this foundation, this paper introduces a more advanced intelligent crop management system. This system uniquely combines RL, a language model (LM), and crop simulations facilitated by the Decision Support System for Agrotechnology Transfer (DSSAT). We utilize deep RL, specifically a deep Q-network, to train management policies that process numerous state variables from the simulator as observations. A novel aspect of our approach is the conversion of these state variables into more informative language, facilitating the language model's capacity to understand states and explore optimal management practices. The empirical results reveal that the LM exhibits superior learning capabilities. Through simulation experiments with maize crops in Florida (US) and Zaragoza (Spain), the LM not only achieves state-of-the-art performance under various evaluation metrics but also demonstrates a remarkable improvement of over 49% in economic profit, coupled with reduced environmental impact when compared to baseline methods.
DHNet: A Distributed Network Architecture for Smart Home
With the increasing popularity of smart homes, more and more devices need to connect to home networks. Traditional home networks mainly rely on centralized networking, where an excessive number of devices in the centralized topology can increase the pressure on the central router, potentially leading to decreased network performance metrics such as communication latency. To address the latency performance issues brought about by centralized networks, this paper proposes a new network system called DHNet, and designs an algorithm for clustering networking and communication based on vector routing. Communication within clusters in a simulated virtual environment achieves a latency of approximately 0.7 milliseconds. Furthermore, by directly using the first non-"lo" network card address of a device as the protocol's network layer address, the protocol avoids the several tens of milliseconds of access latency caused by DHCP. The integration of service discovery functionality into the network layer protocol is achieved through a combination of "server-initiated service push" and "client request + server reply" methods. Compared to traditional application-layer DNS passive service discovery, the average latency is reduced by over 50%. The PVH protocol is implemented in the user space using the Go programming language, with implementation details drawn from Google's gVisor project. The code has been ported from x86_64 Linux computers to devices such as OpenWrt routers and Android smartphones. The PVH protocol can communicate through "tunnels" to provide IP compatibility, allowing existing applications based on TCP/IP to communicate using the PVH protocol without requiring modifications to their code.
Saturday, March 23. 2024
Linux Good Old Stuff
- Linux Virtual Server - a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system (last date on the web page: 2012)
- Linux-VServer - provides virtualization for GNU/Linux systems. This is accomplished by kernel level isolation. It allows multiple virtual units to run at once. Those units are sufficiently isolated to guarantee the required security, but utilize available resources efficiently, as they run on the same kernel. (a precursor to LXC) (last mod 2018)