- Linux Virtual Server - a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system (last date on the web page: 2012)
- Linux-VServer - provides virtualization for GNU/Linux systems. This is accomplished by kernel level isolation. It allows running multiple virtual units at once. Those units are sufficiently isolated to guarantee the required security, but utilize available resources efficiently, as they run on the same kernel. (a precursor to LXC) (last mod 2018)
Saturday, March 23. 2024
Linux Good Old Stuff
Sunday, February 4. 2024
C++ Header File Statistics
From #include <rules>, via Hacker News, two compile-time options to consider:
- use the preprocessor output (cl /E, gcc -E)
- use the include output (cl /showIncludes, gcc -M), gather the codebase statistics (average size after preprocessing, most included header files, header files with largest payload, etc.)
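A minimal sketch of the statistics step, assuming the output of `gcc -M` has been captured as text for each translation unit. The helper names `parse_deps` and `most_included` and the sample strings are hypothetical, and the parser ignores escaped spaces and Windows drive-letter colons in paths:

```python
from collections import Counter


def parse_deps(text):
    """Parse make-style rules (as emitted by `gcc -M`) into {target: [deps]}."""
    rules = {}
    # A trailing backslash continues the rule on the next line; join them.
    joined = text.replace("\\\n", " ")
    for line in joined.splitlines():
        if ":" not in line:
            continue
        target, _, deps = line.partition(":")
        rules[target.strip()] = deps.split()
    return rules


def most_included(dep_texts, top=3):
    """Count how many translation units pull in each header."""
    counts = Counter()
    for text in dep_texts:
        for deps in parse_deps(text).values():
            # The first dependency is the source file itself; the rest are headers.
            counts.update(set(deps[1:]))
    return counts.most_common(top)


# Hypothetical captured output for two translation units.
SAMPLE_A = "a.o: a.cpp a.h common.h \\\n /usr/include/stdio.h\n"
SAMPLE_B = "b.o: b.cpp b.h common.h\n"

if __name__ == "__main__":
    print(most_included([SAMPLE_A, SAMPLE_B], top=1))
```

The same counter could be extended with file sizes to find the headers with the largest payload.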
I've been doing this backwards, though I'm not sure why:
The header file named after the source file should be included first, so that the compiler catches a header that is not self-contained (for example, one that is missing its own includes)
Wednesday, January 24. 2024
Sleep
SPAND: Sleep Prediction Architecture using Network Dynamics
Sleep behavior significantly impacts health and acts as an indicator of physical and mental well-being. Monitoring and predicting sleep behavior with ubiquitous sensors may therefore assist in both sleep management and tracking of related health conditions. While sleep behavior depends on, and is reflected in the physiology of a person, it is also impacted by external factors such as digital media usage, social network contagion, and the surrounding weather. In this work, we propose SPAND (Sleep Prediction Architecture using Network Dynamics), a system that exploits social contagion in sleep behavior through graph networks and integrates it with physiological and phone data extracted from ubiquitous mobile and wearable devices for predicting next-day sleep labels about sleep duration. Our architecture overcomes the limitations of large-scale graphs containing connections irrelevant to sleep behavior by devising an attention mechanism. The extensive experimental evaluation highlights the improvement provided by incorporating social networks in the model. Additionally, we conduct robustness analysis to demonstrate the system's performance in real-life conditions. The outcomes affirm the stability of SPAND against perturbations in input data. Further analyses emphasize the significance of network topology in prediction performance revealing that users with higher eigenvalue centrality are more vulnerable to data perturbations.
Sunday, January 21. 2024
Boehm Garbage Collection, Cords String Handling
Hacker News had an article about the Boehm-Demers-Weiser conservative C/C++ Garbage Collector, which leads to A garbage collector for C and C++.
It can be used in garbage collection mode or leak detection mode.
The garbage collector distribution includes a C string (cord) package that provides for fast concatenation and substring operations on long strings. A simple curses- and win32-based editor that represents the entire file as a cord is included as a sample application. From Wikipedia:
Boehm GC is also distributed with a C string handling library called cords. This is similar to ropes in C++ (trees of constant small arrays), but instead of using reference counting for proper deallocation, it relies on garbage collection to free objects. Cords are good at handling very large texts, modifications to them in the middle, slicing, concatenating, and keeping history of changes (undo/redo functionality).
Code can be found at GitHub - The Boehm-Demers-Weiser conservative C/C++ Garbage Collector (bdwgc, also known as bdw-gc, boehm-gc, libgc)
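To make the cord idea from the Wikipedia excerpt above concrete, here is a small rope sketch in Python. It is not the bdwgc cord API: the names `Leaf`, `Concat`, `substring`, `insert` and `to_str` are hypothetical, there is no rebalancing, and indices are assumed to satisfy 0 <= start <= end <= length. Nodes are never mutated, so older versions of the text stay valid (the basis for undo/redo) and unreferenced nodes are left for the runtime to reclaim.

```python
class Leaf:
    """A short, immutable piece of text."""
    def __init__(self, text):
        self.text = text
        self.length = len(text)


class Concat:
    """Concatenation node: O(1) to build, no characters are copied."""
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.length = left.length + right.length


def char_at(node, i):
    """Walk down the tree to the leaf holding index i."""
    while isinstance(node, Concat):
        if i < node.left.length:
            node = node.left
        else:
            i -= node.left.length
            node = node.right
    return node.text[i]


def substring(node, start, end):
    """Return a rope for node[start:end], sharing untouched subtrees."""
    if start <= 0 and end >= node.length:
        return node
    if isinstance(node, Leaf):
        return Leaf(node.text[start:end])
    mid = node.left.length
    if end <= mid:
        return substring(node.left, start, end)
    if start >= mid:
        return substring(node.right, start - mid, end - mid)
    return Concat(substring(node.left, start, mid),
                  substring(node.right, 0, end - mid))


def insert(node, pos, text):
    """Insert text at pos by building a new rope; the old rope is unchanged."""
    head = substring(node, 0, pos)
    tail = substring(node, pos, node.length)
    return Concat(Concat(head, Leaf(text)), tail)


def to_str(node):
    """Flatten the rope into an ordinary string (left-to-right traversal)."""
    parts, stack = [], [node]
    while stack:
        n = stack.pop()
        if isinstance(n, Leaf):
            parts.append(n.text)
        else:
            stack.append(n.right)
            stack.append(n.left)
    return "".join(parts)


if __name__ == "__main__":
    rope = Concat(Leaf("hello "), Leaf("world"))
    print(to_str(insert(rope, 6, "big ")))
```

Keeping both the old and the new root after an `insert` is all that is needed to retain a history of changes.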
Sunday, January 14. 2024
Trading Evaluation
StockFormer: A Swing Trading Strategy Based on STL Decomposition and Self-Attention Networks
Amidst ongoing market recalibration and increasing investor optimism, the U.S. stock market is experiencing a resurgence, prompting the need for sophisticated tools to protect and grow portfolios. Addressing this, we introduce "Stockformer," a cutting-edge deep learning framework optimized for swing trading, featuring the TopKDropout method for enhanced stock selection. By integrating STL decomposition and self-attention networks, Stockformer utilizes the S&P 500's complex data to refine stock return predictions. Our methodology entailed segmenting data for training and validation (January 2021 to January 2023) and testing (February to June 2023). During testing, Stockformer's predictions outperformed ten industry models, achieving superior precision in key predictive accuracy indicators (MAE, RMSE, MAPE), with a remarkable accuracy rate of 62.39% in detecting market trends. In our backtests, Stockformer's swing trading strategy yielded a cumulative return of 13.19% and an annualized return of 30.80%, significantly surpassing current state-of-the-art models. Stockformer has emerged as a beacon of innovation in these volatile times, offering investors a potent tool for market forecasting. To advance the field and foster community collaboration, we have open-sourced Stockformer, available at StockFormer
CRISIS ALERT: Forecasting Stock Market Crisis Events Using Machine Learning Methods
Historically, the economic recession often came abruptly and disastrously. For instance, during the 2008 financial crisis, the SP 500 fell 46 percent from October 2007 to March 2009. If we could detect the signals of the crisis earlier, we could have taken preventive measures. Therefore, driven by such motivation, we use advanced machine learning techniques, including Random Forest and Extreme Gradient Boosting, to predict any potential market crashes mainly in the US market. Also, we would like to compare the performance of these methods and examine which model is better for forecasting US stock market crashes. We apply our models on the daily financial market data, which tend to be more responsive with higher reporting frequencies. We consider 75 explanatory variables, including general US stock market indexes, SP 500 sector indexes, as well as market indicators that can be used for the purpose of crisis prediction. Finally, we conclude, with selected classification metrics, that the Extreme Gradient Boosting method performs the best in predicting US stock market crisis events.
Alberta Weather & Electric
- Alberta Electric System Operator
- Alberta Current Supply Demand Report
- AESO Real Time Dashboard
- AESO events - with intra-day system marginal price
- AccuWeather
- AESO Transmission Capability Map
Machine Learning
TSPP: A Unified Benchmarking Tool for Time-series Forecasting
While machine learning has witnessed significant advancements, the emphasis has largely been on data acquisition and model creation. However, achieving a comprehensive assessment of machine learning solutions in real-world settings necessitates standardization throughout the entire pipeline. This need is particularly acute in time series forecasting, where diverse settings impede meaningful comparisons between various methods. To bridge this gap, we propose a unified benchmarking framework that exposes the crucial modelling and machine learning decisions involved in developing time series forecasting models. This framework fosters seamless integration of models and datasets, aiding both practitioners and researchers in their development efforts. We benchmark recently proposed models within this framework, demonstrating that carefully implemented deep learning models with minimal effort can rival gradient-boosting decision trees requiring extensive feature engineering and expert knowledge.
Wednesday, January 10. 2024
Parallelism and Concurrency
HPX -- An open source C++ Standard Library for Parallelism and Concurrency
To achieve scalability with today's heterogeneous HPC resources, we need a dramatic shift in our thinking; MPI+X is not enough. Asynchronous Many Task (AMT) runtime systems break down the global barriers imposed by the Bulk Synchronous Programming model. HPX is an open-source, C++ Standards compliant AMT runtime system that is developed by a diverse international community of collaborators called The Ste||ar Group. HPX provides features which allow application developers to naturally use key design patterns, such as overlapping communication and computation, decentralizing of control flow, oversubscribing execution resources and sending work to data instead of data to work. The Ste||ar Group comprises physicists, engineers, and computer scientists; men and women from many different institutions and affiliations, and over a dozen different countries. We are committed to advancing the development of scalable parallel applications by providing a platform for collaborating and exchanging ideas. In this paper, we give a detailed description of the features HPX provides and how they help achieve scalability and programmability, a list of applications of HPX including two large NSF funded collaborations (STORM, for storm surge forecasting; and STAR (OctoTiger) an astro-physics project which runs at 96.8% parallel efficiency on 643,280 cores), and we end with a description of how HPX and the Ste||ar Group fit into the open source community.
TimeGraphs: Graph-based Temporal Reasoning
Many real-world systems exhibit temporal, dynamic behaviors, which are captured as time series of complex agent interactions. To perform temporal reasoning, current methods primarily encode temporal dynamics through simple sequence-based models. However, in general these models fail to efficiently capture the full spectrum of rich dynamics in the input, since the dynamics is not uniformly distributed. In particular, relevant information might be harder to extract and computing power is wasted for processing all individual timesteps, even if they contain no significant changes or no new information. Here we propose TimeGraphs, a novel approach that characterizes dynamic interactions as a hierarchical temporal graph, diverging from traditional sequential representations. Our approach models the interactions using a compact graph-based representation, enabling adaptive reasoning across diverse time scales. Adopting a self-supervised method, TimeGraphs constructs a multi-level event hierarchy from a temporal input, which is then used to efficiently reason about the unevenly distributed dynamics. This construction process is scalable and incremental to accommodate streaming data. We evaluate TimeGraphs on multiple datasets with complex, dynamic agent interactions, including a football simulator, the Resistance game, and the MOMA human activity dataset. The results demonstrate both robustness and efficiency of TimeGraphs on a range of temporal reasoning tasks. Our approach obtains state-of-the-art performance and leads to a performance increase of up to 12.2% on event prediction and recognition tasks over current approaches. Our experiments further demonstrate a wide array of capabilities including zero-shot generalization, robustness in case of data sparsity, and adaptability to streaming data flow.
Tuesday, January 9. 2024
EconPapers
Analysis of frequent trading effects of various machine learning models, EconPapers
In recent years, high-frequency trading has emerged as a crucial strategy in stock trading. This study aims to develop an advanced high-frequency trading algorithm and compare the performance of three different mathematical models: the combination of the cross-entropy loss function and the quasi-Newton algorithm, the FCNN model, and the vector machine. The proposed algorithm employs neural network predictions to generate trading signals and execute buy and sell operations based on specific conditions. By harnessing the power of neural networks, the algorithm enhances the accuracy and reliability of the trading strategy. To assess the effectiveness of the algorithm, the study evaluates the performance of the three mathematical models. The combination of the cross-entropy loss function and the quasi-Newton algorithm is a widely utilized logistic regression approach. The FCNN model, on the other hand, is a deep learning algorithm that can extract and classify features from stock data. Meanwhile, the vector machine is a supervised learning algorithm recognized for achieving improved classification results by mapping data into high-dimensional spaces. By comparing the performance of these three models, the study aims to determine the most effective approach for high-frequency trading. This research makes a valuable contribution by introducing a novel methodology for high-frequency trading, thereby providing investors with a more accurate and reliable stock trading strategy.
Measure of Dependence for Financial Time-Series, EconPapers
Assessing the predictive power of both data and models holds paramount significance in time-series machine learning applications. Yet, preparing time series data accurately and employing an appropriate measure for predictive power seems to be a non-trivial task. This work involves reviewing and establishing the groundwork for a comprehensive analysis of shaping time-series data and evaluating various measures of dependence. Lastly, we present a method, framework, and a concrete example for selecting and evaluating a suitable measure of dependence.
C++ Design Patterns for Low-latency Applications Including High-frequency Trading
This work aims to bridge the existing knowledge gap in the optimisation of latency-critical code, specifically focusing on high-frequency trading (HFT) systems. The research culminates in three main contributions: the creation of a Low-Latency Programming Repository, the optimisation of a market-neutral statistical arbitrage pairs trading strategy, and the implementation of the Disruptor pattern in C++. The repository serves as a practical guide and is enriched with rigorous statistical benchmarking, while the trading strategy optimisation led to substantial improvements in speed and profitability. The Disruptor pattern showcased significant performance enhancement over traditional queuing methods. Evaluation metrics include speed, cache utilisation, and statistical significance, among others. Techniques like Cache Warming and Constexpr showed the most significant gains in latency reduction. Future directions involve expanding the repository, testing the optimised trading algorithm in a live trading environment, and integrating the Disruptor pattern with the trading algorithm for comprehensive system benchmarking. The work is oriented towards academics and industry practitioners seeking to improve performance in latency-sensitive applications.
Integrating Tick-level Data and Periodical Signal for High-frequency Market Making
We focus on the problem of market making in high-frequency trading. Market making is a critical function in financial markets that involves providing liquidity by buying and selling assets. However, the increasing complexity of financial markets and the high volume of data generated by tick-level trading makes it challenging to develop effective market making strategies. To address this challenge, we propose a deep reinforcement learning approach that fuses tick-level data with periodic prediction signals to develop a more accurate and robust market making strategy. Our results of market making strategies based on different deep reinforcement learning algorithms under the simulation scenarios and real data experiments in the cryptocurrency markets show that the proposed framework outperforms existing methods in terms of profitability and risk management.
Trade Co-occurrence, Trade Flow Decomposition, and Conditional Order Imbalance in Equity Markets
The time proximity of high-frequency trades can contain a salient signal. In this paper, we propose a method to classify every trade, based on its proximity with other trades in the market within a short period of time, into five types. By means of a suitably defined normalized order imbalance associated to each type of trade, which we denote as conditional order imbalance (COI), we investigate the price impact of the decomposed trade flows. Our empirical findings indicate strong positive correlations between contemporaneous returns and COIs. In terms of predictability, we document that associations with future returns are positive for COIs of trades which are isolated from trades of stocks other than themselves, and negative otherwise. Furthermore, trading strategies which we develop using COIs achieve conspicuous returns and Sharpe ratios, in an extensive experimental setup on a universe of 457 stocks using daily data for a period of three years.
Monday, December 25. 2023
Abstracts
Progressing from Anomaly Detection to Automated Log Labeling and Pioneering Root Cause Analysis - accepted at AIOPS workshop @ICDM 2023
The realm of AIOps is transforming IT landscapes with the power of AI and ML. Despite the challenge of limited labeled data, supervised models show promise, emphasizing the importance of leveraging labels for training, especially in deep learning contexts. This study enhances the field by introducing a taxonomy for log anomalies and exploring automated data labeling to mitigate labeling challenges. It goes further by investigating the potential of diverse anomaly detection techniques and their alignment with specific anomaly types. However, the exploration doesn't stop at anomaly detection. The study envisions a future where root cause analysis follows anomaly detection, unraveling the underlying triggers of anomalies. This uncharted territory holds immense potential for revolutionizing IT systems management. In essence, this paper enriches our understanding of anomaly detection, and automated labeling, and sets the stage for transformative root cause analysis. Together, these advances promise more resilient IT systems, elevating operational efficiency and user satisfaction in an ever-evolving technological landscape.
A Roadmap towards Intelligent Operations for Reliable Cloud Computing Systems
The increasing complexity and usage of cloud systems have made it challenging for service providers to ensure reliability. This paper highlights two main challenges, namely internal and external factors, that affect the reliability of cloud microservices. Afterward, we discuss the data-driven approach that can resolve these challenges from four key aspects: ticket management, log management, multimodal analysis, and the microservice resilience testing approach. The experiments conducted show that the proposed data-driven AIOps solution significantly enhances system reliability from multiple angles.
A Systematic Mapping Study in AIOps
IT systems of today are becoming larger and more complex, rendering their human supervision more difficult. Artificial Intelligence for IT Operations (AIOps) has been proposed to tackle modern IT administration challenges thanks to AI and Big Data. However, past AIOps contributions are scattered, unorganized and missing a common terminology convention, which renders their discovery and comparison impractical. In this work, we conduct an in-depth mapping study to collect and organize the numerous scattered contributions to AIOps in a unique reference index. We create an AIOps taxonomy to build a foundation for future contributions and allow an efficient comparison of AIOps papers treating similar problems. We investigate temporal trends and classify AIOps contributions based on the choice of algorithms, data sources and the target components. Our results show a recent and growing interest towards AIOps, specifically to those contributions treating failure-related tasks (62%), such as anomaly detection and root cause analysis.
AIOps with Data, Analytics, and Intelligent Automation: A Foundational Capability for Modern IT Operations - sponsored by BMC but has useful general background information
Sunday, December 3. 2023
ls notes
ls -srSk ~/Downloads/
- s - size in blocks
- r - reverse order (largest at bottom)
- k - 1024 byte block size
- S - sort by size
ls -d ~/Downloads/*
- Lists every entry in the directory with its full path (the shell expands the glob, and -d stops ls from descending into subdirectories)
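For use inside a script, the first command can be approximated in Python. This is a rough sketch, not a replica of ls: the function name `list_by_size` is hypothetical, `st_blocks` is only available on Unix-like systems, ties are broken by name, and the "total" line that ls prints is omitted.

```python
import os


def list_by_size(directory):
    """Approximate `ls -srSk`: smallest entry first, largest at the bottom.

    Returns (allocated size in 1024-byte blocks, name) pairs.
    """
    rows = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name.startswith("."):
                continue  # ls hides dotfiles unless -a is given
            st = entry.stat(follow_symlinks=False)
            # st_blocks counts 512-byte units; round up to whole KiB blocks.
            kib = (st.st_blocks * 512 + 1023) // 1024
            rows.append((st.st_size, entry.name, kib))
    rows.sort()  # by file size, then by name
    return [(kib, name) for _, name, kib in rows]


if __name__ == "__main__":
    for kib, name in list_by_size(os.path.expanduser("~")):
        print(f"{kib:8d} {name}")
```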
Monday, November 27. 2023
SIEM, Incident Management
unstruct.ai - An AI-Enabled, Open-Source Alternative to PagerDuty. With UnStruct.AI, you're not just getting another cybersecurity tool – you're getting an all-in-one powerhouse. Instead of juggling multiple tools and racking up costs for each, get everything under one roof. Whether it's for paging, incident response, analysis, status updates, SLO/uptime monitoring, or a sprinkle of tech magic.
More eBPF
According to the slides from a 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit talk, guests operating through the netkit device (which was called "meta" at that time) are able to attain TCP data-transmission rates that are just as high as can be had by running directly on the host. The performance penalty for running within a guest has, in other words, been entirely removed.
Sunday, November 12. 2023
Protectionism on Garage Door Openers
Chamberlain blocks smart garage door opener from working with smart homes - Chamberlain Group recently made the decision to prevent unauthorized usage of our myQ ecosystem through third-party apps.
Whenever someone says "I can control my garage door from my phone!", I point out that what they've really done is:
- ceded all control of their garage door to another entity
- agreed to request every action from that entity
- been left hoping the other entity allows that action
One popular fix for people with the myQ problem is ratgdo: you plug a board into the opener and it talks to you locally with no cloud involvement.