FinHEAR: Human Expertise and Adaptive Risk-Aware Temporal Reasoning for Financial Decision-Making
Financial decision-making presents unique challenges for language models, demanding temporal reasoning, adaptive risk assessment, and responsiveness to dynamic events. While large language models (LLMs) show strong general reasoning capabilities, they often fail to capture behavioral patterns central to human financial decisions-such as expert reliance under information asymmetry, loss-averse sensitivity, and feedback-driven temporal adjustment. We propose FinHEAR, a multi-agent framework for Human Expertise and Adaptive Risk-aware reasoning. FinHEAR orchestrates specialized LLM-based agents to analyze historical trends, interpret current events, and retrieve expert-informed precedents within an event-centric pipeline. Grounded in behavioral economics, it incorporates expert-guided retrieval, confidence-adjusted position sizing, and outcome-based refinement to enhance interpretability and robustness. Empirical results on curated financial datasets show that FinHEAR consistently outperforms strong baselines across trend prediction and trading tasks, achieving higher accuracy and better risk-adjusted returns.
Emerging economies, particularly the MINT countries (Mexico, Indonesia, Nigeria, and T\"urkiye), are gaining influence in global stock markets, although they remain susceptible to the economic conditions of developed countries like the G7 (Canada, France, Germany, Italy, Japan, the United Kingdom, and the United States). This interconnectedness and sensitivity of financial markets make understanding these relationships crucial for investors and policymakers to predict stock price movements accurately. To this end, we examined the main stock market indices of G7 and MINT countries from 2012 to 2024, using a recent graph neural network (GNN) algorithm called multivariate time series forecasting with graph neural network (MTGNN). This method allows for considering complex spatio-temporal connections in multivariate time series. In the implementations, MTGNN revealed that the US and Canada are the most influential G7 countries regarding stock indices in the forecasting process, and Indonesia and T\"urkiye are the most influential MINT countries. Additionally, our results showed that MTGNN outperformed traditional methods in forecasting the prices of stock market indices for MINT and G7 countries. Consequently, the study offers valuable insights into economic blocks' markets and presents a compelling empirical approach to analyzing global stock market dynamics using MTGNN.
An Efficient deep learning model to Predict Stock Price Movement Based on Limit Order Book
In high-frequency trading (HFT), leveraging limit order books (LOB) to model stock price movements is crucial for achieving profitable outcomes. However, this task is challenging due to the high-dimensional and volatile nature of the original data. Even recent deep learning models often struggle to capture price movement patterns effectively, particularly without well-designed features. We observed that raw LOB data exhibits inherent symmetry between the ask and bid sides, and the bid-ask differences demonstrate greater stability and lower complexity compared to the original data. Building on this insight, we propose a novel approach in which leverages the Siamese architecture to enhance the performance of existing deep learning models. The core idea involves processing the ask and bid sides separately using the same module with shared parameters. We applied our Siamese-based methods to several widely used strong baselines and validated their effectiveness using data from 14 military industry stocks in the Chinese A-share market. Furthermore, we integrated multi-head attention (MHA) mechanisms with the Long Short-Term Memory (LSTM) module to investigate its role in modeling stock price movements. Our experiments used raw data and widely used Order Flow Imbalance (OFI) features as input with some strong baseline models. The results show that our method improves the performance of strong baselines in over 75$% of cases, excluding the Multi-Layer Perception (MLP) baseline, which performed poorly and is not considered practical. Furthermore, we found that Multi-Head Attention can enhance model performance, particularly over shorter forecasting horizons.
Reinforcement Learning for Stock Transactions
Much research has been done to analyze the stock market. After all, if one can determine a pattern in the chaotic frenzy of transactions, then they could make a hefty profit from capitalizing on these insights. As such, the goal of our project was to apply reinforcement learning (RL) to determine the best time to buy a stock within a given time frame. With only a few adjustments, our model can be extended to identify the best time to sell a stock as well. In order to use the format of free, real-world data to train the model, we define our own Markov Decision Process (MDP) problem. These two papers [5] [6] helped us in formulating the state space and the reward system of our MDP problem. We train a series of agents using Q-Learning, Q-Learning with linear function approximation, and deep Q-Learning. In addition, we try to predict the stock prices using machine learning regression and classification models. We then compare our agents to see if they converge on a policy, and if so, which one learned the best policy to maximize profit on the stock market.
Evolution of Alpha in Finance Harnessing Human Insight and LLM Agents
The pursuit of alpha returns that exceed market benchmarks has undergone a profound transformation, evolving from intuition-driven investing to autonomous, AI powered systems. This paper introduces a comprehensive five stage taxonomy that traces this progression across manual strategies, statistical models, classical machine learning, deep learning, and agentic architectures powered by large language models (LLMs). Unlike prior surveys focused narrowly on modeling techniques, this review adopts a system level lens, integrating advances in representation learning, multimodal data fusion, and tool augmented LLM agents. The strategic shift from static predictors to contextaware financial agents capable of real time reasoning, scenario simulation, and cross modal decision making is emphasized. Key challenges in interpretability, data fragility, governance, and regulatory compliance areas critical to production deployment are examined. The proposed taxonomy offers a unified framework for evaluating maturity, aligning infrastructure, and guiding the responsible development of next generation alpha systems.
A Deep Learning Approach to Anomaly Detection in High-Frequency Trading Data
his paper proposes an algorithm based on a staged sliding window Transformer architecture to detect abnormal behaviors in the microstructure of the foreign exchange market, focusing on high-frequency EUR/USD trading data. The method captures multi-scale temporal features through a staged sliding window, extracts global and local dependencies by combining the self-attention mechanism and weighted attention mechanism of the Transformer, and uses a classifier to identify abnormal events. Experimental results on a real high-frequency dataset containing order book depth, spread, and trading volume show that the proposed method significantly outperforms traditional machine learning (such as decision trees and random forests) and deep learning methods (such as MLP, CNN, RNN, LSTM) in terms of accuracy (0.93), F1-Score (0.91), and AUC-ROC (0.95). Ablation experiments verify the contribution of each component, and the visualization of order book depth and anomaly detection further reveals the effectiveness of the model under complex market dynamics. Despite the false positive problem, the model still provides important support for market supervision. In the future, noise processing can be optimized and extended to other markets to improve generalization and real-time performance.
This project addresses the challenge of automated stock trading, where traditional methods and direct reinforce- ment learning (RL) struggle with market noise, complex- ity, and generalization. Our proposed solution is an in- tegrated deep learning framework combining a Convolu- tional Neural Network (CNN) to identify patterns in tech- nical indicators formatted as images, a Long Short-Term Memory (LSTM) network to capture temporal dependen- cies across both price history and technical indicators, and a Deep Q-Network (DQN) agent which learns the optimal trading policy (buy, sell, hold) based on the features ex- tracted by the CNN and LSTM. The CNN and LSTM act as sophisticated feature extractors, feeding processed in- formation to the DQN, which learns the optimal trading policy (buy, sell, hold) through RL. We trained and evalu- ated this model on historical daily stock data, using distinct periods for training, testing, and validation. Performance was assessed by comparing the agent’s returns and risk on out-of-sample test data against baseline strategies, includ- ing passive buy-and-hold approaches. This analysis, along with insights gained from explainability techniques into the agent’s decision-making process, aimed to demonstrate the effectiveness of combining specialized deep learning archi- tectures, document challenges encountered, and potentially uncover learned market insights
Predicting Stock Prices Using Permutation Decision Trees And Strategic Trailing
Qe explore the application of Permutation Decision Trees (PDT) and strategic trailing for predicting stock market movements and executing profitable trades in the Indian stock market. We focus on high-frequency data using 5-minute candlesticks for the top 50 stocks listed in the NIFTY 50 index. We implement a trading strategy that aims to buy stocks at lower prices and sell them at higher prices, capitalizing on short-term market fluctuations. Due to regulatory constraints in India, short selling is not considered in our strategy. The model incorporates various technical indicators and employs hyperparameters such as the trailing stop-loss value and support thresholds to manage risk effectively. Our results indicate that the proposed trading bot has the potential to outperform the market average and yield returns higher than the risk-free rate offered by 10-year Indian government bonds. We trained and tested data on a 60 day dataset provided by Yahoo Finance. Specifically, 12 days for testing and 48 days for training. Our bot based on permutation decision tree achieved a profit of 1.3468% over a 12-day testing period, where as a bot based on LSTM gave a return of 0.1238% over a 12-day testing period and a bot based on RNN gave a return of 0.3096% over a 12-day testing period. All of the bots outperform the buy-and-hold strategy, which resulted in a loss of 2.2508%
Deep Q-Network (DQN) multi-agent reinforcement learning (MARL) for Stock Trading
This project addresses the challenge of automated stock trading, where traditional methods and direct reinforcement learning (RL) struggle with market noise, complexity, and generalization. Our proposed solution is an integrated deep learning framework combining a Convolutional Neural Network (CNN) to identify patterns in technical indicators formatted as images, a Long Short-Term Memory (LSTM) network to capture temporal dependencies across both price history and technical indicators, and a Deep Q-Network (DQN) agent which learns the optimal trading policy (buy, sell, hold) based on the features extracted by the CNN and LSTM.
Resolving Latency and Inventory Risk in Market Making with Reinforcement Learning
The latency of the exchanges in Market Making (MM) is inevitable due to hardware limitations, system processing times, delays in receiving data from exchanges, the time required for order transmission to reach the market, etc. Existing reinforcement learning (RL) methods for Market Making (MM) overlook the impact of these latency, which can lead to unintended order cancellations due to price discrepancies between decision and execution times and result in undesired inventory accumulation, exposing MM traders to increased market risk. Therefore, these methods cannot be applied in real MM scenarios. To address these issues, we first build a realistic MM environment with random delays of 30-100 milliseconds for order placement and market information reception, and implement a batch matching mechanism that collects orders within every 500 milliseconds before matching them all at once, simulating the batch auction mechanisms adopted by some exchanges. Then, we propose Relaver, an RL-based method for MM to tackle the latency and inventory risk issues. The three main contributions of Relaver are: i) we introduce an augmented state-action space that incorporates order hold time alongside price and volume, enabling Relaver to optimize execution strategies under latency constraints and time-priority matching mechanisms, ii) we leverage dynamic programming (DP) to guide the exploration of RL training for better policies, iii) we train a market trend predictor, which can guide the agent to intelligently adjust the inventory to reduce the risk. Extensive experiments and ablation studies on four real-world datasets demonstrate that \textsc{Relaver} significantly improves the performance of state-of-the-art RL-based MM strategies across multiple metrics.