Papers about algorithmic trading#
This section contains various research papers related to algorithmic trading.
Note
AI-related papers have their own AI section.
An Investor’s Guide to Crypto#
We provide practical insights for investors seeking exposure to the growing cryptocurrency space. Today, crypto is much more than just bitcoin, which historically dominated the space but accounted for just a 21% share of total crypto trading volume in 2021. We discuss a wide variety of tokens, highlighting both their functionality and their investment properties. We critically compare popular valuation methods. We contrast buy-and-hold investing with more active styles. We deem only return data from 2017 onward representative, but the use of intraday data boosts statistical power. Underlying crypto performance has been notoriously volatile, but volatility-targeting methods are effective at controlling risk, and trend-following strategies have performed well. Crypto assets display a low correlation with traditional risky assets in normal times, but the correlation also rises in the left tail of these risky assets. Finally, we detail important custody and regulatory considerations for institutional investors.
Low-volatility strategies for highly liquid cryptocurrencies#
Managing extreme price fluctuations in cryptocurrency markets is of central importance for investors in this market segment. Using a sample of highly liquid cryptocurrencies from January 2017 to June 2021, this paper proposes a dynamic investment strategy that selects cryptocurrencies based on their historical volatility and is complemented by a simple stop-loss rule. Our results reveal that investing in highly concentrated low-volatility cryptocurrency portfolios with a six- to twelve-month volatility look-back and holding period generates statistically significant excess returns. By including a simple stop-loss rule, the downside risk of cryptocurrency portfolios is reduced markedly, and the Sharpe ratios are improved significantly.
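A minimal pandas sketch of the selection rule described above. The six-month look-back, three-coin portfolio, and 10% stop level are illustrative assumptions rather than the paper’s calibrated values, and daily calendar-indexed prices are assumed.

```python
import pandas as pd

def low_vol_with_stop(prices: pd.DataFrame, lookback: int = 126,
                      n_coins: int = 3, stop: float = 0.10) -> pd.Series:
    """Daily strategy returns: each month-end, hold the n_coins coins with
    the lowest trailing volatility, equal-weighted; flatten any coin that
    closes more than `stop` below its price at selection time."""
    rets = prices.pct_change()
    vol = rets.rolling(lookback).std()
    month_ends = prices.resample("ME").last().index
    out = []
    for start, end in zip(month_ends[:-1], month_ends[1:]):
        held = vol.loc[:start].iloc[-1].dropna().nsmallest(n_coins).index
        entry = prices.loc[start, held]
        # stop-loss mask: once a coin trades `stop` below entry, it stays out
        live = (prices.loc[start:end, held] / entry).cummin() > (1 - stop)
        port = (rets.loc[start:end, held] * live.shift(1)).mean(axis=1)
        out.append(port.iloc[1:])          # skip the rebalance day itself
    return pd.concat(out)
```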
How to avoid overfitting trading strategies#
Running a losing trading strategy would be a very costly mistake, so we spend a lot of effort on assessing the expected performance of our strategies. This task gets harder when we have limited data for the evaluation, or when we experiment with a strategy for a long time and risk manually overfitting it to the same out-of-sample data.
An Efficient Algorithm for Optimal Routing Through Constant Function Market Makers#
Constant function market makers (CFMMs) such as Uniswap have facilitated trillions of dollars of digital asset trades and have billions of dollars of liquidity. One natural question is how to optimally route trades across a network of CFMMs in order to ensure the largest possible utility (as specified by a user). We present an efficient algorithm, based on a decomposition method, to solve the problem of optimally executing an order across a network of decentralized exchanges.
Automated Market Making and Arbitrage Profits in the Presence of Fees#
We consider the impact of trading fees on the profits of arbitrageurs trading against an automated market maker (AMM) or, equivalently, on the adverse selection incurred by liquidity providers due to arbitrage. We extend the model of Milionis et al. [2022] for a general class of two-asset AMMs to introduce both fees and discrete Poisson block generation times. In our setting, we are able to compute the expected instantaneous rate of arbitrage profit in closed form. When fees are low, in the fast-block asymptotic regime, the impact of fees takes a particularly simple form: fees simply scale down arbitrage profits by the fraction of time that an arriving arbitrageur finds a profitable trade.
Momentum and trend following trading strategies for currencies and bitcoin#
Momentum trading strategies are thoroughly described in the academic literature and used in many trading strategies by hedge funds, asset managers, and proprietary traders. Baz et al. (2015) describe a momentum strategy for different asset classes in great detail from a practitioner’s point of view. Using a geometric Brownian motion for the dynamics of the returns of financial instruments, we extensively explain the motivation and background behind each step of a momentum trading strategy. Constants and parameters that are used for the practical implementation are derived in a theoretical setting, and deviations from those used in Baz et al. (2015) are shown. The trading signal is computed as a mixture of exponential moving averages with different time horizons. We give a statistical justification for the optimal selection of time horizons. Furthermore, we test our approach on global currency markets, including G10 currencies, emerging market currencies, and cryptocurrencies. Both a time series portfolio and a cross-sectional portfolio are considered. We find that the strategy works best for traditional fiat currencies when considering a time series based momentum strategy. For cryptocurrencies, a cross-sectional approach is more suitable. The momentum strategy exhibits higher Sharpe ratios for more volatile currencies; thus, emerging market currencies and cryptocurrencies perform better than the G10 currencies. This is the first comprehensive study showing both the underlying statistical reasoning behind how such trading strategies are constructed in the industry and empirical results for a large universe of currencies, including cryptocurrencies.
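The signal construction referenced above is easy to sketch: each sub-signal is the gap between a fast and a slow exponential moving average, normalized twice (by price volatility, then by its own volatility), passed through a bounded response, and the final signal averages several timescales. The spans and response function below follow common practitioner renditions of Baz et al. (2015); the paper derives its own constants.

```python
import numpy as np
import pandas as pd

def ewma_momentum_signal(prices: pd.Series,
                         spans=((8, 24), (16, 48), (32, 96))) -> pd.Series:
    """Mixture-of-EWMA trend signal in the spirit of Baz et al. (2015)."""
    subs = []
    for short, long in spans:
        x = prices.ewm(span=short).mean() - prices.ewm(span=long).mean()
        y = x / prices.rolling(63).std()      # normalize by price volatility
        z = y / y.rolling(252).std()          # normalize by signal volatility
        subs.append(z * np.exp(-z**2 / 4) / 0.89)   # bounded response curve
    return pd.concat(subs, axis=1).mean(axis=1)     # mix the three horizons
```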
Momentum trading in cryptocurrencies: short-term returns and diversification benefits#
We test for the presence of momentum effects in the cryptocurrency market and estimate dynamic conditional correlations (DCCs) of returns between momentum portfolios of cryptocurrencies and traditional assets. First, investment portfolios are constructed following the classic J/K momentum strategy, using daily data from twelve cryptocurrencies over a period of three years. We identify the existence of a momentum effect, which is highly significant for short-term portfolios but disappears over the longer term. Second, we show that cross-correlations of weekly returns between momentum portfolios of cryptocurrencies and traditional assets are unlike correlations of returns between traditional assets. Third, we find that momentum portfolios of cryptocurrencies not only offer diversification benefits but can also serve as a hedge and safe haven for traditional assets.
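For reference, a bare-bones J/K construction, with illustrative J = K = 7 trading days and quartile breakpoints; the paper scans a grid of J/K pairs. Holding periods here are non-overlapping for simplicity, whereas the classic Jegadeesh-Titman design averages overlapping portfolios.

```python
import pandas as pd

def jk_momentum(returns: pd.DataFrame, J: int = 7, K: int = 7,
                quantile: float = 0.25) -> pd.Series:
    """Winner-minus-loser returns: rank on trailing J-day performance,
    hold equal-weighted top/bottom quantile portfolios for K days."""
    formation = returns.rolling(J).sum()
    legs = []
    for t in range(J, len(returns) - K, K):
        scores = formation.iloc[t].dropna()
        n = max(1, int(len(scores) * quantile))
        winners, losers = scores.nlargest(n).index, scores.nsmallest(n).index
        hold = returns.iloc[t + 1:t + 1 + K]
        legs.append(hold[winners].mean(axis=1) - hold[losers].mean(axis=1))
    return pd.concat(legs)
```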
Pure Momentum in Cryptocurrency Markets#
Momentum is one of the most widespread, persistent, and puzzling phenomena in asset pricing. The prevailing explanation for momentum is that investors under-react to new information, and thus asset prices tend to drift over time. We use a unique feature of cryptocurrency markets: the fact that they are open 24/7 and report returns over the last 24 hours. Thus, the one-day return is subject to predictable fluctuations based on the removal of lagged information. We show that investors respond positively to changes in reported returns that are unrelated to any new release of information or change in the asset fundamentals. We call this behavioral anomaly “Pure Momentum”.
Dissecting Investment Strategies in the Cross Section and Time Series#
Jamil Baz, Nicolas Granger, Campbell R. Harvey, Nicolas Le Roux, and Sandy Rattray compare time-series and cross-sectional implementations of carry, momentum, and value across asset classes. Rather than treating the two portfolio constructions as interchangeable wrappers around the same anomaly, the paper studies when each representation works best and how the underlying market environment changes the relative edge.
Our summary: this is one of the clearest practitioner-academic bridges in the momentum literature. The important contribution is not merely that both time-series and cross-sectional versions can work, but that they load on different economic structures. Time-series approaches are better aligned with persistent directional trends, while cross-sectional portfolios often shine when dispersion across assets is more informative than the market’s aggregate direction. For anyone designing trend systems, this paper is a reminder that “signal” and “portfolio construction” are inseparable design choices.
Key metrics: the paper does not reduce the result to a single headline Sharpe ratio because the focus is comparative across carry, momentum, and value in both time-series and cross-sectional form. Its main empirical claim is that market conditions materially affect which implementation dominates, and that the choice is strategy- and regime-dependent rather than universal.
Can Day Trading Really Be Profitable?#
The validity of day trading as a long-term consistent and uncorrelated source of income for traders and investors is a matter of debate. In this paper, we investigate the profitability of the well-known Opening Range Breakout (ORB) strategy during the period of 2016 to 2023. This period encompasses two bear markets and a few events with abnormal volatility. Our results suggest that with the proper use of leverage or leveraged products (such as 3x leveraged ETFs), day trading can empirically produce significant returns when compared to a standard buy and hold strategy on benchmark indexes in the US public equity markets (Nasdaq or NYSE). Without any loss of generality, we studied the results of an ORB strategy implemented in QQQ. By comparing the results of the active day trading approach with a passive exposure in QQQ, we prove that it is possible for the ORB portfolio to significantly outperform the passive investment. In fact, the day trading portfolio produced an annualized alpha of 33% (net of commissions). Nevertheless, due to leverage constraints enforced by brokers, an active trader would have capped the full upside potential given by the ORB strategy. To overcome this issue, we introduced the use of TQQQ, a leveraged ETF of QQQ, which allows day traders to fully exploit the benefit of the active strategy while adhering to leverage constraints. The resulting portfolio would have earned an outstanding return of 1,484% during the same period of 2016 to 2023, while an investment in the QQQ ETF would have earned only 169%.
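A stylized single-day sketch of the ORB rule studied here: trade in the direction of the first five-minute candle, place the stop at that candle’s opposite extreme, and otherwise exit on the close. Position sizing, commissions, and the leverage mechanics discussed above are deliberately omitted, and the entry/stop conventions below are a simplification of the paper’s rules.

```python
import pandas as pd

def orb_day_pnl(bars: pd.DataFrame) -> float:
    """P&L (in return units) of a stylized Opening Range Breakout trade on
    one day of 5-minute OHLC bars with columns open/high/low/close."""
    first, rest = bars.iloc[0], bars.iloc[1:]
    if first.close == first.open:
        return 0.0                                  # no opening direction
    go_long = first.close > first.open
    entry = first.close
    stop = first.low if go_long else first.high     # opposite side of range
    if go_long:
        exit_px = stop if rest.low.le(stop).any() else rest.close.iloc[-1]
        return exit_px / entry - 1
    exit_px = stop if rest.high.ge(stop).any() else rest.close.iloc[-1]
    return 1 - exit_px / entry                      # short-side return
```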
Multi-source aggregated classification for stock price movement prediction#
Predicting stock price movements is a challenging task. Previous studies mostly used numerical features and news sentiments of target stocks to predict stock price movements. However, their semantics-based sentiment analysis is sub-optimal to represent real market sentiments. Moreover, only considering the information of target companies is insufficient because the stock prices of target companies can be affected by their related companies. Thus, we propose a novel Multi-source Aggregated Classification (MAC) method for stock price movement prediction. MAC incorporates the numerical features and market-driven news sentiments of target stocks, as well as the news sentiments of their related stocks. To better represent real market sentiments from the news, we pre-train an embedding feature generator by fitting the news to real stock price movements. Embeddings given by the pre-trained sentiment classifier can represent the sentiment information in vector space. Moreover, MAC introduces a graph convolutional network to capture the news effects of related companies on the target stock. Finally, MAC can predict stock price movements for the next trading day based on the aforementioned features. Extensive experiments prove that MAC outperforms state-of-the-art baselines in stock price movement prediction, Sharpe ratio, and backtested trading income.
Cryptocurrencies: Stylized Facts and Risk Based Momentum Investing#
The motivation of this research is twofold: to understand the distributional characteristics of cryptocurrencies by means of stylized facts, and to assess the feasibility of risk-based and trend-following approaches to investing in this asset class. Unlike traditional asset classes, cryptocurrencies are a recent phenomenon, which places an explicit constraint on the availability of long histories and on the reliability of measured investment performance. Acknowledging this constraint, I focus my analysis on the few years of data that are available.
151 Trading Strategies#
We provide detailed descriptions, including over 550 mathematical formulas, for over 150 trading strategies across a host of asset classes (and trading styles). This includes stocks, options, fixed income, futures, ETFs, indexes, commodities, foreign exchange, convertibles, structured assets, volatility (as an asset class), real estate, distressed assets, cash, cryptocurrencies, miscellany (such as weather, energy, inflation), global macro, infrastructure, and tax arbitrage. Some strategies are based on machine learning algorithms (such as artificial neural networks, Bayes, k-nearest neighbors). We also give: source code for illustrating out-of-sample backtesting with explanatory notes; around 2,000 bibliographic references; and over 900 glossary, acronym and math definitions. The presentation is intended to be descriptive and pedagogical. This is the complete version of the book.
Cryptocurrency trading: A systematic mapping study#
This systematic mapping examines the current state of cryptocurrency trading research.
This study observes a recent increase in high-quality research and international collaboration in cryptocurrency trading.
This study notes a shift towards practical applications in cryptocurrency trading research, particularly in AI-driven prediction and automated trading.
This study highlights the diverse data types and inputs employed in cryptocurrency trading systems, with emphasis on the prevalent use of neural networks and deep learning algorithms.
Clustering in Cardinality-Constrained Portfolio Optimization#
In portfolio optimization, efficiently managing large pools of assets while adhering to cardinality constraints presents a significant challenge. We propose a novel portfolio optimization framework that combines cardinality constraints with the classical Markowitz mean-variance model, using clustering to reduce dimensionality and achieve an optimal balance of risk and return. We use spectral clustering to group the residual returns of stocks. This method reveals natural groupings of assets based on their returns and correlations, enhancing our understanding and categorization of assets, which is crucial for efficiently reducing the optimization space and dimensionality.
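A minimal sketch of the clustering step, assuming residual returns are already computed; scikit-learn’s SpectralClustering on a correlation-derived affinity is one plausible instantiation of what the authors describe, not their exact pipeline.

```python
import pandas as pd
from sklearn.cluster import SpectralClustering

def cluster_assets(residual_returns: pd.DataFrame,
                   n_clusters: int = 10) -> pd.Series:
    """Group assets by the correlation structure of their residual returns."""
    corr = residual_returns.corr().values
    affinity = (corr + 1.0) / 2.0      # map correlations [-1, 1] -> [0, 1]
    labels = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                                random_state=0).fit_predict(affinity)
    return pd.Series(labels, index=residual_returns.columns)
```

Cardinality constraints can then be honored by optimizing over one representative (or a small basket) per cluster rather than the full universe.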
Regularised jump models for regime identification and feature selection#
A regime modelling framework can be employed to address the complexities of financial markets. Under the framework, market periods are grouped into distinct regimes, each distinguished by similar statistical characteristics. Regimes in financial markets are not directly observable but are often manifested in market and macroeconomic variables. The objective of regime modelling is to accurately identify the active regime from these variables at a point in time, a process known as regime identification.
One way to enhance the accuracy of regime identification is to select features that are most responsible for statistical differences between regimes, a process known as feature selection. Feature selection is also capable of both enhancing the interpretability of outputs from regime models, and substantially reducing the computational time required to calibrate regime models.
Models based on the Jump Model framework have recently been developed to address the joint problem of regime identification and feature selection. In the following work, we propose a new set of models called Regularised Jump Models that are founded upon the Jump Model framework.
These models perform feature selection that is more interpretable than that from the Sparse Jump Model, a model proposed in the literature pertaining to the Jump Model framework. Through a simulation experiment, we find evidence that these new models outperform the Standard and Sparse Jump Models, both in terms of regime identification and feature selection.
Dynamic Asset Allocation with Asset-Specific Regime Forecasts#
This article introduces a novel hybrid regime identification-forecasting framework designed to enhance multi-asset portfolio construction by integrating asset-specific regime forecasts. Unlike traditional approaches that focus on broad economic regimes affecting the entire asset universe, our framework leverages both unsupervised and supervised learning to generate tailored regime forecasts for individual assets. Initially, we use the statistical jump model, a robust unsupervised regime identification model, to derive regime labels for historical periods, classifying them into bullish or bearish states based on features extracted from an asset return series. Following this, a supervised gradient-boosted decision tree classifier is trained to predict these regimes using a combination of asset-specific return features and cross-asset macro-features. We apply this framework individually to each asset in our universe. Subsequently, return and risk forecasts which incorporate these regime predictions are input into Markowitz mean-variance optimization to determine optimal asset allocation weights. We demonstrate the efficacy of our approach through an empirical study on a multi-asset portfolio comprising twelve risky assets, including global equity, bond, real estate, and commodity indexes spanning from 1991 to 2023. The results consistently show outperformance across various portfolio models, including minimum-variance, mean-variance, and naive-diversified portfolios, highlighting the advantages of integrating asset-specific regime forecasts into dynamic asset allocation.
Optimal Factor Timing in a High-Dimensional Setting#
We develop a framework for equity factor timing in a high-dimensional setting when the number of factors and factor return predictors can be large. To ensure good out-of-sample performance, the approach is disciplined by shrinkage that effectively expresses a degree of skepticism about outsized gains from timing. In our empirical application, the predictors include macroeconomic variables and factor-specific characteristic spreads between the long and short legs of the factors. We find sizable gains from timing equity factors, including for factors constructed only from large-cap stocks.
Optimal Allocation to Cryptocurrencies in Diversified Portfolios#
We apply four quantitative methods for optimal allocation to Bitcoin and Ether cryptocurrencies within alternative and balanced portfolios including metrics for portfolio diversification, expected risk-returns, and skewness of returns distribution. Using roll-forward historical simulations, we show that all four allocation methods produce a persistent positive allocation to Bitcoin and Ether in alternative and balanced portfolios with a median allocation of about 2.7%. We conclude that core cryptocurrencies may provide a positive contribution to the risk-adjusted performance of broad investment portfolios. We emphasize the diversification benefits of cryptocurrencies as an asset class within broad risk-managed portfolios with systematic re-balancing.
Catching Crypto Trends; A Tactical Approach for Bitcoin and Altcoins#
In recent years, cryptocurrencies have attracted significant attention from both retail traders and large institutional investors. As their involvement in digital assets grows, so does their interest in active and risk-aware investment frameworks. This paper applies a well-established trend-following methodology, successfully deployed for decades in traditional asset classes, to Bitcoin, and then extends the analysis to a comprehensive, survivorship bias-free dataset covering all cryptocurrencies traded since 2015, to evaluate whether its robustness persists in the emerging digital asset space. We propose an ensemble approach that aggregates multiple Donchian channel-based trend models, each calibrated with different lookback periods, into a single signal, as well as a volatility-based position sizing method. This model, applied to a rotational portfolio of the top 20 most liquid coins, achieved notable net-of-fees returns, with a Sharpe ratio above 1.5 and an annualized alpha of 10.8% versus Bitcoin. While assessing the impact of transaction costs, we propose a straightforward yet effective portfolio technique to mitigate these expenses. Finally, we investigate correlations between crypto-focused trend-following strategies and those applied to traditional asset classes, concluding with a discussion on how investors can execute the proposed strategy through both on-chain and off-chain implementations.
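A sketch of the two ingredients named in the abstract: an ensemble of Donchian-channel breakout signals across several lookbacks, and volatility-based sizing. The lookback grid, long/flat convention, and 40% volatility target are illustrative assumptions, not the paper’s calibration.

```python
import numpy as np
import pandas as pd

def donchian_ensemble(close: pd.Series, lookbacks=(20, 50, 100, 200),
                      target_vol: float = 0.40) -> pd.Series:
    """Average of long/flat Donchian breakout signals, scaled so the
    resulting position targets `target_vol` annualized volatility."""
    signals = []
    for n in lookbacks:
        upper = close.rolling(n).max().shift(1)   # yesterday's channel top
        lower = close.rolling(n).min().shift(1)   # yesterday's channel floor
        sig = pd.Series(np.nan, index=close.index)
        sig[close >= upper] = 1.0                 # breakout: go long
        sig[close <= lower] = 0.0                 # breakdown: go flat
        signals.append(sig.ffill().fillna(0.0))
    ensemble = pd.concat(signals, axis=1).mean(axis=1)
    realized = close.pct_change().rolling(30).std() * np.sqrt(365)
    leverage = (target_vol / realized).clip(upper=1.0)
    return (ensemble * leverage).shift(1)         # lag to avoid lookahead
```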
Does Trend-Following Still Work on Stocks?#
This paper revisits and extends the results presented in 2005 by Wilcox and Crittenden in a white paper titled Does Trend Following Work on Stocks? Leveraging a survivorship-bias-free dataset of all liquid U.S. stocks from 1950 through November 2024, we examine more than 66,000 simulated long-only trend trades. Our results confirm a highly skewed profit distribution, with less than 7% of trades driving the cumulative profitability. These core statistics hold up out-of-sample (2005–2024), maintaining strong returns despite a modest decline in average trade profitability following the original publication. In the second part of this study, we backtest a long-only trend-following portfolio specifically aimed at capturing outlier returns in individual stocks. While the theoretical portfolio exhibits exceptional gross-of-fees performance from 1991 until 2024 (e.g., a CAGR of 15.19% and an annualized alpha of 6.18%), its extensive daily turnover poses a significant challenge once transaction costs are considered. Examining net-of-fee performance across various asset under management (AUM) levels, we find that the base trend-following approach is not viable for smaller portfolios (AUM less than $1M) due to the dampening effect of trading costs. However, by incorporating a Turnover Control algorithm, we substantially mitigate these transaction cost burdens, rendering the strategy attractive across all tested portfolio sizes even after fees.
Following a Trend with an Exponential Moving Average: Analytical Results for a Gaussian Model#
We investigate how price variations of a stock are transformed into profits and losses (P&Ls) of a trend following strategy. In the frame of a Gaussian model, we derive the probability distribution of P&Ls and analyze its moments (mean, variance, skewness and kurtosis) and asymptotic behavior (quantiles). We show that the asymmetry of the distribution (with frequent small losses and less frequent but significant profits) is characteristic of trend following strategies and depends only weakly on the peculiarities of price variations. At short times, trend following strategies admit larger losses than one may anticipate from standard Gaussian estimates, while smaller losses are ensured at longer times. Simple explicit formulas characterizing the distribution of P&Ls illustrate the basic mechanisms of momentum trading, while general matrix representations can be applied to arbitrary Gaussian models. We also compute explicitly annualized risk adjusted P&L and strategy turnover to account for transaction costs. We deduce the trend following optimal timescale and its dependence on both auto-correlation level and transaction costs. Theoretical results are illustrated on the Dow Jones index.
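Schematically (not the paper’s exact notation), the objects under study are an exponentially weighted average of past price increments and the P&L increment of a position proportional to it:

$$S_t = (1-\lambda)\sum_{k \ge 0} \lambda^{k}\, \Delta P_{t-k}, \qquad \Delta\mathrm{P\&L}_{t+1} \propto S_t \,\Delta P_{t+1}.$$

Because each P&L increment is then a product of two jointly Gaussian variables, its distribution is skewed even when price increments themselves are symmetric, which is the mechanism behind the frequent-small-loss, rare-large-gain profile described above.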
On covariance estimation of non-synchronously observed diffusion processes#
We consider the problem of estimating the covariance of two diffusion processes when they are observed only at discrete times in a non-synchronous manner. The modern, popular approach in the literature, the realized covariance estimator, which is based on (regularly spaced) synchronous data, is problematic because the choice of regular interval size and data interpolation scheme may lead to unreliable estimation. We propose a new estimator which is free of any ‘synchronization’ processing of the original data, hence free of bias or other problems caused by it.
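This is the Hayashi-Yoshida estimator: sum the products of log-returns over every pair of observation intervals that overlap in time, with no grid and no interpolation. A minimal (quadratic-time) sketch, assuming timestamp and price arrays for each series:

```python
import numpy as np

def hy_covariance(t_x, x, t_y, y):
    """Hayashi-Yoshida cumulative covariance from non-synchronous
    observations: sum return products over overlapping intervals."""
    rx, ry = np.diff(np.log(x)), np.diff(np.log(y))
    cov = 0.0
    for i in range(len(rx)):
        for j in range(len(ry)):
            # do intervals (t_x[i], t_x[i+1]] and (t_y[j], t_y[j+1]] overlap?
            if max(t_x[i], t_y[j]) < min(t_x[i + 1], t_y[j + 1]):
                cov += rx[i] * ry[j]
    return cov
```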
Optimizing the Persistence of Price Momentum: Which Trends Are Your Friends?#
The traditional wisdom that price momentum which ranks stocks’ raw trailing returns is crash-prone fails to differentiate the various drivers of stocks’ past performances. As such, we compare the persistence of different sources of stocks’ price momentum discerned from applying factor-based performance attribution to their trailing 12-month returns. Our empirical analysis shows that beta- and country-driven price trends were not robust while style and industry momentum persisted both over the intermediate and, more strongly, short-term. Stock-specific momentum persisted over the intermediate term but strongly reverted over the short term; it was subsumed as a stand-alone strategy by both industry and style momentum and should be downweighed when optimizing a momentum signal for persistence. Our results suggest that style momentum is mostly a proxy for static factor tilts while industry and stock-specific momentum appear a separate anomaly that is strongest conditional on low-volatility market regimes. Their premium may reflect investor underreaction to economic shifts to which stocks’ exposures are imperfectly captured by binary industry classifications. Our results corroborate a strand of the extant literature through the novel lens of exactly decomposing the cross-section of stocks’ price momentum; contradicting findings are explained by methodological differences.
Clustering Market Regimes Using the Wasserstein Distance#
The problem of rapid and automated detection of distinct market regimes is a topic of great interest to financial mathematicians and practitioners alike. In this paper, we outline an unsupervised learning algorithm for clustering financial time-series into a suitable number of temporal segments (market regimes). As a special case of the above, we develop a robust algorithm that automates the process of classifying market regimes. The method is robust in the sense that it does not depend on modelling assumptions of the underlying time series as our experiments with real datasets show. This method – dubbed the Wasserstein $k$-means algorithm – frames such a problem as one on the space of probability measures with finite $p^{\text{th}}$ moment, in terms of the $p$-Wasserstein distance between (empirical) distributions. We compare our WK-means approach with more traditional clustering algorithms by studying the so-called maximum mean discrepancy scores between and within clusters. In both cases it is shown that the WK-means algorithm vastly outperforms all considered competitor approaches. We demonstrate the performance of all approaches both in a controlled environment on synthetic data, and on real data.
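A one-dimensional version is easy to sketch thanks to a standard fact: for distributions on the real line, the $p$-Wasserstein distance equals the $L^p$ distance between quantile functions, so running ordinary k-means on sorted return windows is exactly Wasserstein k-means for $p = 2$. Window length and regime count below are arbitrary choices, and the paper’s algorithm is more general than this illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def wasserstein_kmeans_1d(returns: np.ndarray, window: int = 20,
                          n_regimes: int = 3) -> np.ndarray:
    """Cluster rolling return windows by their empirical distribution:
    sorting each window makes Euclidean k-means a 2-Wasserstein k-means."""
    windows = np.lib.stride_tricks.sliding_window_view(returns, window)
    quantiles = np.sort(windows, axis=1)   # empirical quantile functions
    return KMeans(n_clusters=n_regimes, n_init=10,
                  random_state=0).fit_predict(quantiles)
```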
Nonlinear Time Series Momentum#
We document a persistent nonlinear relationship between price trends and risk-adjusted returns across markets and asset classes that is consistent with asset pricing theory. Nonlinearities in time series momentum are consistent with past returns reflecting information about conditional expected returns, in line with investors using conditioning information to form efficient portfolios. Machine learning techniques are useful in uncovering these relationships and yield economically and statistically significant out-of-sample improvements in time series momentum strategies.
Building Diversified Portfolios that Outperform Out-of-Sample#
This paper introduces the Hierarchical Risk Parity (HRP) approach. HRP portfolios address three major concerns of quadratic optimizers in general and Markowitz’s CLA in particular: Instability, concentration and underperformance.
HRP applies modern mathematics (graph theory and machine learning techniques) to build a diversified portfolio based on the information contained in the covariance matrix. However, unlike quadratic optimizers, HRP does not require the invertibility of the covariance matrix. In fact, HRP can compute a portfolio on an ill-degenerated or even a singular covariance matrix, an impossible feat for quadratic optimizers. Monte Carlo experiments show that HRP delivers lower out-of-sample variance than CLA, even though minimum-variance is CLA’s optimization objective. HRP also produces less risky portfolios out-of-sample compared to traditional risk parity methods.
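A compact sketch of the published HRP recipe: single-linkage clustering on a correlation distance, quasi-diagonalization of the asset order, then recursive bisection with inverse-variance allocations at each split.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

def _cluster_var(cov: np.ndarray, idx: list) -> float:
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()                     # inverse-variance weights
    return float(ivp @ sub @ ivp)

def hrp_weights(cov: pd.DataFrame) -> pd.Series:
    std = np.sqrt(np.diag(cov.values))
    corr = np.clip(cov.values / np.outer(std, std), -1.0, 1.0)
    dist = np.sqrt(0.5 * (1.0 - corr))   # correlation distance metric
    order = list(leaves_list(linkage(squareform(dist, checks=False), "single")))
    w = np.ones(len(order))
    clusters = [order]                   # assets in quasi-diagonal order
    while clusters:                      # recursive bisection
        clusters = [c[i:j] for c in clusters
                    for i, j in ((0, len(c) // 2), (len(c) // 2, len(c)))
                    if len(c) > 1]
        for left, right in zip(clusters[::2], clusters[1::2]):
            v_l = _cluster_var(cov.values, left)
            v_r = _cluster_var(cov.values, right)
            alpha = 1.0 - v_l / (v_l + v_r)   # calmer half gets more weight
            w[left] *= alpha
            w[right] *= 1.0 - alpha
    return pd.Series(w, index=cov.index)
```

Note that nothing in this procedure inverts the covariance matrix, which is exactly why HRP tolerates ill-conditioned or singular inputs.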
The CoinAlg Bind: Profitability-Fairness Tradeoffs in Collective Investment Algorithms#
Collective Investment Algorithms (CoinAlgs) are increasingly popular systems that deploy shared trading strategies for investor communities. Their goal is to democratize sophisticated (often AI-based) investing tools. We identify and demonstrate a fundamental profitability-fairness tradeoff in CoinAlgs that we call the CoinAlg Bind: CoinAlgs cannot ensure economic fairness without losing profit to arbitrage. We present a formal model of CoinAlgs, with definitions of privacy (incomplete algorithm disclosure) and economic fairness (value extraction by an adversarial insider). We prove two complementary results that together demonstrate the CoinAlg Bind. First, privacy in a CoinAlg is a precondition for insider attacks on economic fairness. Conversely, in a game-theoretic model, lack of privacy, i.e., transparency, enables arbitrageurs to erode the profitability of a CoinAlg. Using data from Uniswap, a decentralized exchange, we empirically study both sides of the CoinAlg Bind. We quantify the impact of arbitrage against transparent CoinAlgs. We show the risks posed by a private CoinAlg: even low-bandwidth covert-channel information leakage enables unfair value extraction.
Model-Free Reinforcement Learning for FX Portfolio Allocation#
This paper introduces a model-free Reinforcement Learning (RL) framework for portfolio allocation across Foreign Exchange (FX) assets, with a particular focus on carry trade strategies. The study examines whether RL-based approaches can yield distinct outcomes compared to traditional portfolio allocation techniques, such as Mean-Variance Optimization (MVO). The objective is to evaluate the performance of an RL agent in constructing a portfolio driven by FX carry signals and benchmark it against MVO. This work contributes to the literature by demonstrating the adaptability of RL to dynamic FX environments and its potential to outperform static optimization methods under varying market conditions.
Risk Beyond Volatility: A Conditional Framework for Downside Harm and Capital Loss#
Volatility remains the dominant operational proxy for risk in portfolio theory, asset pricing, and performance evaluation. Despite its widespread adoption, volatility treats upside and downside deviations symmetrically and abstracts away from the temporal and path-dependent nature of capital loss. This paper argues that these properties reflect not an economic definition of risk, but a modeling convenience rooted in early mean-variance theory.
The authors propose a conditional framework in which risk is defined as cumulative downside exposure relative to an explicit evaluation horizon and constraint set. This formulation captures both the magnitude and persistence of losses while preserving the asymmetry inherent in capital impairment. The paper shows that volatility-based metrics can misrank risk across strategies and assets exhibiting similar dispersion but substantially different drawdown dynamics.
By Ryan Nelson (The University of Tampa).
Mentioned by Ralph Sueppel in this discussion: “Paper proposes an alternative to volatility where risk is defined as cumulative downside exposure relative to an explicit evaluation horizon… It captures both the magnitude and persistence of losses while preserving the asymmetry inherent in capital impairment.”
Optimizing Liquidity Provision on Uniswap v3: A Comparative Analysis of Adaptive Strategies#
A comprehensive six-month backtesting study (April-September 2024) comparing multiple liquidity provision strategies on ETH/USDC pools in Uniswap v3. Tested approaches include constant intervals, moving averages, and dual-range allocations. The study examines capital efficiency, range width effects, and market volatility impacts, with parameter optimization across different strategy configurations. Results highlight the challenges of active liquidity management in volatile market conditions. By Zelos Research.
How Demeter Improves the Calculation of Liquidity Fees in Uniswap V3#
This post addresses the problem of fee calculation accuracy when prices cross liquidity position boundaries within a single minute. The enhanced algorithm assumes uniform price movement within one-minute intervals and allocates fees proportionally based on how many ticks the price has passed within the market-making range. This boundary crossing detection and linear price interpolation significantly improves backtesting precision for Uniswap V3 liquidity positions. By Zelos Research.
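A toy rendering of the interpolation idea for a single range: assume the price moves uniformly from the minute’s open to its close, and credit the position the fraction of that path lying inside its range. Demeter’s actual implementation works at tick granularity across many positions; the function below is only a schematic.

```python
def fee_share(minute_open: float, minute_close: float,
              range_low: float, range_high: float) -> float:
    """Fraction of one minute's fees credited to an LP range, assuming
    uniform (linear) price movement within the minute."""
    lo, hi = sorted((minute_open, minute_close))
    if hi == lo:                                   # price did not move
        return float(range_low <= lo <= range_high)
    overlap = max(0.0, min(hi, range_high) - max(lo, range_low))
    return overlap / (hi - lo)
```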
Pricing Uniswap V3 with Stochastic Process, Part 4#
A technical exposition of mathematical tools needed for pricing Uniswap V3 positions, including optimal stopping theorems, Laplace transforms, and Chapman-Kolmogorov equations. The authors establish foundations for deriving stopping time formulas that determine when liquidity positions reach price boundaries, covering martingale stopping theorem, two-boundary stopping problems, and exponential martingales. By Zelos Research.
Delta Neutral Strategy and Optimization of Uniswap V3#
Explores hedging strategies for Uniswap V3 liquidity provision using delta neutrality via AAVE borrowing. The approach divides the initial capital into two parts and uses borrowed assets to offset directional exposure while capturing fee income. The study performs backtesting to identify optimal market-making ranges through volatility-linked parameters, covering Greeks analysis, leveraged liquidity, capital allocation formulas, and volatility-adjusted range selection using the Demeter backtesting framework. By Zelos Research.
Pricing Uniswap V3 with Stochastic Process, Part 5#
Presents pricing models for Uniswap V3 positions using stochastic calculus. The work assumes geometric Brownian motion price dynamics and derives both European-style (exit only at boundaries) and American-style (exit anytime) valuation formulas. Fee collection models transition from boundary-only to continuous collection scenarios, covering optimal stopping strategies and boundary crossing problems. By Zelos Research.
An LVR Approach Proof of Guillaume Lambert’s Uniswap V3 Implied Volatility#
Demonstrates that LVR-based and Guillaume Lambert’s approaches produce identical implied volatility formulas for Uniswap V3 positions. The authors prove mathematical consistency between the two methodologies, showing both rely on similar assumptions about risk-free rates and instantaneous liquidity conditions. The proof covers LVR instantaneous loss calculations, Lambert’s IV formula, normalization approaches, and fee acquisition rates. By Zelos Research.
Implied Volatility from Uniswap V3 Liquidity Positions#
Presents methodology for calculating implied volatility in Uniswap V3 by deriving volatility perspectives from liquidity provider behaviors. The approach uses bisection methods to align theoretical option pricing with real market conditions, enabling a distribution of volatility views weighted by their liquidity’s dollar value. Covers option pricing formulas, position-level IV analysis, time series IV tracking, and weighted averaging methodology. Part 6 in the Uniswap V3 pricing series. By Zelos Research.
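The bisection step itself is generic root-finding: shrink a volatility bracket until the model value of the position matches its observed market value. A minimal sketch with a placeholder `price_model(sigma)` valuation, assuming the model value increases with volatility (flip the branch otherwise):

```python
def implied_vol(market_value: float, price_model, lo: float = 1e-4,
                hi: float = 5.0, tol: float = 1e-8) -> float:
    """Invert a monotone-in-volatility pricing function by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if price_model(mid) < market_value:
            lo = mid                 # model value too low: raise volatility
        else:
            hi = mid
    return 0.5 * (lo + hi)
```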
Uniswap v4: Insights on Performance#
A comparative performance analysis of Uniswap v4 versus v3, examining trading execution and liquidity provision metrics. The research shows that v4 trading participation has been gradually increasing and overtaking v3, and for small-to-mid size trades, v4 achieves lower levels of slippage. However, v4 maintains lower overall liquidity than v3, though fee returns are more stable. Covers hook features, trading participation metrics, slippage analysis, and fee generation stability. By Zelos Research.
Stochastic Processes and the Pricing of Uniswap V2#
Analyzes Uniswap V2 liquidity provider positions through stochastic processes, examining impermanent loss (IL) and loss versus rebalancing (LVR). The authors apply martingale stopping methods to derive pricing formulas for V2 positions, treating them as exotic options. Key findings include that the value of the V2 position is independent of volatility in their model, though they acknowledge this oversimplifies by ignoring position reconstruction costs during price swings. Covers geometric Brownian motion modeling, American perpetual option pricing, and Jensen’s inequality applications. By Zelos Research.
Are Simple Technical Trading Rules Profitable in Bitcoin Markets?#
This paper examines the profitability of simple technical trading rules in bitcoin markets comprehensively, taking into account realistic investor behavior. The study investigates whether classic technical analysis strategies such as moving average rules can generate excess returns in cryptocurrency markets, contributing to the ongoing debate about market efficiency in digital asset markets.
By Michael Frömmel and Niek Deprez, published in the International Review of Economics & Finance (2024).
Mentioned by Jungle Rock in this discussion.
Quality Minus Junk#
This paper provides a tractable valuation model that shows how stock prices should increase in their quality characteristics: profitability, growth, and safety. A “quality” security is defined as one that is safe, profitable, growing, and well managed. Empirically, the authors find that high-quality stocks do have higher prices on average but not by a large margin, and high-quality stocks have high risk-adjusted returns. A quality-minus-junk (QMJ) factor that goes long high-quality stocks and shorts low-quality stocks earns significant risk-adjusted returns in the United States and across 24 countries.
By Clifford S. Asness, Andrea Frazzini, and Lasse Heje Pedersen.
Mentioned by Kurtis The Quant in this discussion.
Episodic Factor Pricing#
This paper challenges conventional factor models by showing that factor pricing power is time varying and frequently switches between active and inactive states. The authors propose a real-time method to identify factor pricing states, showing that conditioning on these states materially improves out-of-sample multifactor portfolio performance, even after transaction costs. A conditional stochastic discount factor with state-dependent risk prices provides a better description of the investment opportunity set. Across a broad set of factors, pricing power is concentrated in active states and largely absent otherwise, implying that factor premia and risk prices are inherently episodic rather than persistent.
By Sophia Zhengzi Li, Peixuan Yuan, and Guofu Zhou.
Mentioned by Ivan Blanco in this discussion.
All Days Are Not Created Equal: Understanding Momentum by Learning to Weight Past Returns#
By flexibly weighting the information contained in past realized returns, the authors construct a momentum strategy that outperforms and subsumes the performance of traditional stock momentum. The strategy performs well in crises and continues to work in recent decades, circumventing the issue of momentum crashes. The authors show that the way past returns are weighted is consistent with the strategy exploiting an underreaction to information contained in realized returns. Earnings announcements, market-wide jumps, and large individual returns realized during the formation period are found to be most informative about future stock returns.
By Heiner Beckmeyer and Timo Wiedemann, published in the Journal of Banking and Finance (2025).
Mentioned by Ivan Blanco in this discussion.
Beat the Market: An Effective Intraday Momentum Strategy for S&P500 ETF (SPY)#
This paper investigates the profitability of a simple yet effective intraday momentum strategy applied to SPY, one of the most liquid ETFs tracking the S&P 500. Unlike academic literature that typically limits trading to the last 30 minutes of the trading session, this model initiates trend-following positions as soon as there is an indication of abnormal demand/supply imbalance in the intraday price action. The strategy introduces dynamic trailing stops to mitigate downside risks while allowing for unlimited upside potential. From 2007 to early 2024, the resulting intraday momentum portfolio achieved a total return of 1,985% (net of costs), an annualized return of 19.6%, and a Sharpe Ratio of 1.33.
By Carlo Zarattini, Andrew Aziz, and Andrea Barbon.
Mentioned by Pasta Capital in this discussion.
A Unified Framework for Anomalies based on Daily Returns#
Numerous cross-sectional equity anomalies draw on the same underlying information: the sequence of daily returns over the previous month. Using a data-driven approach, the authors estimate the empirical mapping from the distribution of last month’s daily returns to future performance without imposing functional forms. The resulting Daily Return Information Factor (DRIF) earns economically large premia, holds across subsamples and research designs, and remains significant after controlling for the modern factor zoo. DRIF subsumes most short-horizon and lottery-style anomalies and emerges as a key factor in asset pricing tests.
By Nusret Cakici, Christian Fieberg, Gabor Neszveda, Robert J. Bianchi, and Adam Zaremba.
Mentioned by Quantitativo in this discussion: “The factor zoo isn’t crowded. It’s redundant. Daily returns already contain the signal — we just kept slicing them the wrong way.”
ASRI: An Aggregated Systemic Risk Index for Cryptocurrency Markets#
This paper introduces the Aggregated Systemic Risk Index (ASRI), a composite measure comprising four weighted sub-indices: Stablecoin Concentration Risk (30%), DeFi Liquidity Risk (25%), Contagion Risk (25%), and Regulatory Opacity Risk (20%).
The framework incorporates data from DeFi Llama, Federal Reserve FRED, and on-chain analytics, and was validated against historical crises including Terra/Luna (May 2022), Celsius/3AC (June 2022), FTX (November 2022), and SVB (March 2023). Event study analysis detected statistically significant signals for all four crises with an average lead time of 18 days. A three-regime Hidden Markov Model identifies distinct Low Risk, Moderate, and Elevated states with regime persistence exceeding 94%, and out-of-sample testing on 2024-2025 data confirmed zero false positives.
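A sketch of the regime-identification step using the hmmlearn package; fitting a Gaussian HMM directly on index levels is an assumption here, since the summary does not specify the paper’s exact feature set.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def fit_risk_regimes(asri: np.ndarray, n_states: int = 3) -> np.ndarray:
    """Fit a three-state Gaussian HMM to the risk index and return the
    most likely state path (e.g. low / moderate / elevated risk)."""
    X = np.asarray(asri, dtype=float).reshape(-1, 1)
    hmm = GaussianHMM(n_components=n_states, covariance_type="full",
                      n_iter=200, random_state=0).fit(X)
    return hmm.predict(X)            # Viterbi state sequence
```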
The ASRI framework addresses a critical gap in risk monitoring by capturing DeFi-specific vulnerabilities—composability risk, flash loan exposure, and tokenized real-world asset linkages—that traditional systemic risk measures cannot accommodate.
By Murad Farzulla and Andrew Maksakov.
Mentioned by Saeed in this discussion.
R&D Alpha: Investment Intensity and Long-Term Stock Returns#
This paper tests whether high research and development (R&D) intensity predicts higher subsequent equity returns in a large-cap U.S. universe using methodology designed for portfolio implementability. Each year, S&P 500 stocks are sorted into quintiles by R&D intensity (R&D expense divided by revenue) and subsequent returns are evaluated with timing designed to mitigate look-ahead bias.
The high-minus-low R&D factor averages 3.73% per year, with monthly factor spanning tests confirming a statistically distinct premium (FF5 alpha = 4.37%, p < 0.01). The investable RD20 strategy, a simple long-only portfolio holding the top 20 stocks by R&D intensity equal-weighted, delivers 7.52% annual excess return versus SPY after transaction costs. The paper documents sector tilts, factor exposures, and regime dependence, noting that much of the value comes from sector tilts and regime dependence rather than a clean textbook factor.
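The annual sort is a few lines of pandas; a sketch assuming a long-format DataFrame with hypothetical columns `year`, `rd_expense`, `revenue`, and next-year returns `fwd_ret`:

```python
import pandas as pd

def rd_quintile_spread(df: pd.DataFrame) -> pd.Series:
    """Yearly high-minus-low return spread from quintile sorts on
    R&D intensity (R&D expense divided by revenue)."""
    df = df.assign(rd_intensity=df.rd_expense / df.revenue)
    df["quintile"] = df.groupby("year")["rd_intensity"].transform(
        lambda x: pd.qcut(x, 5, labels=False, duplicates="drop"))
    by_q = df.groupby(["year", "quintile"])["fwd_ret"].mean().unstack()
    return by_q[4] - by_q[0]         # Q5 (high R&D) minus Q1 (low R&D)
```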
By Abhishek Sehgal.
Mentioned by Ivan Blanco in this discussion: “Worth a read for anyone thinking seriously about intangible capital, innovation exposure, and practical factor implementation.”
Magnificent, but Not Extraordinary: Market Concentration in the US and Beyond#
This paper examines equity market concentration in the US since 1926 and in several developed markets. The authors find that current index weights of the largest firms align with historical and international patterns, and that valuation concentration moves with earnings concentration. A geometric Brownian motion benchmark with firm-specific shocks reproduces observed concentration, with idiosyncratic volatility identified as the key driver.
The central finding is that high concentration alone does not justify deviations from market weights or policy conclusions about firm breakups. The market portfolio remains optimal in the authors’ benchmark framework. Their evidence constrains pure multiple-expansion narratives and behavioral channels by linking valuations to fundamentals, pushing back on the popular narrative that the Magnificent 7 represent an unprecedented anomaly.
By Per Bye, Jens Soerlie Kvaerner, and Bas J.M. Werker.
Mentioned by Ivan Blanco in this discussion: “If you believe today’s US equity market is uniquely concentrated because of the Magnificent 7, history may disagree.”
Credit Spread News and Financial Market Risk#
This paper shows that credit spread news, defined as changes and absolute changes in corporate bond credit spreads, predicts a substantial share of future variation in financial market risk. The author first documents a strong and robust predictive relationship between credit spread news and financial market risk, then investigates the economic mechanism underlying this relationship.
Both theoretical and empirical evidence highlight a central role for financial intermediaries’ risk expectations in driving the predictive power of credit spread changes. The findings establish credit spread news as statistically significant and economically meaningful predictors of financial market risk, offering a practical signal for macro-oriented systematic traders.
By Fabrizio Ghezzi.
Mentioned by Ralph Sueppel in this discussion.
Trend Following in Strategic Asset Allocation: A Long-Horizon Analysis and Retail-Oriented Implementation#
Traditional portfolio construction frameworks rely on static asset allocation and cross-asset diversification to manage risk and improve long-term outcomes. This paper investigates the role of trend following as a structural component of strategic asset allocation, rather than as a standalone return-seeking strategy. Using long-horizon historical data from 1979 to 2025, the authors examine whether systematic trend-based exposure management can complement traditional diversification by addressing risk from a different dimension: the temporal evolution of market trends.
The results suggest that incorporating trend following as a structural overlay can provide a complementary form of diversification — one based on time and regime dynamics rather than asset classes alone — potentially improving portfolio efficiency and resilience without relying on return forecasting or discretionary market timing. Simple equity trend filters such as 10-month moving averages or 12-1 momentum signals deliver comparable returns to buy-and-hold while substantially reducing maximum drawdown and improving risk-adjusted performance.
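Both filters named above are one-liners on monthly data; a sketch (signals computed on month-end closes and lagged one month before trading):

```python
import pandas as pd

def trend_filters(monthly_close: pd.Series) -> pd.DataFrame:
    """Month-end trend signals: price above its 10-month moving average,
    and positive 12-1 momentum (return from 12 to 1 months ago)."""
    sma10 = monthly_close.rolling(10).mean()
    mom_12_1 = monthly_close.shift(1) / monthly_close.shift(12) - 1
    signals = pd.DataFrame({
        "above_10m_sma": (monthly_close > sma10).astype(int),
        "mom_12_1_positive": (mom_12_1 > 0).astype(int),
    })
    return signals.shift(1)          # trade next month on this month's signal
```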
By Gabriele Galletta (Investimento Custodito).
Mentioned by Ivan Blanco in this discussion: “Trend following is not about alpha. It’s about risk control.”
A Quantitative Trading Strategy Based on A Position Management Model#
This paper establishes a quantitative trading strategy based on a position management model, applied to gold and bitcoin trading. The approach combines ARIMA time-series forecasting for price prediction with a position management framework that governs trade sizing and entry/exit rules. The authors develop differential autoregressive moving average models calibrated at different cycle times, finding that a 60-day data window produces the smallest prediction error, with the relative error of the average prediction value controlled at 0.003016. The position management model then uses these forecasts to determine optimal trade timing and allocation.
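A sketch of the rolling refit-and-forecast loop with statsmodels, using the paper’s 60-day window; the (1, 1, 1) order is a placeholder, since the summary reports only the differencing setup and window length.

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

def rolling_arima_forecast(prices: pd.Series, window: int = 60,
                           order=(1, 1, 1)) -> pd.Series:
    """One-step-ahead price forecasts from an ARIMA model refit each day
    on a rolling window of recent prices."""
    preds = {}
    for end in range(window, len(prices)):
        hist = prices.iloc[end - window:end]
        fit = ARIMA(hist, order=order).fit()
        preds[prices.index[end]] = float(fit.forecast(1).iloc[0])
    return pd.Series(preds)
```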
The strategy achieves an annualized rate of return of 25%, with accumulated income reaching $223,640.58 USD by September 10, 2021. Profitability and risk resistance are evaluated using Principal Component Analysis, and model validation via parameter variation confirms the solution is locally optimal and consistent with the initial parameterization. Sensitivity analysis shows that as initial commission increases or principal decreases, both trade count and returns decline gradually, confirming the model behaves as expected under parameter perturbation.
Efficient Portfolio Estimation in Large Risky Asset Universes#
This paper addresses the challenge of constructing efficient portfolios within a large investment universe composed exclusively of risky assets. The authors derive a linearly constrained regression representation of the efficient portfolio, which circumvents the need to estimate the mean vector and covariance matrix. Instead, they apply constrained sparse regression techniques (Linearly Constrained LASSO) to estimate portfolio weights directly.
The key insight is that in many real-world settings — such as institutional equity funds, emerging markets with unstable sovereign debt, or decentralized finance — a risk-free asset is unavailable. Traditional approaches like sample-based plug-in estimators, the 1/N rule, or minimum variance portfolios struggle to achieve mean-variance efficiency in large asset pools. By recasting the efficient portfolio problem as a linearly constrained regression, the authors bypass the notoriously difficult estimation of high-dimensional covariance matrices and mean vectors.
Theoretically, the authors establish asymptotic mean-variance efficiency of the estimated portfolio as both the number of assets and the sample size proportionally approach infinity. In extensive simulations and empirical studies using S&P 500 constituents with out-of-sample returns from 1981 to 2024, the method yields portfolios that satisfy specified risk levels, achieve superior Sharpe ratios, and outperform various benchmarks including equally weighted, minimum variance, and other sparse portfolio methods — both net and gross of transaction costs.
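The estimator is easy to write down as a convex program; a cvxpy sketch of a linearly constrained LASSO in this spirit, where the response `y` and penalty `lam` stand in for the paper’s exact construction:

```python
import cvxpy as cp
import numpy as np

def constrained_lasso_weights(R: np.ndarray, y: np.ndarray,
                              lam: float = 0.1) -> np.ndarray:
    """Sparse, fully invested portfolio weights: LASSO regression of a
    target series y on the asset return matrix R (T x N)."""
    n = R.shape[1]
    w = cp.Variable(n)
    objective = cp.Minimize(cp.sum_squares(y - R @ w) + lam * cp.norm1(w))
    problem = cp.Problem(objective, [cp.sum(w) == 1])   # budget constraint
    problem.solve()
    return w.value
```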
By Leheng Chen, Yingying Li, and Xinghua Zheng (Hong Kong University of Science & Technology).
Mentioned by Piotr Pomorski in this discussion.
Multiples for Valuation: Go High, Go Low, Ignore the Middle#
This paper examines whether valuation multiples such as D/P (dividend-to-price), P/E (price-to-earnings), and CAPE (cyclically adjusted P/E) can forecast stock returns, and under what conditions they are most useful. Using US data spanning 1871–2025, the author finds that multiples are far more useful at predicting forward returns when they are at relatively high or low levels than when they sit in the middle of their historical range.
The key finding is that the predictive power of valuation multiples is concentrated at the extremes. When multiples fall into the top or bottom quartile of their historical distribution, the in-sample correlation with subsequent approximately ten-year returns is substantially higher, with R² reaching up to 0.70. Out-of-sample forecasts generated from extreme multiples also significantly outperform those derived from mid-range multiples. The practical implication is that investors should pay close attention to valuations when they are unusually stretched in either direction, but can largely ignore them when they are near the middle of their historical range.
By Javier Estrada (IESE Business School).
Mentioned by Ivan Blanco in this discussion: “Do multiples predict returns? Valuations Only Matter at Extremes.”
Covariance Implied Risk Factors#
This paper examines the role of heteroskedasticity in extracting latent risk factors from asset returns. Standard principal component analysis (PCA) suffers from distortions when assets exhibit heterogeneous idiosyncratic variances, causing estimated factors to reflect clusters of idiosyncratic risk rather than true systematic risk. The author applies heteroskedastic PCA (heteroPCA) to correct for this bias by iteratively replacing the diagonal of the sample covariance matrix with estimates implied by the off-diagonal structure.
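The iteration is short enough to sketch in full: delete the contaminated diagonal, fit a rank-$r$ approximation from the off-diagonal structure, impute the diagonal from that fit, and repeat. This follows the heteroPCA procedure as described above; the iteration count is an arbitrary choice.

```python
import numpy as np

def hetero_pca(cov: np.ndarray, rank: int, n_iter: int = 50) -> np.ndarray:
    """HeteroPCA-style loadings: iteratively replace the diagonal of the
    covariance matrix with the diagonal implied by a rank-`rank` fit."""
    S = cov.copy().astype(float)
    np.fill_diagonal(S, 0.0)                 # delete the contaminated diagonal
    for _ in range(n_iter):
        vals, vecs = np.linalg.eigh(S)
        top_vecs = vecs[:, -rank:]           # leading eigenvectors
        low_rank = (top_vecs * vals[-rank:]) @ top_vecs.T
        np.fill_diagonal(S, np.diag(low_rank))   # impute diagonal from fit
    return top_vecs                          # estimated factor loadings
```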
HeteroPCA delivers substantially better out-of-sample cross-sectional pricing performance compared to standard PCA across multiple equity portfolio sorts. The identified factors exhibit clearer economic interpretability, and the implied stochastic discount factor achieves lower Hansen-Jagannathan distances. The method trades off slightly worse time-series fit for much stronger cross-sectional pricing power, a tradeoff the author argues is economically favorable.
Key results: On AP-Tree portfolios, heteroPCA achieves out-of-sample Sharpe ratios of 0.46 (Tree10) and 0.55 (Tree40), compared to 0.18 and 0.26 from standard PCA. Across double-sorted portfolios, heteroPCA consistently outperforms: Size & Book-to-Market Sharpe ratio 0.28 vs 0.15, Size & Accruals 0.21 vs 0.13, Size & Investment 0.32 vs 0.20, and Size & Idiosyncratic Volatility 0.35 vs 0.21. Sharpe ratio gains often exceed 50% relative to standard PCA. RMS pricing errors are also lower, with heteroPCA reducing RMS alpha from 0.85-0.90 to 0.72-0.80 on AP-Tree portfolios.
By Mohammed Mehdi Kaebi (Insper Institute of Education and Research).
Mentioned by Ivan Blanco in this discussion: “Your PCA might be lying to you. Standard PCA distorts latent factors when assets have different idiosyncratic variances. The fix? Heteroskedastic PCA.”
Asset Allocation: From Markowitz to Deep Reinforcement Learning#
This paper benchmarks nine asset allocation strategies spanning traditional Modern Portfolio Theory and deep reinforcement learning. The traditional methods include the tangency portfolio, minimum variance, risk parity, and equal weight. The DRL methods include A2C, PPO, DDPG, SAC, and TD3. Each strategy is evaluated across both bullish and bearish market environments using real stock data.
Traditional MPT-based approaches deliver stable, consistent results without requiring a training phase. The tangency portfolio achieves the highest Sharpe and Calmar ratios across scenarios. DRL agents can surpass traditional methods in bull markets at their best (SAC achieved 179% annual return and a 2.58 Sharpe ratio), but exhibit high variance across runs due to stochastic optimization. In their worst runs, DRL agents fail to outperform the simple equal weight baseline. In bear markets the performance gap between traditional and DRL approaches narrows substantially, and DRL results become less reliable. The author suggests that more training data, additional technical indicators, and architectures like transformers could help stabilize DRL performance.
By Ricard Durall (Open University of Catalonia).
Mentioned by Jungle Rock in this discussion.
Order Flow and Exchange Rate Dynamics#
This paper challenges traditional macroeconomic models of exchange rate determination, which typically produce poor forecasting results. Instead, the authors introduce a microstructure-based approach incorporating order flow — the net of buyer-initiated and seller-initiated trades — as a key determinant of exchange rate movements. Their model achieves R² statistics above 50% for daily exchange-rate changes in the DM/$ spot market, vastly outperforming standard macro models and providing superior short-horizon forecasts compared to random walk models.
The central empirical result is that $1 billion of net dollar purchases increases the DM price of a dollar by about 1 pfennig. This finding established order flow as a primary driver of FX price formation and has become a foundational reference for understanding flow-driven dynamics in other asset classes, including cryptocurrencies.
By Martin D.D. Evans and Richard K. Lyons. Published in Journal of Political Economy (2002), vol 110(1), pp 170-180.
Trading and Arbitrage in Cryptocurrency Markets#
This paper studies the efficiency and price formation of cryptocurrency markets. The authors document large, recurrent arbitrage opportunities across exchanges, with price deviations much larger across countries than within countries, highlighting the role of capital controls in limiting arbitrage capital flows. The common component of signed volume on each exchange explains approximately 80% of Bitcoin returns, while idiosyncratic components help explain arbitrage spreads between exchanges.
Price deviations across countries co-move and widen during periods of large Bitcoin appreciation, with countries exhibiting higher Bitcoin premia over the US price seeing the largest arbitrage deviations. The paper provides foundational evidence that Bitcoin’s price formation is heavily flow-driven, resembling foreign exchange markets more than equities.
By Igor Makarov and Antoinette Schoar. Published in Journal of Financial Economics (2020), vol 135, issue 2, pp 293-319.
Order Flow and Cryptocurrency Returns#
This paper investigates the role of order flow in explaining and predicting cryptocurrency returns. The authors construct an aggregate “world” order flow measure from 11 major fiat currencies across multiple exchanges and test its explanatory power on a cross-section of 82 cryptocurrencies using panel regressions with controls for established return predictors.
World order flow explains approximately 11% of daily and 20% of weekly cross-sectional cryptocurrency returns, with strong evidence of permanent price impact — indicating that order flow carries genuine information rather than transient liquidity effects. The paper demonstrates that order flow dominates macroeconomic fundamentals for out-of-sample prediction, especially when using non-linear machine learning models. Long-short portfolios constructed from ML forecasts conditioning on daily order flow achieve an alpha of up to 0.79% per day with an annualized Sharpe ratio of 3.63.
By Alexia Anastasopoulos, Nikola Gradojevic, Jiaao Liu, Alex Maynard, and Christos Tsiakas. Published in Journal of Financial Markets (2026).
Risk-adjusted Momentum Strategy Construction and Industry Heterogeneity Analysis Based on STARR Indicator#
This paper proposes a risk-adjusted momentum strategy using the STARR (Stable Tail-adjusted Return Ratio) indicator — a metric that replaces standard deviation in the Sharpe ratio with Conditional Value at Risk (CVaR) to better capture downside tail risk. The study constructs both Sharpe-based and STARR-based momentum factors across industry-level ETFs from the S&P 500 and Nikkei 225, applying mean-variance optimisation to monthly data spanning 2010–2025.
The STARR-based strategy demonstrates superior downside risk control compared to conventional Sharpe-based momentum, particularly during extreme market conditions such as the COVID-19 crisis. Performance varies significantly across sectors and volatility regimes, confirming meaningful industry heterogeneity in momentum returns. The strategy maintains robustness through alternative parameter configurations and cross-market validation between U.S. and Japanese equities, suggesting that incorporating downside-sensitive metrics like CVaR into momentum signal construction can enhance risk-adjusted returns and improve portfolio stability in diverse market environments.
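A minimal sketch of the STARR computation — our illustration; the 5% tail level is an assumption, not necessarily the paper's choice:

```python
import numpy as np

def starr(returns, rf=0.0, alpha=0.05):
    """STARR: mean excess return divided by Conditional Value at Risk,
    where CVaR is the average loss in the worst alpha-fraction of periods."""
    excess = returns - rf
    var = np.quantile(excess, alpha)          # alpha-quantile (a loss threshold)
    cvar = -excess[excess <= var].mean()      # average loss beyond the threshold
    return excess.mean() / cvar

def sharpe(returns, rf=0.0):
    excess = returns - rf
    return excess.mean() / excess.std(ddof=1)
```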
By Xupeng Zhang (McGill University).
Scale Dependent Dynamics in Equity Market Phase Space#
This paper extends the stochastic damped harmonic oscillator (SDHO) framework to longer timescales and investigates why the transition from mean-reversion to momentum occurs at the 2-3 month horizon where the Jegadeesh-Titman momentum anomaly begins. Using phase-space analysis on 32 years of SPY daily prices, 154 years of Shiller S&P 500 data, and 308 years of Bank of England UK equity data, the authors show that mean-reversion strength decays as a power law with observation horizon, following k(tau) = A * tau^(-alpha).
The paper identifies the Voronoi-Delaunay tessellation as mathematically equivalent to the finite-volume discretization of the Fokker-Planck equation on irregular phase-space data, connecting the empirical dynamics to non-equilibrium statistical mechanics. A 2x2x2 factorial analysis on both SPY daily data and ES futures minute data (5.38 million observations) resolves an apparent R-squared discrepancy between overlapping and non-overlapping sampling, finding that the sampling effect accounts for 92-97% of the apparent improvement across six futures markets spanning five asset classes. A null-model test shows that the power-law decay arises from temporal coarse-graining of stationary SDHO dynamics rather than from scale-dependent coupling constants, though the practical implications for momentum and mean-reversion strategies remain the same.
Key metrics: the mean-reversion coefficient follows k(tau) = 1.61 * tau^(-1.13) with R-squared = 0.99 across 34 measurements spanning 462 combined years of data. Mean-reversion remains strictly positive (k > 0 in 34/34 measurements) but decays to negligible levels at the momentum horizon. All six futures markets tested cluster within a narrow parameter band (R-squared in [0.56, 0.60], omega in [1.11, 1.22]), suggesting a common dynamical equilibrium across liquid markets.
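The reported power law can be recovered by ordinary least squares in log-log space; a minimal sketch on synthetic measurements (the horizons below are illustrative, not the paper's 34 measurement points):

```python
import numpy as np

rng = np.random.default_rng(0)
tau = np.array([1, 2, 5, 10, 21, 42, 63], dtype=float)   # horizons in days
k = 1.61 * tau ** -1.13 * np.exp(0.05 * rng.standard_normal(len(tau)))  # synthetic k(tau)

# log k = log A - alpha * log tau, so an OLS fit in log-log space recovers both
slope, intercept = np.polyfit(np.log(tau), np.log(k), 1)
A, alpha = np.exp(intercept), -slope
print(f"k(tau) ~ {A:.2f} * tau^(-{alpha:.2f})")          # ~ 1.61 * tau^(-1.13)
```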
Mentioned by Symplectic.Research in this discussion.
By Bruce H. Dean (Independent Researcher), March 2026.
Vault Allocation Strategies#
This post examines the quantitative limitations of current crypto vault allocation strategies in DeFi. The author argues that most vault strategies marketed as “yield optimization” are in practice rule-of-thumb routing engines rather than proper quantitative portfolio construction. The analysis identifies five structural weaknesses: absence of formal optimization frameworks (no mean-variance or drawdown minimization, just “higher APY wins”), weak risk modeling relying on caps and allowlists rather than distributions and tail scenarios, ignored correlation across risk factors, static or reactive allocation with little predictive signal, and under-modeled liquidity where withdrawal queues are not priced into allocation decisions.
The post concludes that current vaults are constrained routing engines with basic risk controls sitting on top of structural yield sources (emissions, lending demand, structural carry), not quantitative funds. The author proposes that the next generation of vaults should incorporate proper objective functions, factor-aware allocation, correlation modeling, and liquidity-adjusted returns to move from the “rules-based allocator” phase to a true “quant portfolio” phase.
No quantitative backtest metrics are presented — the post is a qualitative framework analysis rather than an empirical study with performance results.
Mentioned by Amy O. Khaldoun (Vess3l, Quant Consultant) in this LinkedIn post.
Dynamic Factor Allocation via Momentum-Based Regime Switching#
This paper presents a systematic framework for dynamically allocating across five equity factors — Value, Size, Momentum, Quality, and Growth — using a momentum-based regime switching model. The authors use a z-score normalization approach with only two hyperparameters to classify Bull and Bear regimes for each factor through normalized trend signals. Regime identification is statistically significant across all factors, with Size and Growth showing strong significance (p < 0.01). The key insight is asymmetric: Bull regimes exhibit systematic positive forward returns, while Bear regimes show no significant pattern, supporting an approach that overweights factors in Bull regimes and underweights or avoids those in Bear regimes.
The factor timing strategy is validated through ETF-based backtesting over the 1998–2025 period, demonstrating practical implementability with only modest performance degradation from tracking error and expense ratios. The framework uses a Black-Litterman-style integration to combine regime signals with portfolio construction.
Key metrics: the strategy achieves a Sharpe ratio of 0.66 compared to 0.59 for an equal-weighted benchmark, with annualized returns of 13.0% versus 11.3%. The calendar year hit rate is 78.6%, with turnover of approximately 3x annually. Cross-sectional validation confirms Bull regime significance (t = 1.98, p = 0.047).
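A minimal sketch of a z-scored trend classifier in the spirit described above; the two window lengths stand in for the paper's two hyperparameters and are our assumptions:

```python
import numpy as np
import pandas as pd

def classify_regimes(prices: pd.Series, trend_window=126, z_window=252):
    """Bull/Bear classification from a normalised trend signal: z-score
    the trailing return against its own rolling distribution."""
    trend = prices.pct_change(trend_window)
    z = (trend - trend.rolling(z_window).mean()) / trend.rolling(z_window).std()
    return pd.Series(np.where(z > 0, "Bull", "Bear"), index=prices.index)
```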
By Jim Tai, Stephanie Leung, and Justin Jimenez (StashAway), February 2026.
AlgoXpert Alpha Research Framework: A Rigorous IS WFA OOS Protocol for Mitigating Overfitting in Quantitative Strategies#
Transitioning a strategy from backtest to live trading is a common failure point for quantitative systems due to parameter overfitting, selection bias, and sensitivity to regime changes. This paper presents the AlgoXpert Alpha Research Framework, a standardised three-stage evaluation protocol: In-Sample (IS) analysis that targets stable parameter regions rather than single optima; Walk-Forward Analysis (WFA) with rolling windows and purge gaps to prevent information leakage, governed by majority-pass and catastrophic-veto rules; and Out-of-Sample (OOS) testing under strict parameter lock with no further tuning.
The framework applies a defense-in-depth structure with three layers: structural safeguards (cliff veto), execution controls (spread and leverage guards), and equity protection mechanisms (circuit breakers and a kill switch). A case study on USDJPY M5 intraday data demonstrates how to detect overfitting through performance decay and drawdown behaviour across chronological stages. A post-validation comparison of four alpha variants (v1–v4) reveals rank reversal when the optimisation objective changes from maximising Sharpe ratio to minimising maximum drawdown — illustrating the fundamental trade-off between risk-adjusted performance and tail risk control.
Key metrics: The paper compares four alpha variants in a USDJPY M5 case study. Rank reversal between variants is observed when switching the objective from Sharpe maximisation to max drawdown minimisation. No single aggregate Sharpe/return figure is presented; the framework is methodological rather than a performance report of a specific strategy.
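A minimal sketch of the WFA stage's rolling splits with a purge gap; window lengths are illustrative assumptions, and the majority-pass and catastrophic-veto rules would sit on top of these splits:

```python
def walk_forward_splits(n, train_len=5000, test_len=1000, purge=50):
    """Yield (train, test) index ranges for rolling walk-forward analysis.
    A purge gap of `purge` bars is skipped between train and test so that
    overlapping labels cannot leak information forward."""
    start = 0
    while start + train_len + purge + test_len <= n:
        train = range(start, start + train_len)
        test = range(start + train_len + purge,
                     start + train_len + purge + test_len)
        yield train, test
        start += test_len          # roll forward by one test window
```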
Mentioned by QFinancePapers in this discussion.
By The Anh Pham, Bao Chan Nguyen, and Nguyet Nguyen Thi, March 2026.
Optimizing Portfolio Performance through Clustering and Sharpe Ratio-Based Optimization: A Comparative Backtesting Approach#
This paper introduces a two-stage portfolio construction framework that combines K-Means clustering with Sharpe ratio-based weight optimisation. In the first stage, a universe of financial assets is segmented into clusters based on historical log-returns, grouping together assets with similar return characteristics. In the second stage, each cluster is treated as a sub-portfolio and Sharpe ratio maximisation is applied within each cluster to derive optimal weights — directly incorporating the return/volatility trade-off rather than relying on mean-variance optimisation’s sensitivity to expected return estimates.
The framework is evaluated through a backtesting study across multiple asset classes, comparing the cumulative returns of optimised per-cluster portfolios against a traditional equal-weighted benchmark. The approach allows targeted portfolio construction within homogeneous asset groups while maintaining diversification across clusters.
Key metrics: Optimised portfolios outperform the equal-weighted benchmark in cumulative return terms across the backtest period. Specific Sharpe ratio, max drawdown, and annualised return figures are not reported in the abstract; the primary comparison is cumulative return trajectories per cluster versus the equal-weighted baseline.
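A minimal sketch of the two-stage construction, assuming a `returns` DataFrame of asset log-returns and treating the sample means as excess returns; scipy's SLSQP solver handles the simplex constraint:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.cluster import KMeans

def cluster_then_optimise(returns, n_clusters=4):
    """Stage 1: K-Means on historical return profiles (assets as samples).
    Stage 2: long-only max-Sharpe weights within each cluster."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(returns.T.values)
    portfolios = {}
    for c in range(n_clusters):
        sub = returns.iloc[:, labels == c]
        mu, Sigma = sub.mean().values, sub.cov().values
        n = len(mu)
        res = minimize(
            lambda w: -(w @ mu) / np.sqrt(w @ Sigma @ w),   # negative Sharpe
            np.ones(n) / n,
            bounds=[(0.0, 1.0)] * n,
            constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
        )
        portfolios[c] = dict(zip(sub.columns, res.x))
    return portfolios
```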
Mentioned by QFinancePapers in this discussion.
By Keon Vin Park, January 2025.
The Sharpe Stability Ratio: Temporal Consistency of Risk-Adjusted Performance#
This paper introduces the Sharpe Stability Ratio (SSR), a performance metric that evaluates the temporal consistency of risk-adjusted returns. While the standard Sharpe ratio summarizes average excess return per unit of risk over a fixed sample, it cannot distinguish persistent skill from episodic outperformance — two strategies may display identical ex-post Sharpe ratios yet differ sharply in their temporal profiles. SSR addresses this gap by treating the rolling Sharpe ratio as a time-series object and defining stability as the ratio of mean rolling performance to its heteroskedasticity-and-autocorrelation-consistent (HAC) standard deviation.
The paper demonstrates four practical applications of SSR. First, it reveals that strategies with similar point-in-time Sharpe ratio and Probabilistic Sharpe Ratio (PSR) can exhibit markedly different stability profiles, providing critical information for due diligence and manager selection. Second, it quantifies the strong serial correlation induced by overlapping rolling windows, showing that naive dispersion measures severely understate uncertainty and that HAC correction is indispensable for valid temporal inference. Third, SSR supports formal hypothesis testing on stability via block bootstrap procedures that preserve the dependence structure of returns. Fourth, it demonstrates that statistically credible aggregate performance (PSR close to one) does not guarantee temporal consistency — high average Sharpe ratios may be generated by concentrated episodic strength rather than sustained skill.
Key metrics: This is a methodological paper introducing a new risk metric rather than a backtested trading strategy. No annualised return, Sharpe ratio, or drawdown figures are reported for a specific strategy. The primary empirical validation uses controlled simulations and hedge fund index data to show that SSR delivers complementary insights relative to SR and PSR, successfully separating genuinely stable performance from volatile profiles that appear credible under static evaluation. The paper is 63 pages and covers formal derivations, Monte Carlo experiments, and empirical applications across hedge fund styles.
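A minimal sketch of the SSR construction, assuming monthly returns; one way to obtain a HAC standard deviation for the rolling-Sharpe series is to scale the Newey-West standard error of its mean back up by the square root of the sample size:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def sharpe_stability_ratio(returns: pd.Series, window=36, hac_lags=None):
    """SSR sketch: mean of the rolling Sharpe series over its HAC standard
    deviation. Overlapping windows induce MA(window-1)-type dependence,
    so a Newey-West estimator replaces the naive standard deviation."""
    roll = returns.rolling(window)
    rolling_sharpe = (roll.mean() / roll.std() * np.sqrt(12)).dropna()  # annualised
    lags = hac_lags if hac_lags is not None else window - 1
    ols = sm.OLS(rolling_sharpe.values, np.ones(len(rolling_sharpe)))
    fit = ols.fit(cov_type="HAC", cov_kwds={"maxlags": lags})
    hac_sd = fit.bse[0] * np.sqrt(len(rolling_sharpe))  # long-run sd from the NW SE
    return rolling_sharpe.mean() / hac_sd
```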
By Mario Bajo Traver (Bank of Spain) and Alejandro Rodriguez Dominguez (University of Reading; Miralta Finance Bank S.A.), January 2026.
Mentioned by Piotr Pomorski in this discussion: “Something on sharpe ratio for temporal stability.”
Cross-Sectional Skewness#
This paper investigates what distribution best characterizes the time series and cross section of individual stock returns. The authors estimate the degree of cross-sectional return skewness relative to a flexible benchmark that nests many standard models in the literature, including lognormal, stochastic volatility, and jump-diffusion specifications. The central finding is a striking asymmetry across horizons: cross-sectional skewness in monthly returns far exceeds what the benchmark model predicts, while cross-sectional skewness in long-run returns is substantially below what the model predicts.
The resolution lies in fat-tailed idiosyncratic events — rare, firm-specific jumps that generate extreme short-term returns but whose effects attenuate over longer holding periods. The authors show that incorporating power-law-distributed idiosyncratic shocks into the return-generating process is necessary to reconcile both the excess short-horizon skewness and the deficit of long-horizon skewness observed in the data. This has direct implications for asset pricing and portfolio construction: models that ignore fat-tailed idiosyncratic risk will systematically misprice lottery-like payoffs at short horizons and underestimate diversification benefits at longer horizons.
Key metrics: This is a theoretical and empirical asset pricing paper rather than a backtested trading strategy. The primary quantitative results concern the magnitude of cross-sectional skewness relative to model predictions across different return horizons. No annualised return, Sharpe ratio, or drawdown figures are reported. The paper’s contribution is to the understanding of return distributions and the role of rare idiosyncratic events in shaping cross-sectional risk.
By Sangmin Oh (Columbia Business School) and Jessica A. Wachter (Wharton / NBER / SEC). Published in Review of Asset Pricing Studies 12(1):155-198, March 2022.
Mentioned by Darren (@ReformedTrader) in this discussion.
Beyond delta neutrality: Confidence-scaled hedging with machine learning forecasts#
This paper studies whether machine learning forecasts can enhance option portfolio performance by relaxing strict delta neutrality. The authors propose a confidence-scaled hedging framework that dynamically adjusts hedge ratios based on the classification confidence of ML models. Rather than targeting a delta of zero, the framework applies a parameter α as a multiplier to the hedge ratio: when delta is positive and the model anticipates an upward move, the hedge is reduced to retain more directional exposure; when delta is negative and a decline is predicted, the hedge magnitude is similarly reduced. This creates a mechanism for translating model confidence directly into portfolio positioning.
The study uses option and underlying ETF data to evaluate the framework empirically. The key finding is that moderate confidence scaling improves Sharpe ratios relative to a strict delta-neutral benchmark, while aggressive scaling increases volatility and weakens long-term returns. The results demonstrate that ML forecasts can be translated into economically meaningful improvements in risk-adjusted performance, provided the confidence scaling parameter is calibrated carefully.
Key metrics: moderate confidence scaling improves Sharpe ratios versus the delta-neutral benchmark; aggressive scaling increases portfolio volatility and degrades long-term returns. The framework introduces a single tunable parameter α that governs the trade-off between return enhancement and increased risk exposure. No specific numeric Sharpe or return figures are disclosed in the abstract.
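A minimal sketch of the confidence-scaled hedge ratio as we read it — the hedge is shrunk by α times model confidence whenever the forecast agrees with the sign of delta:

```python
def scaled_hedge(delta, p_up, alpha):
    """Shrink the delta hedge when the ML forecast agrees with the
    portfolio's directional exposure, retaining some of that exposure.
    p_up: model-estimated probability of an upward move, in [0, 1]."""
    confidence = abs(p_up - 0.5) * 2          # 0 = no view, 1 = maximal conviction
    agrees = (delta > 0 and p_up > 0.5) or (delta < 0 and p_up < 0.5)
    scale = 1 - alpha * confidence if agrees else 1.0
    return -delta * scale                     # hedge position in the underlying
```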
Mentioned by Nam Nguyen, Ph.D. (Quantitative Strategist and Derivatives Specialist) in this LinkedIn discussion.
By Boyan Li and Chongfeng Wu. Published in Finance Research Letters, 2025.
Advanced Signal Filtering for Mean Reversion Trading#
This paper develops a regime-aware mean-reversion trading framework in which adaptive neural signal filters determine the latent fair price of an asset. The core idea is that the spot price is driven by high-frequency noise around a smooth “fair value”, and the spread between the two creates buy/sell opportunities. The authors introduce a novel optimisation objective — the Local Average Filtering Objective (LAFO) — which interpolates between pointwise fitting and global averaging, yielding a controllable low-pass filter. Penalty terms restore identifiability, encode structural regime assumptions, and stabilise the extracted signal. Modern neural network architectures — including recurrent, convolutional, and state-space models — are shown to approximate solutions within this framework.
On S&P 500 intraday data (2-minute frequency), the neural filters dramatically outperform traditional EMA baselines. The best-performing model, a 2-layer CNN, achieved an annualised Sharpe ratio of 11.05, a Sortino ratio of 39.13, a hit rate of 47.9%, and positive excess returns of 10.5% after transaction costs of 3 bps. WaveNet (Sharpe 8.01, Sortino 24.17) and a Deep Kalman Filter (Sharpe 4.92, Sortino 12.89) also produced strong risk-adjusted returns, while standard EMA filters yielded deeply negative Sharpe ratios. The results demonstrate that appropriate signal extraction is central to mean-reversion profitability.
Mentioned by Piotr Pomorski (@PtrPomorski) in this discussion.
By Zhichen Xu, Nick Firoozye, Andreas Koukorinis, Philip Treleaven, and Wilbur Zhu. Department of Computer Science, University College London.
Dynamic Black-Litterman#
This paper generalises the classical Black-Litterman portfolio allocation model from a single-period to a continuous-time dynamic setting, allowing investors to trade continuously while incorporating expert views whose horizons may differ from the investment horizon. Using Bayesian graphical models, the authors derive the conditional distribution of asset returns under geometric Brownian motion, showing that conditioning on forward-looking views transforms the log-returns process from Brownian motion with drift into a mean-reverting process expressible as a multi-dimensional Brownian bridge. The conditional price process becomes an affine factor model where the conditional log-returns serve as factors, and explicit formulas are provided for the optimal dynamic investment policy via Hamilton-Jacobi-Bellman equations solved in closed form through a system of Riccati ODEs.
In Monte Carlo simulations on five S&P/ASX 50 assets with CAPM-derived expected returns and three forward-looking views over a one-year horizon, the Dynamic Black-Litterman (DBL) investor consistently dominates the Rebalanced Classical Black-Litterman (RCBL) investor across all rebalancing frequencies (continuous, daily, weekly, monthly). The efficient frontier generated by DBL lies strictly above RCBL, with the performance gap widening when expert views are more precise (lower view noise). DBL also achieves substantially lower portfolio turnover across all risk aversion levels, because its hedging demand offsets sensitivity to changes in the views covariate. Crucially, DBL performance is insensitive to rebalancing frequency, while RCBL deteriorates rapidly as the rebalancing interval increases. The paper further shows that anticipating view revisions (quarterly or mid-horizon updates) materially improves certainty equivalent returns.
By Anas Abdelhakmi and Andrew Lim. National University of Singapore.
Expected Returns with Cash Flow Trends and Cycles#
This paper argues that cash flow growth contains a permanent trend component, rendering the price-dividend ratio a noisy proxy for expected returns. Using an Extended Kalman Filter to jointly estimate trend growth, cash flow cycles, and expected returns, the authors demonstrate that cash flow noise generates severe attenuation bias in standard predictive regressions — making valuation ratios appear far weaker than they truly are.
By decomposing the price-dividend ratio into discount-rate variation (the return-predicting signal) and cash-flow dynamics (the noise), and then purging this noise, return predictability is substantially restored. Discount-rate variation dominates price-dividend movements at short and medium horizons, while trend growth dominates at longer horizons.
The cleaned signal delivers out-of-sample R² of approximately 9% at the one-year horizon and 22% at the five-year horizon, whereas traditional valuation ratios fall short at every tested horizon.
Mentioned by Ivan Blanco in this LinkedIn discussion: “Valuation ratios aren’t weak predictors. They’re noisy ones. There’s a difference.”
By Sebastian Hillenbrand and Odhrain McCarthy.
Crypto Contagion#
An empirical study of how the relationship between cryptocurrencies and US equity markets underwent a fundamental structural shift following the introduction of crypto ETFs. Using a risk-sharing model of crypto contagion combined with jump diffusion analysis and double machine learning, the authors isolate actual shock transmission between crypto and equity markets — not merely rolling correlations — across the pre- and post-ETF regimes for Bitcoin (BTC-USD) and Ethereum (ETH-USD).
Before ETF launch, the main cryptocurrencies moved in the opposite direction from US market returns: a 1% crypto move corresponded to approximately -0.07% in S&P 500 returns, making crypto a standalone risk factor with genuine portfolio diversification benefits. After ETF introduction, this relationship inverted entirely — cryptocurrencies now move in tandem with equity markets, functioning as high-beta equity exposure. The diversification benefit that previously justified crypto allocations in multi-asset portfolios is largely eliminated. The authors attribute this regime shift to a fundamental change in the investor base and information environment: crypto ETFs aggregate focused information about cryptocurrency innovations within the protections of the US regulatory framework, attracting institutional investors who previously used spot crypto markets for price discovery. The cryptocurrencies themselves are consequently evolving toward entities comparable to ordinary corporations, with ETFs serving as a proxy for investor sentiment about blockchain innovation.
The regime-shift finding has significant implications for systematic portfolio construction: crypto should no longer be modelled as an independent risk factor but as leveraged equity exposure. The pre-ETF negative beta with US equities — the source of the diversification case — has not just weakened but structurally reversed.
Mentioned by Ivan Blanco in this LinkedIn discussion: “For a systematic multi-asset portfolio, the data suggest crypto is no longer a genuine diversifier; it looks like high-beta equity.”
By Irene Aldridge and Wenke Du.
Flow Toxicity and Liquidity in a High Frequency World#
Order flow is toxic when it adversely selects market makers, who may be unaware they are providing liquidity at a loss. This paper by David Easley, Marcos López de Prado, and Maureen O’Hara introduces the Volume-Synchronized Probability of Informed Trading (VPIN) — a real-time, high-frequency metric for estimating the probability that order flow is driven by informed participants. Unlike the earlier PIN model, which requires maximum-likelihood estimation of unobservable parameters, VPIN operates in volume time rather than clock time and uses a novel Bulk Volume Classification (BVC) procedure to classify trade volume as buy- or sell-initiated without requiring tick-level quote data. The metric is updated continuously as volume buckets fill, making it directly applicable to high-frequency and algorithmic trading environments.
The paper’s central empirical contribution is the application of VPIN to the May 6, 2010 Flash Crash. The authors demonstrate that the cumulative distribution function of VPIN (CDF(VPIN)) reached its 0.97 threshold more than one hour before the crash, signalling historically elevated order flow toxicity well before the market collapsed. As informed traders concentrated activity on the sell side, market makers faced severe adverse selection. The resulting withdrawal of liquidity amplified the crash dynamics. Beyond the Flash Crash, the authors show that VPIN serves as a useful short-term indicator of toxicity-induced volatility. Practitioner-calibrated thresholds treat sustained readings above 0.85 (90th–95th percentile) as operationally significant for liquidity risk monitoring. Subsequent debate — notably by Andersen and Bondarenko (2014) — has questioned the precise timing and predictive power of VPIN around the Flash Crash, arguing that elevated readings occurred partly after the event and reflect a mechanical relationship with trading intensity. The original authors have responded and refined their analysis, but the controversy underscores the importance of understanding VPIN’s assumptions before deploying it in production.
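A minimal sketch of the VPIN pipeline under simplifying assumptions (time bars rather than tick data, and bars are not split across bucket boundaries as in the exact procedure):

```python
import numpy as np
from scipy.stats import norm

def vpin(price_changes, volumes, bucket_volume, n_buckets=50):
    """VPIN sketch. Bulk Volume Classification assigns each bar's volume a
    buy fraction equal to the standard normal CDF of its standardised price
    change; bars are then grouped into (approximately) equal-volume buckets
    and the absolute buy/sell imbalance is averaged over recent buckets."""
    sigma = np.std(price_changes)
    buy = volumes * norm.cdf(price_changes / sigma)       # buy-classified volume
    sell = volumes - buy
    bucket_id = (np.cumsum(volumes) // bucket_volume).astype(int)
    imbalance = np.abs(np.bincount(bucket_id, weights=buy)
                       - np.bincount(bucket_id, weights=sell))
    return imbalance[-n_buckets:].sum() / (n_buckets * bucket_volume)
```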
Published in the Review of Financial Studies, Vol. 25, No. 5, pp. 1457–1493, 2012.
Mentioned by 0xAlcibiades in this discussion: “In 2010 Easley, Lopez de Prado, and O’Hara built a way to measure this. VPIN.”
Geopolitical Risk in Currency Markets#
This paper by Alessandro Melone (Ohio State University) and Andreas Stathopoulos (UNC Chapel Hill) documents that geopolitical risk is a priced factor in the cross-section of currency returns. The authors sort currencies into five portfolios based on the rolling forecast coefficient of their excess returns on Caldara and Iacoviello’s (2022) Global Geopolitical Threats (GPT) index. The resulting long-short strategy, GHML (Geopolitical High Minus Low), invests in currencies with high geopolitical exposure and shorts those with low exposure. GHML is nearly orthogonal to standard currency risk factors (DOL and CAR), representing a genuinely new source of systematic variation in FX markets.
The paper further links geopolitical risk factor loadings to international capital flows: countries that attract higher-than-average net capital flows during geopolitical turmoil have currencies that hedge geopolitical risk (“safe havens”), while countries experiencing capital outflows have geopolitically risky currencies (“danger zones”). Safe haven currencies carry a GHML loading 0.32 lower than danger zone currencies. Importantly, geopolitical factor loadings are unrelated to policy uncertainty measures like the Economic Policy Uncertainty (EPU) index or the Trade Policy Uncertainty (TPU) index, establishing geopolitical risk as a distinct phenomenon.
Key metrics: The GHML strategy yields an annualised excess return of 3.28%, with an annualised standard deviation of 8.37% and a Sharpe ratio of 0.39. After controlling for dollar (DOL) and carry (CAR) factors, the annualised alpha is 2.76%, with the regression adjusted R-squared for GHML at just 2% — confirming the factor is essentially unexplained by existing currency risk models. Portfolio GHML loadings increase monotonically from -0.49 (Portfolio 1) to 0.51 (Portfolio 5). The sample covers 41 developed and emerging market currencies from February 1988 to December 2024 (443 monthly observations).
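A minimal sketch of the portfolio construction, assuming a panel `fx_returns` of currency excess returns and a `gpt` index series; the rolling predictive beta below is our stand-in for the paper's rolling forecast coefficient:

```python
import pandas as pd

def ghml_weights(fx_returns: pd.DataFrame, gpt: pd.Series, window=60):
    """Long currencies with high rolling exposure to geopolitical threats,
    short those with low exposure (equal-weighted top/bottom quintiles)."""
    lagged = gpt.shift(1)                                  # forecast: lagged GPT index
    def rolling_beta(col):
        return col.rolling(window).cov(lagged) / lagged.rolling(window).var()
    betas = fx_returns.apply(rolling_beta).iloc[-1]        # latest loading per currency
    q_hi, q_lo = betas.quantile(0.8), betas.quantile(0.2)
    w = pd.Series(0.0, index=betas.index)
    w[betas >= q_hi] = 1.0 / (betas >= q_hi).sum()         # long high-exposure quintile
    w[betas <= q_lo] = -1.0 / (betas <= q_lo).sum()        # short low-exposure quintile
    return w
```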
Mentioned by Ivan Blanco in this LinkedIn discussion: “New research: geopolitical risk is a priced factor in the cross-section of currency returns.”
Dynamic Mean-Variance Portfolio Allocation under Regime-Switching Jump-Diffusions with Absorbing Barriers and Distribution Matching#
This paper by Artur Sepp (LGT Private Banking) studies continuous-time dynamic mean-variance portfolio allocation when the risky asset follows a two-state regime-switching process with exponentially distributed jumps at regime transitions. The value function admits a regime-conditional quadratic form whose coupled nonlinear Riccati ODEs have a unique global solution. The optimal feedback policy retains the Merton-Lipton structure — the regime-dependent effective Merton ratio modulated by a goal-seeking funding ratio — in each regime. An absorbing wealth floor for capital preservation is introduced via a Laplace transform framework employing Arrow-Debreu state prices. The stopped terminal wealth distribution decomposes into three components absent in pure diffusion models: survival paths, a floor atom (diffusion-hitting), and jump-overshoot paths below the floor. The paper also formulates the inverse problem of matching a target terminal wealth distribution using mixtures of stopped strategies across asset classes, showing that the family of achievable distributions is dense in the Wasserstein-1 topology.
The framework is calibrated to a three-asset-class portfolio of bonds, equity, and private equity under capital market assumptions with one crash per decade and mean stress duration of one year. Crash losses are 8% for bonds, 25% for equity, and 30% for private equity. Four mandate profiles are evaluated over a 10-year horizon with a 2-sigma drawdown tolerance. For the balanced mandate (35/43/22 weights), the MV-optimal strategy achieves an implied annualised return of 3.30% with terminal wealth standard deviation of 46.1 and survival probability of 78.7%, compared to a buy-and-hold benchmark return of 4.10% with standard deviation of 74.8 — the floor protection cost is 80 bp per annum. For the growth mandate (0/67/33), implied return is 3.68% versus 5.01% buy-and-hold, with survival at 73.8% and floor protection cost of 117 bp. The MV strategy produces endogenous de-risking glide paths: all mandates start fully invested and de-risk over time, with the growth mandate declining from 100% to 37% risky allocation over 10 years. Conditional on survival, terminal wealth volatility drops sharply (31.5 for growth versus 117.6 under buy-and-hold). All results are computed analytically via Laplace inversion in approximately one second per evaluation. A companion Python package GoalBasedAllocation is available at https://github.com/ArturSepp/GoalBasedAllocation.
Mentioned by Artur Sepp in this discussion: “A wealth floor is a barrier. A market crash is a jump. Result: analytical terminal wealth densities, floor protection costs & de-risking glide paths — all analytic.”
Optimal Make-Take Fees in a Multi Market Maker Environment#
This paper by Bastien Baldacci (Ecole Polytechnique), Dylan Possamai (ETH Zurich), and Mathieu Rosenbaum (Ecole Polytechnique) studies how an exchange should optimally design make-take fee contracts when multiple market makers compete on its platform. Using a principal-agent framework, the exchange (principal) offers fee contracts to N market makers (agents) who then set their bid-ask quotes in a Nash equilibrium. The exchange’s revenue depends on the arrival rate of market orders, which is driven by the aggregate liquidity — the weighted sum of all market makers’ spreads relative to an efficient price. The problem is solved as a two-step Stackelberg game: first, each market maker computes best-response spread strategies given others’ quotes, yielding a Nash equilibrium characterised via a multidimensional system of BSDEs; second, the exchange solves a Hamilton-Jacobi-Bellman equation to find the optimal contract given these equilibrium responses.
The optimal contract is expressed in quasi-explicit form as a sum of stochastic integrals with respect to market order and efficient price processes, making it readily implementable. The contract is indexed only on aggregated order processes and the efficient price — the platform does not need to monitor cross-incentives between individual market makers. Each market maker’s compensation consists of: a base payment calibrated to their reservation utility, per-trade rebates for executed orders (the “make” component), a risk-adjustment term for inventory and price exposure, and a deduction for the certain gains the agent would extract from their own optimisation. Crucially, the exchange can discriminate between market makers through the price-risk component based on their individual risk aversion parameters, while keeping the order-flow-based incentives uniform.
A key counterintuitive finding is that increasing the number of market makers does not necessarily improve the exchange’s outcome. The function S(N) measuring platform PnL as a function of the number of market makers has a unique global maximum — numerically found at approximately N=3 for the paper’s calibration. Beyond this optimum, adding market makers with comparable risk aversion actually decreases order arrival intensity, increases average spreads, and reduces platform profitability, because each maker receives a shrinking share of incentives (described as “a cake whose size increases slower than the number of people who eat it”). However, adding a market maker with higher risk aversion always reduces the average spread, while adding one with lower risk aversion increases the principal’s capacity to offer incentives. Decreasing the taker cost increases platform PnL up to a point, while increasing the weight parameter reduces the best bid-ask spread and increases total order flow and PnL. In the limit as N goes to infinity, the optimal spread converges to the no-contract spread form but with a different value function, and the incentive terms vanish.
Mentioned by Alcibiades in this discussion: Alcibiades frames exchange design as a principal-agent problem where successful venues must balance maker and taker incentives through carefully structured fee mechanisms. He argues that naive rebate rules create synchronised price patterns across makers, latency races concentrate rents among wealthy incumbents, and points programs are unsustainable bribery. The optimal design uses speedbump access to the book, charges takers a fee, pays a portion back to makers as a rebate per dollar of volume, and moves the two in lockstep — so that makers remain viable when their rebate-per-dollar exceeds adverse selection costs, and takers participate when execution costs fall below their alpha or hedging threshold. Hyperliquid (200ms blocks with tiered taker fees) and Lighter (variable latency speedbumps) are cited as real-world examples of this balanced approach.
A Tale of Two Anomalies: The Implications of Investor Attention for Price and Earnings Momentum#
This paper by Kewei Hou (Ohio State University), Roger K. Loh (Nanyang Technological University), Lin Peng (City University of New York), and Wei Xiong (Princeton University and NBER) investigates how investor attention plays opposing roles in two major stock market anomalies: price momentum and earnings momentum. Using a comprehensive set of nine attention proxies — including trading volume (Turnover), Bloomberg terminal readership (Bloomberg AIA), SEC EDGAR downloads, social media activity (StockTwits, Tweets, Social Attention), Google search volume, news article counts, and institutional distraction — the authors construct a Composite Attention measure and test its interaction with momentum strategies using two-way portfolio sorts on CRSP common stocks from January 1967 to December 2022.
The central finding is that attention acts as a double-edged sword: it amplifies overreaction-driven price momentum while dampening underreaction-driven earnings momentum. When investors pay less attention to earnings news, stock prices underreact, generating stronger post-earnings-announcement drift. When attention is high, behavioural biases such as overconfidence and trend extrapolation intensify, fuelling stronger price momentum. The paper also documents important heterogeneity across attention channels: institutional attention (Bloomberg AIA) reduces earnings momentum without amplifying price momentum, consistent with rational information processing, whereas retail-oriented and social media attention strongly amplifies price momentum through behavioural bias channels. Interestingly, social media attention also attenuates earnings momentum, suggesting that even retail participants contribute to price discovery when focusing on fundamental signals. High-attention stocks exhibit stronger long-term price reversals after months 7-8, confirming the overreaction interpretation, while earnings momentum for low-attention stocks shows no reversal over 24 months.
Key metrics from the portfolio analysis: using Composite Attention, high-attention stocks generate price momentum excess returns of 47-74 bps/month more than low-attention stocks across factor models (rising to 64-91 bps/month with orthogonalised measures). Conversely, earnings momentum profits are 99-119 bps/month stronger for low-attention stocks. With Turnover as the attention proxy, the High-Low price momentum spread is 135 bps/month in excess returns and 80-102 bps/month in factor alphas, while earnings momentum spreads are -88 bps excess and -100 to -119 bps in alphas. Social media attention produces the largest differentials: price momentum High-Low spread of 234-308 bps/month, and earnings momentum spreads of 180-265 bps/month in favour of low-attention stocks. Bloomberg AIA yields earnings momentum High-Low differences of 132-159 bps/month. Results are robust to cross-sectional regressions, independent portfolio sorts, size controls, and exclusion of momentum crash months.
Don’t Mix What Should Be Separated: Why Combining Value and Momentum Signals Destroys Alpha#
This paper addresses a practical but underexplored question in factor investing: how should value and momentum signals be combined within a portfolio? While the negative correlation between value and momentum is well-documented and provides substantial diversification benefits, the existing literature has devoted limited attention to whether these factors should be blended into a single composite ranking or maintained as independent portfolio sleeves. The author conducts a rigorous empirical comparison of two approaches using the Top 1,000 U.S. equities over the period 2000–2026, with long/short dollar-neutral portfolios, monthly rebalancing, and 100 positions per side. The combined ranking method integrates value and momentum signals into a single composite score, while the separate sleeves method maintains independent value and momentum long/short portfolios within a multi-strategy book.
The central finding is that the separate sleeves approach delivers superior risk-adjusted performance despite producing a lower raw return. The key mechanism is that the combined ranking method structurally dilutes the negative correlation between value and momentum return streams, destroying the very diversification benefit that justifies combining these factors in the first place. The separate sleeves framework preserves this negative correlation at −0.349, which functions as an organic diversification mechanism. A volatility-matched comparison confirms the separate sleeves methodology outperforms by 52 basis points annualised when both strategies are equalised at the same risk level.
Key metrics: the combined ranking method produces a higher annualised return of 3.01% versus 2.73% for separate sleeves, but the separate sleeves approach achieves a higher Sharpe ratio (0.168 vs 0.157), substantially lower volatility (5.51% vs 7.71%), and a markedly shallower maximum drawdown (−17.48% vs −26.61%). The paper notes practical trade-offs including increased trading intensity and position count inherent to the separate sleeves method, tail risks associated with shorting momentum losers, and the need for further validation across international markets.
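A minimal sketch of the two constructions being compared — our simplified rendering, with z-scored signals and equal-weighted long/short legs:

```python
import pandas as pd

def composite_book(value: pd.Series, momentum: pd.Series, n=100):
    """Combined ranking: blend z-scored signals first, then build one
    long/short book from the composite score."""
    z = lambda s: (s - s.mean()) / s.std()
    combo = z(value) + z(momentum)
    return combo.nlargest(n).index, combo.nsmallest(n).index

def sleeve_books(value: pd.Series, momentum: pd.Series, n=100):
    """Separate sleeves: an independent long/short book per factor, held
    side by side so the negative sleeve correlation survives."""
    return {name: (s.nlargest(n).index, s.nsmallest(n).index)
            for name, s in [("value", value), ("momentum", momentum)]}
```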
By Carlos Morales (Independent).
Mentioned by Carlos Morales Martínez in this LinkedIn discussion: “Most quant investors combine value and momentum into one composite ranking. It seems logical: put everything in one model, let the system pick the best stocks. But my research shows the opposite works better.”
The Price Impact of Order Book Events#
This paper studies the price impact of order book events — limit orders, market orders, and cancellations — using NYSE TAQ data for 50 U.S. stocks. Rama Cont, Arseniy Kukanov, and Sasha Stoikov show that price changes are mainly driven by the order flow imbalance (OFI), defined as buy-side minus sell-side event activity at the best bid and ask. They find a linear relation between OFI and contemporaneous price changes, with a slope that varies inversely with market depth. The relationship is stable across stocks, sample periods, and intraday time scales, in contrast to the noisier “square-root” price impact law based on volume alone.
Our summary: rather than modelling price impact as a function of traded volume, the authors argue that the relevant state variable for high-frequency price dynamics is the imbalance in the full event stream at the top of book. Cancellations and limit-order arrivals carry as much information as executions, and aggregating them into a signed flow gives a compact, linear predictor of mid-price moves. This reframes microstructure prediction around a directly observable ladder quantity that can be computed from raw Level-1/Level-2 feeds, and provides a theoretical bridge explaining why the classical volume-based square-root law appears at coarser horizons.
Key metrics: using 50 NYSE stocks over April–June 2008, the authors report a linear regression of 10-second mid-price changes on OFI with R² values typically in the 50–70% range across stocks, far exceeding the explanatory power of trade volume alone. The OFI price-impact coefficient is inversely proportional to average market depth, with the relationship holding across time-of-day buckets and different aggregation intervals from seconds to minutes.
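A minimal sketch of the per-event OFI contribution at the best quotes, following the indicator logic of the paper's definition:

```python
def ofi_event(bid_p, bid_q, ask_p, ask_q,
              prev_bid_p, prev_bid_q, prev_ask_p, prev_ask_q):
    """Order flow imbalance contribution e_n of one order book event:
    additions on the bid side count positive, on the ask side negative."""
    e = 0.0
    if bid_p >= prev_bid_p:
        e += bid_q          # bid improved or held: buying pressure
    if bid_p <= prev_bid_p:
        e -= prev_bid_q     # bid fell or was depleted
    if ask_p <= prev_ask_p:
        e -= ask_q          # ask improved or held: selling pressure
    if ask_p >= prev_ask_p:
        e += prev_ask_q     # ask rose or was depleted
    return e                # OFI over a window = sum of e across its events
```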
Queue Imbalance as a One-Tick-Ahead Price Predictor in a Limit Order Book#
Martin D. Gould and Julius Bonart investigate whether bid/ask queue imbalance at the top of a limit order book can predict the direction of the next mid-price move. Using LOBSTER data for 10 liquid Nasdaq stocks, they build both a binary classifier for the sign of the next mid-price change and a probabilistic classifier estimating the probability of an upward move, fitted via logistic regression on the queue imbalance ratio. They document a strongly statistically significant relationship between imbalance and subsequent price direction, especially for large-tick stocks where the queue state dominates short-horizon dynamics.
Our summary: the authors formalise the common HFT intuition that “the thin side gets picked off first” as a rigorous, testable one-tick-ahead predictor. Rather than using raw queue sizes, they show the correct state variable is the normalised ratio I = Qᵇ / (Qᵇ + Qᵃ), and that a simple logistic regression on this single scalar already captures much of the predictable short-term structure. They also test a semi-parametric local logistic regression variant that fits the relationship non-parametrically, which yields modest improvements at the cost of more computation. The result is a minimal, transparent baseline that any microstructure model should beat before adding complexity.
Key metrics: for large-tick stocks, the logistic queue-imbalance classifier outperforms the null baseline substantially, achieving strong McFadden pseudo-R² values and classification accuracies well above 50% on the next mid-price move. For small-tick stocks, the improvement is more modest, reflecting the diminished role of top-of-book state when price levels are finely spaced. Local logistic regression adds a small but statistically meaningful boost over standard logistic regression across the universe.
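A minimal sketch of the baseline: logistic regression of the next mid-price move's sign on the single scalar I = Qᵇ / (Qᵇ + Qᵃ):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_imbalance_classifier(bid_qty, ask_qty, next_move_up):
    """bid_qty/ask_qty: best-queue sizes sampled before each mid-price change.
    next_move_up: 1 if the next mid-price change was upward, else 0."""
    I = (bid_qty / (bid_qty + ask_qty)).reshape(-1, 1)   # one scalar feature
    model = LogisticRegression().fit(I, next_move_up)
    return model   # model.predict_proba(I)[:, 1] estimates P(up | imbalance)
```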
The Micro-Price: A High-Frequency Estimator of Future Prices#
Sasha Stoikov defines the micro-price as the limit of a sequence of martingale mid-price estimates conditioned on the order book state. Formally, Pₜᵐⁱᶜʳᵒ = limᵢ→∞ Pₜⁱ, where Pₜⁱ = E[M_{τᵢ} | ℱₜ] and τ₁, …, τₙ are the (random) times at which the mid-price next changes. The order book state is summarised by the triple (Mₜ, Iₜ, Sₜ) — mid-price, imbalance Iₜ = Qᵇ/(Qᵇ+Qᵃ), and bid-ask spread Sₜ — and assumed to form a Markov process whose dynamics are independent of the price level. Under these assumptions, the micro-price is a Markovian, martingale, and computationally tractable adjustment to the mid-price.
Our summary: the paper provides a principled answer to the question “what is the fair value of an asset given its order book?” The mid-price is not a martingale because of the bid-ask bounce, and the volume-weighted mid-price Mʷ = I·Pᵃ + (1−I)·Pᵇ produces counter-intuitive behaviour (e.g., the “fair” price can move down after a cancellation on the ask side). Stoikov’s micro-price fixes both issues by iterating the conditional expectation until convergence, yielding a fair value that lives between the bid and ask and reacts smoothly to changes in imbalance and spread. The resulting estimator can be pre-computed as a table indexed by (I, S) and used as a continuous predictive signal rather than a discrete trading trigger. Empirically, it is a better short-term predictor than either the mid-price or the weighted mid-price and serves as a standard building block for high-frequency market making and execution models.
Key metrics: Stoikov reports empirical results on Nasdaq data for liquid stocks showing that the micro-price dominates both the mid-price and the weighted mid-price as a predictor of future mid-price realisations at horizons of a few ticks. The micro-price adjustment can be material — a significant fraction of the spread for imbalanced books — and is reported as a smooth, bounded function of imbalance that vanishes at the extremes.
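A minimal sketch of the first-order adjustment behind the micro-price: estimate the expected mid-price change conditional on discretised (imbalance, spread), which is the first term of Stoikov's iterated expectation. One simplification here: we condition on the very next tick moving the mid rather than scanning forward to the first move.

```python
import numpy as np
import pandas as pd

def microprice_adjustment(mid, bid_q, ask_q, spread_ticks, n_bins=10):
    """First-order sketch G1(I, S): expected mid change conditional on
    imbalance and spread buckets. The full micro-price iterates this
    conditional expectation until the adjustment converges."""
    df = pd.DataFrame({"imb": bid_q / (bid_q + ask_q),
                       "spread": spread_ticks,                    # assumed integer ticks
                       "dmid": np.append(np.diff(mid), np.nan)})  # next-tick mid change
    df = df[df["dmid"] != 0].dropna()                             # condition on a mid move
    df["imb_bin"] = pd.cut(df["imb"], n_bins)
    return df.groupby(["imb_bin", "spread"], observed=True)["dmid"].mean()
# micro-price ~ mid + G1 evaluated at the current (imbalance bucket, spread)
```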
Deep Order Flow Imbalance: Extracting Alpha at Multiple Horizons from the Limit Order Book#
Petter N. Kolm, Jeremy Turiel, and Nicholas Westray apply deep learning to high-frequency return prediction for 115 Nasdaq-listed stocks using the full depth of the limit order book. Their central finding is that models trained on order flow features (stationary increments at each level of the book) significantly outperform models trained on raw order book states (volumes at each level). They show that simple LSTM and feed-forward neural network architectures trained on multi-level order-flow imbalance features deliver superior predictive performance to more elaborate architectures fed raw state, and the predictions remain useful across multiple forecasting horizons.
Our summary: the paper generalises the Cont–Kukanov–Stoikov OFI idea by (a) extending order flow imbalance beyond the top of book to all quoted levels, (b) making the features properly stationary so that neural networks can learn from them without collapsing, and (c) comparing forecasting performance across horizons rather than at a single fixed lag. The result is a practical recipe for feature engineering in deep microstructure models: instead of feeding raw book snapshots into a network and hoping it learns the right invariance, construct multi-level OFI features first and let the network focus on the nonlinear dynamics. The authors also document cross-sectional heterogeneity — some stocks are “information-rich” with higher predictability, and useful stock-specific forecasts extend to roughly two average price changes.
Key metrics: across 115 Nasdaq stocks, deep OFI models achieve materially higher out-of-sample R² than models trained on raw book states at short horizons (seconds to tens of seconds). Forecasting performance degrades gracefully with horizon but remains significant out to approximately two average mid-price moves per stock. LSTM variants give a modest lift over feed-forward networks once the input is already stationary OFI, suggesting most of the predictable signal lives in the feature construction rather than the architecture.
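A minimal sketch of the stationary feature construction: apply the top-of-book OFI recursion at every quoted level and stack the per-level flows as model inputs (our vectorised rendering, not the authors' code):

```python
import numpy as np

def multilevel_ofi(bid_p, bid_q, ask_p, ask_q):
    """bid_p/bid_q/ask_p/ask_q: arrays of shape (T, L) — price and size at
    each of L levels over T snapshots. Returns (T-1, L) stationary OFI
    features, applying the top-of-book recursion level by level."""
    def side_flow(p, q, sign):
        improved = sign * np.diff(p, axis=0) >= 0   # price moved toward the touch or held
        worsened = sign * np.diff(p, axis=0) <= 0
        return improved * q[1:] - worsened * q[:-1]
    return side_flow(bid_p, bid_q, +1) - side_flow(ask_p, ask_q, -1)
```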
Crypto Carry#
Maik Schmeling, Andreas Schrimpf, and Karamfil Todorov (BIS Working Paper 1087) analyse the dynamics of the carry — the difference between futures and spot prices — in bitcoin and ether derivatives. They document that crypto carry can exceed 40% per annum with substantial time variation and trace this to two forces: demand from smaller trend-chasing investors seeking leveraged long exposure, and limited arbitrage capital due to regulatory, funding, and margin frictions on the short side.
Our summary: this is the most rigorous academic treatment of the crypto cash-and-carry basis and its predictive content. Interest-rate differentials explain almost none of the variation in crypto carry; the dominant driver is a time-varying convenience yield linked to speculative demand for long leverage. Crucially, high carry is not a free lunch — it acts as a crash-risk premium. The paper explicitly shows that a 10% rise in standardised carry predicts roughly a 22% increase in short-futures liquidations as a fraction of total open interest over the following month, and is associated with richer option-implied crash-risk insurance. This reframes the perp-spot basis not just as an arbitrage target but as a state variable for tail risk.
Key metrics: the authors report crypto carry reaching peaks above 40% annualised across bitcoin and ether futures venues (CME and crypto-native exchanges). Predictive regressions of one-month-ahead short-side liquidations (scaled by open interest) on standardised carry produce statistically significant positive coefficients, implying ~2.2% liquidation response per 1% carry standardisation. Similar predictive content shows up in option skew changes. The carry premium is orders of magnitude larger than in traditional FX carry trades, and the drivers are structurally different.
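For a fixed-maturity future, the annualised carry in the paper's sense follows directly from the futures and spot quotes; a minimal sketch with illustrative numbers:

```python
def annualised_carry(futures_price, spot_price, days_to_expiry):
    """Annualised cash-and-carry basis: the return from buying spot and
    selling the future, assuming the basis converges to zero at expiry."""
    return (futures_price / spot_price - 1) * 365 / days_to_expiry

# e.g. a future 3% above spot with 30 days to expiry carries ~36.5% annualised
print(annualised_carry(103_000, 100_000, 30))   # 0.365
```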
Fundamentals of Perpetual Futures#
Songrun He, Asaf Manela, Omri Ross, and Victor von Wachter derive no-arbitrage prices for perpetual futures contracts in both frictionless and cost-adjusted markets. Unlike fixed-maturity futures, perpetual futures have no expiry date, so the usual convergence argument that pins the price to spot at maturity does not apply. Instead, alignment with spot is enforced only by the periodic funding mechanism. The paper derives explicit bounds on the perp-spot deviation in the presence of trading costs and funding frictions, and empirically characterises those deviations across major crypto venues.
Our summary: this paper provides the theoretical backbone for treating the perp-spot basis and funding rate as joint predictive signals. The key result is that perpetual futures are not guaranteed to track spot, so any persistent basis deviation is informative. Deviations are larger in crypto than in traditional FX perpetual analogues, comove strongly across coins (suggesting a common market-wide factor), and decline over time as markets mature — both of which are testable predictions for any trading model that uses basis features. The authors further show that the size of the current deviation is itself a predictor of the next funding payment, which ties the two signals together in a theoretically grounded way.
Key metrics: the authors construct an explicit arbitrage strategy — going long (or short) spot and taking the opposite side in the perpetual — that captures funding payments on BitMEX, Binance, FTX, and OKX. The reported Sharpe ratios of the arbitrage portfolio are high across all exchanges, with the cross-venue strategy delivering materially better risk-adjusted returns than any single-venue implementation. Measured deviations of perp prices from their no-arbitrage values are an order of magnitude larger in crypto than in perpetual-style FX contracts.
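A minimal sketch of the delta-neutral funding-capture book described above — long spot, short an equal notional of the perpetual, accruing funding to the short leg:

```python
def funding_capture_pnl(spot_ret, perp_ret, funding_rate, notional=1.0):
    """Per-period PnL of long spot + short perp at equal notional.
    funding_rate: the period's funding payment (positive = longs pay
    shorts), received here by the short perp leg. Any basis drift shows
    up through the difference between spot_ret and perp_ret."""
    spot_leg = notional * spot_ret
    perp_leg = -notional * perp_ret
    funding_leg = notional * funding_rate
    return spot_leg + perp_leg + funding_leg
```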
Perpetual Futures Pricing#
Damien Ackerer, Julien Hugonnier, and Urban Jermann derive closed-form no-arbitrage prices for linear, inverse, and quanto perpetual futures contracts in both discrete and continuous time. The paper provides the first fully tractable pricing framework that accounts for the random-maturity nature of perpetuals (the funding mechanism can be viewed as a stream of infinitesimal settlements) and for the distinct payoff geometries of the three main contract types used in crypto derivatives markets.
Our summary: this is the canonical theoretical reference for anyone modelling perpetuals. The authors show that under standard diffusion assumptions, the perpetual-spot basis and the funding rate together encode information about expected spot returns and convenience yields, formalising the intuition behind all the empirical basis/funding predictability papers. Their framework allows a clean decomposition of observed basis into a “fair” component implied by the funding mechanism and a deviation term that can be interpreted as either an arbitrage opportunity or a risk premium. The paper is calibrated to bitcoin perpetual data and provides an explicit benchmark against which to measure empirical dislocations.
Key metrics: under the authors’ calibration to BTC perpetuals, observed funding rates and basis deviations line up well with the model’s predictions in normal market conditions but diverge in stress episodes. The divergence itself carries information content, consistent with the crash-prediction findings of Schmeling et al. and Gornall et al.
Anatomy of Cryptocurrency Perpetual Futures Returns#
Yi Cao, Pengfei Luo, Yuhan Cheng, and Yizhe Dong (University of Edinburgh) develop a cost-of-carry model tailored specifically to perpetual contracts and use it to decompose perpetual futures returns into a spot premium, a log basis, and an expected funding spread. They then run the most comprehensive horse race to date of candidate return predictors in the cross-section of crypto perpetuals, testing 134 signals drawn from the basis, momentum, volume, size, and volatility literatures.
Our summary: this is the paper to cite when arguing that basis is the dominant cross-sectional predictor in crypto perpetual futures. Of the 134 candidates tested, 48 deliver statistically significant cross-sectional spreads at the 5% level, and every single one of them is spanned by a two-factor model built from a basis factor and a price–volume factor. In other words, after controlling for basis and price–volume, no other signal in their universe carries incremental predictive power. This is a strong empirical argument for placing the perp-spot basis at the centre of any cross-sectional crypto strategy and treating other signals as secondary.
Key metrics: cross-sectional sorts on basis deliver statistically significant long-short spreads in perpetual futures returns, and the basis factor is the single strongest predictor among the 134 candidates tested. The two-factor (basis + price–volume) model prices all 48 significant anomalies with zero significant alphas. The paper does not report a single headline Sharpe but documents the basis factor as economically and statistically dominant.
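For readers who want the construction in code, a minimal pandas sketch of a cross-sectional basis sort follows. The panel layout and the sign convention (long the low-basis quintile, short the high-basis quintile, as in the commodity carry literature) are our assumptions rather than the paper's exact specification.

```python
import pandas as pd

def basis_long_short(basis: pd.DataFrame, fwd_ret: pd.DataFrame, q=0.2):
    """Cross-sectional basis sort: each date, go long the lowest-basis
    quintile and short the highest, equal-weighted within each leg.
    `basis` and `fwd_ret` are date x coin panels of the signal and the
    next-period return."""
    lo = basis.le(basis.quantile(q, axis=1), axis=0)       # low-basis longs
    hi = basis.ge(basis.quantile(1 - q, axis=1), axis=0)   # high-basis shorts
    return fwd_ret.where(lo).mean(axis=1) - fwd_ret.where(hi).mean(axis=1)
```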
An Empirical Investigation on Risk Factors in Cryptocurrency Futures#
Yeguang Chi, Wenyan Hao, Jiangdong Hu, and Zhenkai Ran (Journal of Futures Markets 2023) test basis, momentum, and basis-momentum factors — borrowed from the commodity futures literature — in the cross-section of cryptocurrency futures from 2017 to 2021. They construct long-short factor portfolios sorted on each signal and measure their premia at daily, weekly, and monthly frequencies.
Our summary: the paper confirms that basis is the strongest single predictor of cross-sectional crypto futures returns, echoing Cao et al. A novel finding is that the basis-momentum factor — a robust premium in commodities — has a statistically significant raw return in crypto but is entirely subsumed by the basis factor once you control for it. Most importantly for implementation, the predictive content of basis is high-frequency: daily factor returns are strongest, weekly returns are weaker, and monthly returns are insignificant. This matters a lot for anyone building a basis-factor strategy — the alpha lives at daily or faster horizons.
Key metrics: daily long-short portfolios sorted on basis deliver statistically significant excess returns and Sharpe ratios that exceed momentum and basis-momentum factors in the same universe. The basis premium shrinks monotonically as the rebalancing frequency drops from daily to weekly to monthly, with the monthly premium indistinguishable from zero. Commodity-style basis-momentum has zero incremental alpha after controlling for basis.
The Risk and Return of Cryptocurrency Carry Trade#
Zhenzhen Fan, Feng Jiao, Lei Lu, and Xin Tong build a cryptocurrency carry trade analogous to the classical FX carry trade, shorting perpetuals with high funding rates and going long perpetuals with low funding rates, so that both legs are positioned to receive funding. They run the strategy across a broad universe of perpetual contracts and decompose the returns to understand which leg is doing the work and how well the returns are explained by standard crypto risk factors.
Our summary: this is the cleanest cross-sectional implementation of “funding rate as a predictive signal” in the literature. The carry trade works, but the returns are heavily concentrated on the short leg — i.e., the money comes from receiving funding on contracts where funding is extreme, not from the long side. The returns are not spanned by standard crypto factors (BTC, size, momentum, volatility), so funding is capturing a genuinely distinct premium. The paper also highlights important differences from FX carry: crypto carry is more volatile, more prone to crashes, and structurally dependent on exchange-specific funding caps.
Key metrics: the cross-sectional funding-rate carry strategy delivers roughly 43.4% annualised return with a Sharpe ratio of about 0.74, with the short leg contributing the majority of both the return and the alpha. The strategy loads significantly on a crypto illiquidity factor and retains positive alpha against a five-factor model built from standard crypto risk factors.
The Crypto Carry Trade#
Nicolas Christin, Bryan Routledge, Kyle Soska, and Ariel Zetlin-Jones (Carnegie Mellon) study the long-spot, short-perpetual cash-and-carry trade in bitcoin. They document very large in-sample Sharpe ratios for the strategy and attribute the profits to differences of opinion between leveraged long speculators and capital-constrained arbitrageurs, amplified by exchange funding caps.
Our summary: this is the paper that first quantified just how large the cash-and-carry premium can be in crypto, and it is the reference point for anyone who claims that “funding rate = free money”. The authors carefully separate USDT-settled perps (where carry is richest) from coin-settled perps (where the premium is smaller but still very large) and show that the USDT/coin gap maps directly onto differences in collateral constraints and funding caps. They also decompose carry into a mark-price premium plus an interest component, which lines up with the Ackerer–Hugonnier–Jermann pricing framework. The reported Sharpes are in-sample and benefit from not fully accounting for liquidation risk and exchange risk, but even after generous haircuts the strategy is extraordinarily profitable relative to traditional carry trades.
Key metrics: in-sample annualised Sharpe ratios of approximately 12.8 for USDT-settled bitcoin perpetuals and 7.0 for coin-settled bitcoin perpetuals. Returns are tightly linked to the level and stickiness of the funding rate and are constrained by exchange-imposed funding caps. Tick-level data from Binance and BitMEX.
Perpetual Futures and Basis Risk: Evidence from Cryptocurrency#
Will Gornall, Martin Rinaldi, and Yizhou Xiao compare perpetual versus quarterly bitcoin futures contracts and argue that the small, frequent funding payments embedded in perpetuals materially reduce basis risk during market crises. They develop a tractable model of capital-constrained arbitrage and test its predictions on BitMEX, Binance, and CME futures around large spot moves.
Our summary: this paper is the best explanation of why perpetuals dominate crypto derivatives volume. Dated futures are structurally fragile in crises because the basis is sensitive to the remaining time to expiry and to collateral availability, so during stress episodes the quarterly basis can blow out by 8–10%, eating through all of the expected carry in a single day. Perpetuals, by contrast, settle PnL continuously via funding payments, which damps the basis dislocation to around 3% in the same stress episodes. The practical implication for basis-trading strategies is that perpetuals are a far better instrument than quarterlies for cash-and-carry during volatile regimes, and funding-rate-based entry rules inherit this crisis-resilient property.
Key metrics: during large spot moves, quarterly bitcoin futures dislocate from spot by 8–10%, while perpetuals dislocate by only ~3%. Drawdowns of common cash-and-carry arbitrage strategies are cut by more than half when implemented on perpetuals instead of quarterlies. The comparison holds across BitMEX, Binance, and CME.
Predictability of Funding Rates#
Emre Inan runs an out-of-sample forecasting study of perpetual futures funding rates on bitcoin contracts traded on Binance and Bybit. Using double-autoregressive (DAR) models, the paper evaluates forecast quality against a no-change benchmark across multiple horizons and tests for regime-dependence of predictability.
Our summary: most of the crypto basis literature treats the funding rate as a contemporaneous signal (carry, cross-section). This paper instead asks the simpler but important question: can you predict the next funding rate from its own history? The answer is yes — DAR models outperform the random-walk benchmark in both forecast error and directional accuracy across Binance and Bybit BTC perpetual contracts. The predictability is time-varying, suggesting regime dependence, which matches the stylised fact that funding rates mean-revert slowly in range-bound regimes and jump sharply around trend reversals. This is a useful building block for any strategy that needs a forward estimate of funding.
Key metrics: DAR-based forecasts beat the no-change benchmark in RMSE, MAE, and directional accuracy across multiple horizons on both Binance and Bybit. Predictability varies through the sample — stronger in stable regimes, weaker during stress — consistent with regime-dependent funding-rate dynamics.
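The paper's exact DAR specification is not reproduced here, but a small simulation illustrates why the model class beats a no-change forecast: a DAR(1) in the sense of Ling (2004) has a mean-reverting conditional mean even though its conditional variance depends on the lagged level. The parameter values below are illustrative only, loosely scaled to funding-rate magnitudes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a DAR(1): x_t = phi * x_{t-1} + eps_t * sqrt(omega + alpha * x_{t-1}**2)
phi, omega, alpha, n = 0.6, 1e-8, 0.3, 5000
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal() * np.sqrt(omega + alpha * x[t - 1] ** 2)

dar_fc = phi * x[:-1]   # one-step conditional mean of the DAR(1)
rw_fc = x[:-1]          # no-change (random walk) benchmark
actual = x[1:]
rmse = lambda f: np.sqrt(np.mean((actual - f) ** 2))
print(f"DAR(1) RMSE {rmse(dar_fc):.2e} vs no-change RMSE {rmse(rw_fc):.2e}")
```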
Arbitrage Opportunities and Efficiency Tests in Crypto Derivatives#
Carol Alexander, Xi Chen, Jun Deng, and Tianyi Wang test joint efficiency of bitcoin and ether options and perpetual futures markets using a fiat-currency-free put–call parity relation. They identify the determinants and time-variation of cross-market arbitrage profits and quantify how arbitrage profitability has evolved as crypto derivatives markets have matured.
Our summary: the paper extends the basis/funding predictability toolkit to include options. Their put–call parity relation ties together spot, perpetual futures, and options on a single instrument and provides a unified framework for detecting mispricings across all three markets. The arbitrage strategies that exploit these mispricings remain profitable after slippage, especially during high-volume and congestion regimes, although the opportunities are shrinking over time as liquidity improves. BTC markets are measurably more efficient than ETH markets. For a basis-trading model, this paper provides a rigorous benchmark against which to measure whether observed deviations represent tradable alpha or just execution friction.
Key metrics: arbitrage strategies linking perpetuals and options remain profitable even after slippage, especially in high-volume and congestion regimes. BTC derivatives markets are more efficient than ETH, and efficiency is improving over time, particularly for options with maturity ≥ 15 days. Deribit-sourced option data combined with perp data from major venues.
The Relationship Between Arbitrage in Futures and Spot Markets and Bitcoin Price Movements#
Takahiro Hattori and Ryo Ishida use intraday CBOE bitcoin futures and Gemini spot data to reconstruct the actual cash-and-carry condition faced by arbitrageurs in real time. They test how arbitrage profit opportunities vary across calm and crash regimes and relate the dislocation to capital constraints on the short side.
Our summary: this paper documents the uncomfortable truth about basis-based trading in crypto: the basis becomes most attractive exactly when you are least able to trade it. In normal markets, cash-and-carry opportunities are rare and small. During bitcoin crashes, the basis blows out dramatically — but exactly those episodes are when arbitrage capital is constrained, margin requirements jump, and liquidation risk is highest. This is a textbook limits-to-arbitrage story applied to crypto, and it mirrors Gornall et al.’s finding that quarterlies dislocate much more than perpetuals during crises. For anyone building a basis-trading strategy, the practical lesson is to size for the tails, not the averages.
Key metrics: the paper documents statistically significant widening of cash-and-carry arbitrage spreads during bitcoin crash episodes, with the basis reaching levels many standard deviations above normal-regime values. Arbitrage is typically unavailable outside these stress regimes. High-frequency CBOE-Gemini matched-pair data.
The Two-Tiered Structure of Cryptocurrency Funding Rate Markets#
This study builds a high-frequency panel of perpetual funding rates spanning 26 exchanges (11 centralised, 15 decentralised) covering 749 symbols and roughly 35.7 million 1-minute observations over 8 days. It studies price discovery and arbitrage between CEX and DEX funding markets using Hasbrouck-style information share decompositions and transaction-cost-adjusted arbitrage backtests.
Our summary: this is the first paper to rigorously quantify the directional information flow between CEX and DEX funding rates. CEX funding leads DEX funding with no detectable reverse causality — about a 61% higher integration on the CEX side — meaning any cross-venue basis signal should be interpreted as CEX setting the mark and DEX adjusting to it. The paper also documents that roughly 17% of observations exhibit ≥20 basis point arbitrage spreads between venues, but only about 40% of the largest opportunities yield positive PnL after transaction costs. For anyone building a multi-venue funding-rate strategy, this provides both a benchmark (how much alpha is theoretically available) and a sobering check (how much survives execution).
Key metrics: CEX funding markets lead DEX markets with ~61% higher price discovery integration; ~17% of 1-minute observations show ≥20bp cross-venue arbitrage spreads; only ~40% of the top opportunities yield positive transaction-cost-adjusted PnL. High-frequency panel of 26 exchanges and 749 symbols.
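A minimal version of the cost-adjusted spread screen behind these numbers is easy to write down. The 20bp threshold matches the paper's headline figure, while the round-trip cost input is an illustrative assumption rather than the paper's calibration.

```python
import numpy as np

def cross_venue_screen(cex_funding, dex_funding, threshold=0.0020, roundtrip_cost=0.0010):
    """Flag cross-venue funding gaps of at least `threshold` and net out an
    assumed round-trip execution cost. Inputs are aligned 1-minute
    funding-rate arrays for the same symbol on two venues."""
    spread = np.abs(cex_funding - dex_funding)
    tradable = spread >= threshold
    net = spread[tradable] - roundtrip_cost
    return tradable.mean(), (net > 0).mean()   # opportunity rate, post-cost survival rate
```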
Exploring Risk and Return Profiles of Funding Rate Arbitrage on CEX and DEX#
This paper reports an empirical study of delta-neutral funding-rate arbitrage strategies implemented on both centralised (Binance, BitMEX) and decentralised (Drift, ApolloX) perpetual venues across BTC, ETH, XRP, BNB, and SOL. The strategies go long spot and short the perpetual (or vice versa) to harvest funding payments while eliminating directional price risk.
Our summary: this paper documents some of the highest in-sample Sharpe ratios ever reported for a crypto trading strategy, driven by the very wide funding spreads available on DEX venues that have lower competition and thinner arbitrage capital. It’s a useful reference point for the upper bound of what funding-rate harvesting can deliver under favourable conditions — but the reported numbers should be read with care, because the sample period is short, the DEX venues are niche, and the strategy’s capacity is limited by each venue’s liquidity. That said, the paper makes the useful point that the CEX/DEX funding-rate gap is a structural feature driven by market maturity rather than a transient inefficiency.
Key metrics: reported Sharpe ratios of approximately 23.55 on Drift and 6.50 on ApolloX for funding-rate arbitrage strategies, versus ~2.89 for a HODL benchmark. Up to 115.9% return over six months with maximum drawdown around 1.92%. Backtests cover BTC, ETH, XRP, BNB, and SOL perpetual contracts.
Multi-Level Order-Flow Imbalance in a Limit Order Book#
Ke Xu, Martin D. Gould, and Sam D. Howison (University of Oxford) define multi-level order-flow imbalance (MLOFI) as a vector quantity that tracks the net buy-sell flow at each quoted price level of a limit order book, not just the best bid and ask. Using high-resolution Nasdaq data for 6 liquid stocks, they fit a simple linear regression of the contemporaneous mid-price change onto the stacked MLOFI vector and measure how out-of-sample goodness-of-fit changes as deeper book levels are added.
Our summary: this is the canonical extension of Cont–Kukanov–Stoikov’s top-of-book OFI to the full book depth. The methodology is deliberately simple — a linear model, no deep learning — and the point is to isolate the marginal information content of each additional price level. The authors find monotonic R² improvements for every one of the 6 stocks tested as levels are added, which is strong evidence that resting liquidity deeper in the book carries real information about the short-term price-formation process and is not just noise. For any microstructure model, the practical takeaway is to feature-engineer MLOFI across all levels rather than truncating at the top of book.
Key metrics: out-of-sample R² monotonically improves with each additional price level for all 6 Nasdaq stocks tested, with the marginal contribution diminishing but remaining positive well beyond the top of book. The linear MLOFI model is shown to meaningfully outperform best-level OFI.
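A sketch of the core computation follows: per-level order-flow imbalance using the Cont–Kukanov–Stoikov update rule applied level by level, which is the building block of the MLOFI vector. The data layout is assumed, and the regression step is only indicated in comments.

```python
import numpy as np

def level_ofi(bid_p, bid_q, ask_p, ask_q):
    """Order-flow imbalance at one price level: the Cont-Kukanov-Stoikov
    update rule applied to that level's quotes. Inputs are 1-d arrays
    sampled at book-update times."""
    e = np.zeros(len(bid_p) - 1)
    for t in range(1, len(bid_p)):
        if bid_p[t] > bid_p[t - 1]:    e[t - 1] += bid_q[t]            # fresh bid liquidity
        elif bid_p[t] == bid_p[t - 1]: e[t - 1] += bid_q[t] - bid_q[t - 1]
        else:                          e[t - 1] -= bid_q[t - 1]        # bid level lost
        if ask_p[t] < ask_p[t - 1]:    e[t - 1] -= ask_q[t]            # fresh ask liquidity
        elif ask_p[t] == ask_p[t - 1]: e[t - 1] -= ask_q[t] - ask_q[t - 1]
        else:                          e[t - 1] += ask_q[t - 1]        # ask level lost
    return e

# MLOFI stacks level_ofi across the first M levels and regresses the
# contemporaneous mid-price change on the stacked vector, e.g. with
# levels[m] = (bid_p, bid_q, ask_p, ask_q) at depth m:
#   X = np.column_stack([level_ofi(*levels[m]) for m in range(M)])
#   beta, *_ = np.linalg.lstsq(X, mid_change, rcond=None)
```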
Deep Limit Order Book Forecasting#
Antonio Briola, Silvia Bartolucci, and Tomaso Aste (University College London) use state-of-the-art deep learning models to forecast short-horizon mid-price moves from limit order book data for a heterogeneous set of Nasdaq stocks. They release LOBFrame, an open-source codebase for large-scale LOB processing and deep model benchmarking, and they introduce a novel operational evaluation metric based on the probability of accurately forecasting complete transactions rather than mid-price moves.
Our summary: this is the paper to cite whenever someone claims a high-accuracy deep LOB model. The authors show that (a) standard classification metrics (F1, accuracy) on mid-price direction systematically overstate the economic usefulness of a forecaster, and (b) the same architecture performs very differently across stocks with different microstructural regimes, so one-size-fits-all benchmarking is misleading. Their proposed operational metric asks “given the forecast, what is the probability of capturing a complete round-trip transaction at a reasonable spread?” — a framing that aligns the model with actual trading P&L rather than label guessing. The stark headline from the paper is that “high forecasting power does not necessarily correspond to actionable trading signals”, which should be the warning label on every deep LOB result.
Key metrics: the paper reports that forecasting performance varies strongly with stock-level microstructural regime and that standard classification metrics routinely overstate actionability. The operational transaction-forecasting metric is substantially harder to beat and brings deep model performance closer to economic reality.
Cross-Impact of Order Flow Imbalance in Equity Markets#
Rama Cont, Mihai Cucuringu, and Chao Zhang (Oxford) investigate cross-asset order-flow-imbalance effects in a multi-asset equity setting. They propose a systematic way to combine OFIs at the top levels of the limit order book into an integrated OFI variable and test whether lagged cross-asset OFIs add predictive power for future returns beyond the contemporaneous impact of own-asset OFI.
Our summary: this paper answers an important but previously neglected question — is the OFI of stock A useful for predicting the return of stock B? The authors show first that integrating OFI across multiple LOB levels (following the MLOFI line of work) dominates a best-level-only construction, and second that once you have properly integrated multi-level own-OFI, cross-asset contemporaneous cross-impact vanishes — sparse single-asset models explain as much as dense cross-impact models. However, lagged cross-asset OFI does improve short-horizon return forecasting, and that lead-lag information decays rapidly. Practical takeaway: for contemporaneous price-impact modelling, focus on own-asset multi-level OFI; for short-horizon return prediction, a small amount of lagged cross-sectional OFI adds value.
Key metrics: integrated multi-level OFI materially outperforms best-level OFI for contemporaneous impact; lagged cross-asset OFIs add meaningful forecasting power over short horizons with rapid decay. Published in Quantitative Finance Vol 23 Issue 10 (2023), pp 1373–1393.
Bitcoin Wild Moves: Evidence from Order Flow Toxicity and Price Jumps#
Atiwat Kitvanitphasu, Khine Kyaw, Tanakorn Likitapiwat, and Sirimon Treepongkaruna study the dynamic relationship between order-flow toxicity (measured via VPIN) and Bitcoin price jumps using high-frequency data in a vector autoregression framework. Published in Research in International Business and Finance (2026), the paper integrates behavioural finance with market microstructure to understand how informed trading drives jumps and how traders respond.
Our summary: this is the single most directly relevant paper for “can microstructure features predict future large crypto moves?” Unlike the bulk of the OFI/LOB literature, which is pinned to horizons of seconds to a few price changes, this paper explicitly shows that VPIN predicts future Bitcoin price jumps in a VAR framework — not just contemporaneously. It also documents positive serial correlation in both VPIN and jump size (so both toxicity and jumps cluster), time-of-day and day-of-week patterns, and weak reverse feedback from jumps into VPIN. The jump detection is robust to the Jiang–Oomen test that explicitly accounts for microstructure noise. For a strategy trying to nowcast or short-horizon-forecast BTC crashes or breakouts, this is the primary academic anchor.
Key metrics: VPIN Granger-causes Bitcoin jump occurrence and size in a bivariate VAR; positive autocorrelation in VPIN and in jump magnitudes; significant time-of-day / day-of-week seasonality in VPIN levels; results are robust across Lee–Mykland and Jiang–Oomen jump detection tests.
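For readers unfamiliar with the toxicity measure, here is a simplified VPIN sketch using bulk volume classification (Easley, López de Prado, and O'Hara). Classifying at bucket granularity and using a global price-change volatility are simplifications relative to the standard estimator, and none of the settings are taken from this paper.

```python
import numpy as np
from scipy.stats import norm

def vpin(price, volume, bucket_vol, n_buckets=50):
    """Simplified VPIN: group trades into equal-volume buckets, classify
    buy volume via the bulk rule V_buy = V * Phi(dP / sigma), then take a
    rolling mean of |V_buy - V_sell| / V over `n_buckets` buckets.
    `price` and `volume` are per-trade NumPy arrays."""
    bucket = (np.cumsum(volume) // bucket_vol).astype(int)
    sigma = np.std(np.diff(price))              # global proxy for dP volatility
    imbalances = []
    for b in np.unique(bucket):
        m = bucket == b
        dp = price[m][-1] - price[m][0]         # bucket price change
        v = volume[m].sum()
        buy = v * norm.cdf(dp / sigma)          # bulk-classified buy volume
        imbalances.append(abs(2 * buy - v) / v) # |V_buy - V_sell| / V
    kernel = np.ones(n_buckets) / n_buckets
    return np.convolve(np.array(imbalances), kernel, mode="valid")
```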
High-Frequency Jump Analysis of the Bitcoin Market#
Olivier Scaillet, Adrien Treccani, and Christopher Trevisan use the leaked Mt. Gox database, which covers June 2011 through November 2013 with trader identifiers at the individual-transaction level, to study the jump dynamics of Bitcoin. The data provides a rare opportunity to observe an emerging retail-focused, highly speculative and unregulated market at the individual-trader level.
Our summary: this is the foundational jump-clustering paper for Bitcoin and a primary reference for anyone modelling crypto microstructure tail events. The authors document that jumps are frequent and temporally clustered, and they identify specific microstructure precursors — elevated order-flow imbalance, an increased share of aggressive (liquidity-taking) traders, and a widening bid-ask spread — that predict jump arrivals. Jumps cause short-term spikes in activity and illiquidity and, importantly, are associated with persistent changes in the price level, so the price does not mean-revert through them. The finding that OFI and aggressive-trader share predict jumps provides early academic support for the intuition that current feature-engineering projects try to operationalise.
Key metrics: jumps are frequent and cluster in time; OFI, aggressive-trader share, and widening bid-ask spread all predict jump arrivals with statistically significant coefficients; jumps correspond to short-term illiquidity spikes and persistent (non-reverting) price moves.
Good and Bad Self-Excitation in Bitcoin: Asymmetric Self-Exciting Jumps#
Chuanhai Zhang, Zhengjun Zhang, Mengyu Xu, and Zhe Peng (Economic Modelling, 2023) model asymmetric self-exciting jump clustering in Bitcoin returns using a bivariate Hawkes-type jump process that separates positive (“good”) and negative (“bad”) jumps. The paper studies how each jump type contributes to subsequent jump intensity.
Our summary: this paper formalises the intuition that Bitcoin’s tail risk is skewed — bad news begets more bad news faster than good news begets more good news. The bivariate Hawkes setup allows separate branching ratios for good and bad jumps and tests whether they are different. The empirical answer is clear: negative self-excitation is strictly stronger than positive self-excitation, with longer persistence. The asymmetry is much more pronounced in bear regimes than in bull regimes. For any strategy that wants to exploit jump clustering, this paper is the reference for why a single symmetric Hawkes process misses structurally important behaviour, and why the downside jump channel deserves its own modelling.
Key metrics: negative jumps trigger substantially more subsequent volatility than positive jumps; the asymmetry holds in bear markets and is muted in bull markets; aftershocks of bad self-excitation persist longer than those of good self-excitation.
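The asymmetry is easiest to see in the intensity function itself. Below is a minimal bivariate-Hawkes intensity sketch; the parameter values are illustrative stand-ins for the paper's estimates, chosen so that bad jumps excite future jumps more strongly and more persistently than good jumps.

```python
import numpy as np

def jump_intensity(t, events, mu, alpha, beta):
    """Intensity of one jump type in a bivariate Hawkes model:
    lambda(t) = mu + sum over types j of alpha_j * sum_{t_k < t} exp(-beta_j * (t - t_k))."""
    lam = mu
    for j, times in events.items():
        past = times[times < t]
        lam += alpha[j] * np.exp(-beta[j] * (t - past)).sum()
    return lam

events = {"good": np.array([0.5, 2.0]), "bad": np.array([1.0, 1.2, 1.4])}
lam_bad = jump_intensity(3.0, events, mu=0.1,
                         alpha={"good": 0.2, "bad": 0.6},   # bad -> bad excitation stronger
                         beta={"good": 2.0, "bad": 0.8})    # and slower to decay
```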
Nowcasting Bitcoin’s Crash Risk with Order Imbalance#
Dimitrios Koutmos (Texas A&M – Corpus Christi) and Wang Chun Wei (Realindex Investments) build an early-warning system for Bitcoin price crashes using generalized extreme value (GEV) and logistic regression models. Their feature set integrates order-flow imbalance with blockchain-activity and network-value controls, on daily Bitcoin data from 1 April 2013 to 15 January 2023 (3,577 days).
Our summary: this is the clearest-cut empirical study of order imbalance as a crash predictor on Bitcoin at the daily horizon. The authors define a crash as a daily return at least one standard deviation below the mean (~−3.57%), giving 318 crash days or 8.89% of the sample. They show that adding OFI to a crash-nowcasting model dramatically improves explanatory power over models that use only blockchain / network fundamentals. The best logistic specification (model 3.5) reaches a McFadden pseudo-R² of 30.74%, with the GEV variant at 29.95% — both extremely high for a binary daily crash model. This is strong empirical backing for the idea that daily-aggregated order imbalance is an input to a BTC tail-risk timing model, even if it is not a timing signal for normal market moves.
Key metrics: McFadden pseudo-R² of 30.74% (logistic) and 29.95% (GEV) on the best specification; 318 crash days out of 3,577 (8.89% base rate); OFI materially improves crash nowcasting versus on-chain and network-value controls alone. Published in Review of Quantitative Finance and Accounting (2023).
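The modelling recipe is simple enough to sketch end to end: label crashes from the return series, fit a logistic model on OFI plus controls, and score it with McFadden's pseudo-R². The generic feature matrix below is a placeholder, not the paper's specification 3.5.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def crash_nowcast(features, returns, k=1.0):
    """Label a crash as a daily return at least k standard deviations
    below the mean (the paper uses one sigma, about -3.57% in its sample),
    fit a logistic model, and report McFadden's pseudo-R^2.
    `features` is an (n_days, n_features) array, `returns` a 1-d array."""
    y = (returns < returns.mean() - k * returns.std()).astype(int)
    model = LogisticRegression(max_iter=1000).fit(features, y)
    p = model.predict_proba(features)[:, 1]
    ll = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))       # fitted log-likelihood
    p0 = y.mean()
    ll0 = np.sum(y * np.log(p0) + (1 - y) * np.log(1 - p0))    # intercept-only benchmark
    return model, 1 - ll / ll0                                  # McFadden pseudo-R^2
```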
Where Is the Price of Bitcoin Determined? Price Discovery in a Fragmented Market#
Riccardo Cosenza and Simon Stalder investigate price discovery for Bitcoin across regulated and unregulated spot and perpetual-futures venues using high-frequency data in both fiat and stablecoin markets. They apply Hasbrouck information shares, Gonzalo–Granger component shares, and Putniņš information-leadership shares in a trivariate VECM with FX, as well as bivariate VECMs with prices converted to a common currency.
Our summary: this is the reference paper for the question “which venue sets the Bitcoin price?” The answer is blunt: unregulated crypto-native venues (primarily Binance spot and perpetual futures) dominate price discovery during the vast majority of trading hours. Regulated venues like Coinbase gain relative importance around specific fixing windows, notably the NY 4pm fixing used as an ETF NAV reference, but outside those windows they follow. Lower transaction costs, higher volume, and higher volatility all enhance a venue’s price-discovery share. For any cross-venue aggregation strategy, this paper tells you where to look (Binance) and which intervals may exhibit different dynamics (NY afternoon). It also raises real regulatory concerns about the reference price for Bitcoin ETFs.
Key metrics: Binance’s information share dominates across most hours of the trading day; Coinbase’s share rises around the NY 4pm fixing; information-leadership shares align with the intuition that lower-cost, higher-volume, higher-volatility venues lead price formation.
Pricing Efficiency and Arbitrage in the Bitcoin Spot and Futures Markets#
Seungho Lee, Nabil El Meslmani, and Lorne N. Switzer (Concordia University) study Bitcoin pricing efficiency using spot prices alongside all CBOE and CME futures contracts traded from January 2018 to March 2019. They test cost-of-carry and no-arbitrage conditions and the predictive content of the futures basis for subsequent spot moves.
Our summary: this is the empirical complement to He–Manela–Ross–von Wachter’s theoretical perpetuals-pricing work, focused on the dated futures era (CBOE XBT, CME BTC). The authors find that the futures basis has some predictive power for future spot-price changes and for the risk premium but is a biased predictor, so it cannot be used as a clean standalone signal. Cointegration tests confirm that futures prices are biased predictors of spot. Arbitrage deviations are persistent and, crucially, widen substantially around Bitcoin-specific events — security incidents (hacks) and alt-coin launches — which is the same limits-to-arbitrage pattern documented by Hattori–Ishida. Published in Research in International Business and Finance Vol 53 (2020).
Key metrics: basis has predictive power for future spot moves but is a biased predictor; cointegration/VECM tests confirm the bias; no-arbitrage deviations widen around hacks and alt-coin launches, matching the capital-constraints-in-crises pattern seen in other basis-crypto studies.
The 10 Reasons Most Machine Learning Funds Fail#
Marcos López de Prado (Journal of Portfolio Management 2018) distils the ten most common methodological mistakes that cause quantitative ML funds to fail in practice. The paper is the canonical practitioner reference for the methodology family that follows: information-driven bars, triple-barrier labelling, meta-labelling, sample uniqueness, purged K-fold cross-validation, combinatorial purged CV, deflated Sharpe ratio, and probability of backtest overfitting.
Our summary: this is the single most important methodological paper for anyone doing serious machine-learning research on financial time series. It is prescriptive rather than empirical — the value is in the enumeration of pitfalls and the recipes to fix them. The ten pitfalls and their proposed remedies are: (1) the “Sisyphus” lone-PM research model → an assembly-line meta-strategy paradigm; (2) research-through-backtesting → feature-importance-driven research; (3) fixed chronological time bars → volume / dollar / information-driven bars; (4) integer differentiation → fractional differentiation; (5) fixed-time-horizon labelling → triple-barrier labelling; (6) learning direction and size jointly → meta-labelling; (7) ignoring the non-IID structure of financial data → uniqueness-weighted sample and sequential bootstrap; (8) leaky cross-validation → purging and embargoing; (9) walk-forward backtesting → Combinatorial Purged Cross-Validation (CPCV); (10) in-sample-maximised backtests → Deflated Sharpe Ratio. Every item on this list corresponds to a quantifiable source of bias or overfitting, and every one has a concrete fix. If a research project does not systematically address at least the last five items, its reported Sharpe is almost certainly inflated.
Key metrics: this is a methodological paper with no single headline number. The operational claim is that the vast majority of quantitative ML strategies fail out-of-sample because of one or more of the ten pitfalls, and that the listed corrections (CPCV, DSR, PBO, triple-barrier, meta-labelling, sample uniqueness) are individually necessary.
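As one concrete example of the fixes, pitfall 8 (leaky cross-validation) is addressed by purging and embargoing. Here is a simplified sketch of purged K-fold in the spirit of the paper; the fold construction details differ from the book implementation.

```python
import numpy as np

def purged_kfold(label_start, label_end, n_splits=5, embargo=0.01):
    """Yield (train, test) index arrays. `label_start` and `label_end` are
    integer arrays giving the span over which each sample's label is
    measured. Training samples whose label interval overlaps the test
    fold are purged, and a further `embargo` fraction of samples
    immediately after the fold is excluded, removing leakage through
    overlapping labels."""
    n = len(label_start)
    emb = int(n * embargo)
    for test in np.array_split(np.arange(n), n_splits):
        t0, t1 = label_start[test].min(), label_end[test].max()
        train = np.array([i for i in range(n)
                          if label_end[i] < t0 or label_start[i] > t1 + emb])
        yield train, test
```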
The Deflated Sharpe Ratio: Correcting for Selection Bias, Backtest Overfitting and Non-Normality#
David H. Bailey (Lawrence Berkeley National Lab / UC Davis) and Marcos López de Prado (Cornell) introduce the Deflated Sharpe Ratio (DSR) — a closed-form correction to the standard Sharpe ratio that accounts for (i) selection bias under multiple testing when many trial strategies are evaluated, and (ii) non-normal return distributions via higher-moment terms (skewness and kurtosis). Published in The Journal of Portfolio Management (2014).
Our summary: the Sharpe ratio is a very unreliable performance metric in any research pipeline that tests more than a handful of candidate strategies. DSR addresses the root problem by computing the probability that the observed Sharpe of the selected strategy exceeds a given benchmark Sharpe, conditional on the number of trials, the cross-sectional variance of trial Sharpes, and the skewness and kurtosis of the selected strategy’s returns. The result is a single probability that can be thresholded and compared across projects with very different research budgets. In practice, DSR is the mandatory deflator for any parameter-sweep or grid-search backtest; without it, reported Sharpes are systematically optimistic. DSR is one of the ten fixes enumerated in “10 Reasons Most Machine Learning Funds Fail” and is a direct prerequisite for publishable quantitative research.
Key metrics: methodological paper; provides the closed-form DSR formula and worked examples. The operational benefit is a probability-valued confidence statement about the true Sharpe, calibrated to the search budget used to find the strategy.
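The closed form is compact enough to reproduce. The sketch below follows the two-step logic of the paper: estimate the expected maximum Sharpe among N unskilled trials, then apply the probabilistic-Sharpe correction for non-normality against that benchmark. Variable names are ours.

```python
import numpy as np
from scipy.stats import norm

def deflated_sharpe_ratio(sr, n_obs, skew, kurt, n_trials, var_trial_sr):
    """DSR per Bailey and Lopez de Prado (2014). `sr` is the per-period
    (non-annualised) Sharpe of the selected strategy, `kurt` is non-excess
    kurtosis, `n_trials` the number of strategies tried, and
    `var_trial_sr` the cross-sectional variance of the trial Sharpes."""
    gamma = 0.5772156649                     # Euler-Mascheroni constant
    # expected maximum Sharpe among n_trials under the zero-skill null
    sr0 = np.sqrt(var_trial_sr) * ((1 - gamma) * norm.ppf(1 - 1 / n_trials)
                                   + gamma * norm.ppf(1 - 1 / (n_trials * np.e)))
    # probabilistic Sharpe ratio of `sr` against the sr0 benchmark
    num = (sr - sr0) * np.sqrt(n_obs - 1)
    den = np.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return norm.cdf(num / den)
```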
The Probability of Backtest Overfitting#
David H. Bailey, Jonathan M. Borwein, Marcos López de Prado, and Qiji Jim Zhu propose a general framework — Probability of Backtest Overfitting (PBO) — to assess the likelihood that a strategy selected as best in-sample will underperform the median strategy out-of-sample. They introduce Combinatorially Symmetric Cross-Validation (CSCV) as a model-free, non-parametric estimator of PBO. Published in the Journal of Computational Finance (2017).
Our summary: this is the companion paper to the Deflated Sharpe Ratio and the practical workhorse behind any honest evaluation of a multi-trial backtest. The authors demonstrate that standard hold-out and walk-forward evaluations are structurally unreliable for strategy selection — they consistently underestimate the true risk that the selected strategy is a lucky survivor of multiple testing. CSCV fixes this by partitioning the time series into S equally-sized blocks and exhaustively evaluating every combinatorially-symmetric assignment of blocks into in-sample and out-of-sample halves. For each configuration, it measures whether the IS winner underperforms the OOS median; the fraction of configurations in which this happens is the PBO estimate. The approach is model-free (no assumption about return distributions), non-parametric, and reasonably compute-efficient, and it directly quantifies overfitting risk as a single probability.
Key metrics: PBO is defined as the probability that the IS-best strategy underperforms the OOS-median strategy; CSCV gives reasonable empirical PBO estimates across examples in the paper; naive hold-out is shown to be systematically unreliable for strategy selection.
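The CSCV procedure maps almost directly to code. The sketch below implements the fraction-below-median estimator exactly as described above; the full paper works with logits of the relative out-of-sample rank, which refines but does not change the idea.

```python
import numpy as np
from itertools import combinations

def pbo_cscv(returns, s=8):
    """Estimate PBO via CSCV. `returns` is a T x N matrix with one column
    per trial strategy. Time is cut into `s` blocks; for every symmetric
    half/half block assignment we check whether the in-sample Sharpe
    winner underperforms the out-of-sample median."""
    blocks = np.array_split(np.arange(len(returns)), s)
    sharpe = lambda r: r.mean(axis=0) / r.std(axis=0)
    overfit, total = 0, 0
    for half in combinations(range(s), s // 2):
        is_rows = np.concatenate([blocks[i] for i in half])
        oos_rows = np.concatenate([blocks[i] for i in range(s) if i not in half])
        winner = np.argmax(sharpe(returns[is_rows]))
        oos_sr = sharpe(returns[oos_rows])
        overfit += oos_sr[winner] <= np.median(oos_sr)
        total += 1
    return overfit / total
```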
Clustered Feature Importance#
Marcos López de Prado introduces Clustered Feature Importance (CFI) to address a major failure mode of standard Mean-Decrease-Impurity (MDI) and Mean-Decrease-Accuracy (MDA) feature-importance measures: the “substitution effect” that distorts importances when two or more features share predictive power. The methodology groups correlated / redundant features into clusters using the Optimal Number of Clusters (ONC) algorithm and then computes importance at the cluster level. Subsequently incorporated into Machine Learning for Asset Managers (Cambridge, 2020).
Our summary: any research pipeline that selects features by MDI or MDA is vulnerable to collinear-feature substitution: two correlated features will split the true importance between them, causing both to appear weakly important and potentially to be dropped. CFI is the right fix: cluster correlated features first, then compute importance on the clusters, either by shuffling all members of a cluster jointly (clustered MDA) or by aggregating impurity decreases across cluster members (clustered MDI). The cluster count is data-driven via ONC, which uses a silhouette-t-statistic elbow over candidate k values. The result is a feature-importance ranking that correctly identifies the importance of groups of substitutable features instead of being diluted by their collinearity. For microstructure feature libraries where OFI, book imbalance, and CVD are heavily correlated, CFI is essentially mandatory.
Key metrics: methodological paper; CFI is demonstrated to be robust to both linear and non-linear substitution effects and to recover the correct ranking of relevant feature groups in simulations where standard MDI/MDA fail.
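A minimal clustered-MDA sketch shows the key move: shuffle all members of a cluster jointly so that correlated features cannot cover for one another. The clustering step itself (ONC) is omitted here; the cluster assignment is taken as given.

```python
import numpy as np
from sklearn.metrics import log_loss

def clustered_mda(model, X, y, clusters, n_reps=10, seed=0):
    """Clustered MDA: the importance of a cluster is the average increase
    in log-loss when all of its feature columns are permuted together.
    `clusters` maps cluster name -> list of column indices; `model` is a
    fitted classifier and (X, y) an evaluation set (NumPy arrays)."""
    rng = np.random.default_rng(seed)
    base = log_loss(y, model.predict_proba(X))
    importance = {}
    for name, cols in clusters.items():
        losses = []
        for _ in range(n_reps):
            Xp = X.copy()
            perm = rng.permutation(len(X))
            Xp[:, cols] = X[perm][:, cols]    # joint shuffle kills substitution
            losses.append(log_loss(y, model.predict_proba(Xp)))
        importance[name] = np.mean(losses) - base
    return importance
```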
Does Meta-Labeling Add to Signal Efficacy?#
Ashutosh Singh and Jacques Francois Joubert (Hudson & Thames, 2022) empirically evaluate whether meta-labelling — a secondary machine-learning classifier layered on top of a primary trading signal to predict whether to act on that signal — improves signal efficacy. They combine event-based sampling (CUSUM) with triple-barrier labels and test on S&P 500 E-mini futures tick data from July 2011 to February 2019.
Our summary: meta-labelling is one of the ten fixes in López de Prado’s “10 Reasons” paper, and this whitepaper is the best compact empirical demonstration that the fix actually works. Two primary strategies are tested; for both, adding a meta-label classifier improves classification precision and accuracy substantially, and the improvements carry over (albeit more modestly) to the held-out test set. The intuition is that the primary signal decides the side of the trade, while the meta-label classifier decides the size — specifically whether to take the trade at all or skip it. Skipping low-confidence primary signals filters out a large share of false positives and dramatically reduces the noise in the P&L stream without requiring any change to the primary strategy itself.
Key metrics: Strategy 1 validation set — accuracy improves 20% → 77%, precision 0.21 → 0.39; Strategy 1 test (OOS) — accuracy 17% → 63%, precision 0.17 → 0.20. Strategy 2 validation — accuracy 37% → 56%, precision 0.37 → 0.42. Portfolio-level Sharpe and drawdown metrics also improve out-of-sample.
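In code, the meta-labelling layer is a small wrapper around the primary signal: the label says whether acting on the signal would have paid, and the secondary classifier learns when that is likely. The model choice and feature stacking below are our assumptions, not the whitepaper's exact setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def meta_labels(primary_side, realized_ret):
    """1 if taking the primary signal's side would have been profitable,
    else 0. `primary_side` is +1/-1, `realized_ret` the trade-horizon return."""
    return (primary_side * realized_ret > 0).astype(int)

def fit_meta_model(features, primary_side, realized_ret):
    """Secondary classifier: the primary model picks the side, the meta
    model decides whether (and implicitly how much) to act."""
    y = meta_labels(primary_side, realized_ret)
    X = np.column_stack([features, primary_side])
    return RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# the fitted model's predict_proba(...)[:, 1] then gates or sizes the
# primary signal, skipping low-confidence trades
```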
Algorithmic Crypto Trading Using Information-Driven Bars, Triple-Barrier Labeling and Deep Learning#
Przemysław Grądzki, Piotr Wójcik (University of Warsaw), and Stefan Lessmann (Humboldt-Universität zu Berlin) apply the full López de Prado methodological pipeline — information-driven sampling, triple-barrier labels, meta-labelling — to Bitcoin and Ethereum trading, combined with a deep learning classifier. Published in Financial Innovation (2025).
Our summary: this is the clearest end-to-end empirical application of the AFML pipeline to crypto. The authors benchmark fixed-time bars against CUSUM filters, range bars, volume bars and dollar bars, and replace next-bar return prediction with triple-barrier labelling that uses volatility-scaled up/down barriers and a time-out. The result is that information-driven sampling meaningfully outperforms time bars on both classification and trading metrics for BTC and ETH, and that the triple-barrier labels produce targets better aligned with realised trading outcomes than standard next-bar classification. For a crypto strategy that wants a credible methodological anchor, this paper is the empirical reference point: it shows that every piece of the AFML pipeline adds economic value when applied correctly to the right data.
Key metrics: information-driven bars (CUSUM, dollar, volume) improve both classification metrics and backtest-level trading metrics relative to time bars on BTC and ETH; end-to-end pipeline produces positive net-of-cost results. Specific numerical Sharpe / F1 values are in the paper tables.
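Since the labelling step is central to the result, here is a minimal triple-barrier labeller with volatility-scaled barriers and a time-out, in the spirit of the pipeline described above; the barrier multipliers and holding cap are illustrative, not the paper's tuned values.

```python
def triple_barrier_label(prices, t0, vol, pt=2.0, sl=2.0, max_hold=100):
    """Label the event at index t0: +1 if the upper (profit-taking)
    barrier at pt*vol is touched first, -1 if the lower (stop-loss)
    barrier at sl*vol is touched first, 0 if the time-out expires.
    `vol` is a volatility estimate at t0 used to scale both barriers."""
    p0 = prices[t0]
    upper, lower = p0 * (1 + pt * vol), p0 * (1 - sl * vol)
    for t in range(t0 + 1, min(t0 + max_hold, len(prices))):
        if prices[t] >= upper:
            return 1
        if prices[t] <= lower:
            return -1
    return 0
```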
Stock Price Prediction Using Triple-Barrier Labeling and Raw OHLCV Data: Evidence from Korean Markets#
Sungwoo Kang shows that simple deep learning models trained on raw OHLCV data can match more elaborate models — including XGBoost with engineered technical features — when the prediction target is created using an optimised triple-barrier labelling scheme. Evaluated on the Korean equity universe 2006–2024.
Our summary: this paper is a useful data point for two questions. First, it confirms that triple-barrier labelling extends beyond the high-frequency microstructure domain where it is usually demonstrated — it works at multi-day horizons on daily equity data. Second, it shows that feature engineering and labelling choices often matter more than model complexity: an LSTM fed only raw OHLCV matches a heavily-engineered XGBoost baseline once the label is defined sensibly. The authors identify the optimal triple-barrier configuration as a 29-day holding-period window with 9% barrier width, which they select to balance the label distribution. The best LSTM configuration is small (hidden size 8, window size 100), which is a useful reminder that model depth is rarely the bottleneck on financial time series.
Key metrics: optimal triple-barrier configuration of 29-day window with 9% barriers; best LSTM configuration has window size 100 and hidden size 8; LSTM on raw OHLCV matches XGBoost with technical indicators; full OHLCV input outperforms close-only or close+volume variants.
Time Series Momentum#
Tobias J. Moskowitz, Yao Hua Ooi, and Lasse Heje Pedersen (Journal of Financial Economics, 2012) document a remarkably broad time-series momentum effect across 58 liquid futures and forward contracts spanning equity indices, sovereign bonds, currencies, and commodities. Using a security’s own past return rather than its return relative to peers, they show that return persistence is strong from one to twelve months and then partially reverses at longer horizons.
Our summary: this is the canonical modern TSMOM paper. It matters because it separates “trend following” from the usual cross-sectional winner-minus-loser construction and shows that a security’s own lagged return has predictive power across very different asset classes. The paper also argues that most of the effect comes from positive auto-covariance rather than lead-lag structure across assets, and it links the profits to speculators effectively riding the trend while hedgers absorb the other side. If you need one foundational citation for multi-asset trend following, this is it.
Key metrics: the paper reports positive 12-month time-series momentum profits for all 58 contracts in the sample. A diversified TSMOM portfolio earns a Sharpe ratio greater than 1 on an annual basis, roughly 2.5x the Sharpe ratio of the equity market, with little correlation to passive asset-class benchmarks or standard pricing factors.
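The strategy itself reduces to a few lines. The sketch below uses the paper's sign-of-12-month-return rule and a constant per-contract volatility target (40% in the paper); the EWMA volatility proxy and daily frequency are our simplifications.

```python
import numpy as np
import pandas as pd

def tsmom_returns(prices: pd.DataFrame, target_vol=0.40, lookback=252):
    """TSMOM in the spirit of Moskowitz, Ooi, and Pedersen: long (short)
    each contract whose own trailing 12-month return is positive
    (negative), scaled to a constant ex-ante volatility target per
    contract, then equal-weighted across contracts."""
    ret = prices.pct_change()
    sign = np.sign(prices.pct_change(lookback))
    ex_ante_vol = ret.ewm(span=60).std() * np.sqrt(252)      # simple annualised proxy
    positions = (sign * target_vol / ex_ante_vol).shift(1)   # trade with a one-day lag
    return (positions * ret).mean(axis=1)
```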
A Century of Evidence on Trend-Following Investing#
Brian K. Hurst, Yao Hua Ooi, and Lasse H. Pedersen (Journal of Portfolio Management, 2017) extend the trend-following evidence far beyond the modern futures era by constructing a time-series momentum strategy back to 1880. The paper asks whether the strong post-1985 performance of managed futures and trend-following strategies is just a recent lucky sample or a deeper empirical regularity that survives very different macro regimes.
Our summary: this is the long-horizon historical companion to Moskowitz, Ooi, and Pedersen. Its main value is not a novel signal, but historical durability. By stitching together much older data sources, it shows that trend following survives wars, inflation shocks, monetary regime changes, and long stretches where traditional assets behave very differently from the late-20th-century sample. That makes it one of the most important papers for anyone trying to argue that trend is a structural phenomenon rather than a short backtest artifact.
Key metrics: the authors construct a trend-following strategy back to 1880 and report that it has been consistently profitable in each decade for more than a century. The paper explicitly confirms that the post-1985 results from the modern multi-asset futures sample are not an isolated fluke.
Two Centuries of Trend Following#
Y. Lempérière, C. Deremble, P. Seager, M. Potters, and J.-P. Bouchaud push the historical evidence even further by studying trend-following strategies across commodities, currencies, stock indices, and bonds using both futures data since 1960 and spot series that go back to 1800 for some assets. The paper’s framing is explicitly anomaly-focused: are long-horizon trend returns statistically strong enough to count as one of the major persistent irregularities in financial markets?
Our summary: this paper is the strongest statistical durability argument in the trend-following literature. The long sample lets the authors separate two issues that often get conflated: whether trends exist at all, and whether modern trend strategies have decayed. Their answer is nuanced: long trends remain statistically robust, but shorter trends have weakened in the recent era. The paper also documents signal saturation for very strong trends, consistent with the idea that fundamental traders only start leaning against price moves once the trend becomes extreme.
Key metrics: the paper reports an overall t-statistic of about 5 for excess returns since 1960 and about 10 since 1800, after accounting for the upward drift of markets. The authors describe the effect as stable across both time and asset classes.
Time Series Momentum and Volatility Scaling#
Abby Y. Kim, Yiuman Tse, and John K. Wald (Journal of Financial Markets, 2016) revisit the Moskowitz-Ooi-Pedersen result and ask how much of the observed alpha is really driven by the momentum signal itself versus the volatility-scaling overlay that accompanies the strategy. They compare volatility-scaled and unscaled versions of TSMOM and buy-and-hold portfolios at the contract, sector, and aggregate portfolio levels.
Our summary: this is the paper that forced the TSMOM literature to treat signal and position sizing as separate objects. The central result is uncomfortable but important: much of what looks like “momentum alpha” in managed futures-style backtests is actually the effect of volatility targeting or risk parity-style scaling. That does not make TSMOM useless, but it does mean any honest evaluation has to attribute performance carefully rather than crediting the sign signal for the whole package.
Key metrics: the paper finds that large TSMOM alphas are largely driven by volatility scaling. Unscaled TSMOM alphas look similar to unscaled buy-and-hold alphas, and scaled TSMOM alphas look similar to scaled buy-and-hold alphas, with the pattern holding at the individual-contract, sector, and portfolio levels.
Is Momentum Really Momentum?#
Robert Novy-Marx (Journal of Financial Economics, 2012) argues that the classic momentum effect is driven much more by intermediate-horizon returns than by very recent continuation. In his formulation, what investors usually call “momentum” often looks more like an echo from returns earned roughly 12 to 7 months before portfolio formation than a clean near-term persistence effect.
Our summary: this paper matters because it reframes momentum from a single monolithic anomaly into a term-structure question. If intermediate-horizon returns do the heavy lifting while the most recent month is polluted by reversal, then naive short-lookback momentum rules are badly specified from the start. That is exactly why this paper remains important background for anyone experimenting with very short-horizon trend signals: it explains why some “momentum” setups fail even when the broader anomaly is real.
Key metrics: Novy-Marx finds that strategies based on firms’ performance 12 to 7 months before formation are more profitable than strategies based on the recent 6 to 2 months, especially among the largest and most liquid stocks. He also shows that similar intermediate-horizon patterns appear in international equity indices, commodities, and currencies.
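The two signals are just different lookback windows, which makes the distinction easy to test in any backtest. A pandas sketch follows, assuming roughly 21 trading days per month.

```python
import pandas as pd

def momentum_legs(prices: pd.DataFrame, month=21):
    """Novy-Marx's decomposition: 'intermediate' momentum uses returns
    from 12 to 7 months before formation; 'recent' momentum uses 6 to 2
    months before formation."""
    intermediate = prices.shift(7 * month) / prices.shift(12 * month) - 1
    recent = prices.shift(2 * month) / prices.shift(6 * month) - 1
    return intermediate, recent
```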
Time Series Momentum: Is It There?#
Dashan Huang, Jiangyuan Li, Liyao Wang, and Guofu Zhou (Journal of Financial Economics, 2020) reexamine the Moskowitz-Ooi-Pedersen evidence using the same broad futures data set but a more skeptical statistical lens. They argue that pooled regressions overstate the strength of TSMOM because of fixed-effect issues, persistent predictors, and the interaction between volatility scaling and heterogeneous asset means.
Our summary: this is the main empirical challenge paper in the TSMOM debate. The authors’ point is not that trend-following strategies cannot make money, but that the specific claim of statistically reliable return predictability from past 12-month returns is much weaker than it first appears once you examine assets individually and use bootstrap-adjusted inference. Even if one ultimately disagrees with the conclusion, this paper is essential because it raises the methodological bar for what counts as convincing TSMOM evidence.
Key metrics: in asset-by-asset regressions, 47 of 55 assets have t-statistics below 1.65 and only 3 assets deliver significant out-of-sample R^2 at the 10% level. In pooled regressions the headline t-statistic is 4.34, but the paper reports bootstrap critical values of 12.53 and 4.83, implying the pooled result is not statistically significant under their more conservative tests.
Risks and Returns of Cryptocurrency#
Yukun Liu and Aleh Tsyvinski (Review of Financial Studies, 2021) provide one of the foundational asset-pricing treatments of major cryptocurrencies. Studying Bitcoin, Ripple, and Ethereum, they show that crypto returns are largely distinct from stocks, currencies, and precious metals, and are better explained by crypto-native drivers such as user adoption, network activity, investor attention, and trend effects.
Our summary: this paper is the bridge between “crypto as a weird asset” and “crypto as an asset class that deserves its own factor language.” The most important takeaway is not merely that crypto is different, but that the relevant predictors are different in a systematic way. That makes this paper foundational for any crypto momentum or factor research stack: it says you should not expect stock-market factors to do most of the explanatory work.
Key metrics: the paper reports a strong time-series momentum effect in major cryptocurrencies and finds that investor-attention proxies significantly forecast future returns, while exposures to most common stock, currency, commodity, and macro factors are weak or absent.
Common Risk Factors in Cryptocurrency#
Yukun Liu, Aleh Tsyvinski, and Xi Wu develop a cryptocurrency factor model for the cross-section of crypto returns. Starting from a large universe of coins from 2014 to 2018, they construct crypto analogues of stock-market price and market-based factors and test whether a small number of them can explain the cross-section of expected returns.
Our summary: this is the canonical crypto factor-model paper. The headline result is that a three-factor model built from cryptocurrency market, size, and momentum factors explains most of the successful zero-investment strategies in the sample. For research design, the important lesson is that crypto needs its own factor architecture: stock-market factor models do not price these returns well, but crypto-native market, size, and momentum factors go much further.
Key metrics: 9 of 25 candidate crypto factors generate statistically significant long-short excess returns. The paper reports weekly excess returns of about 2.7%, 3.3%, 4.1%, and 2.5% for one-, two-, three-, and four-week momentum strategies, and shows that a crypto market-size-momentum three-factor model explains the excess returns of all nine successful strategies. Momentum among above-median-size coins reaches 4.2% weekly versus 0.6% for below-median-size coins.
Bitcoin Intraday Time-Series Momentum#
Dehua Shen, Andrew Urquhart, and Pengfei Wang study whether Bitcoin exhibits an intraday time-series momentum effect despite trading continuously around the clock. Because Bitcoin has no natural exchange open and close like equities, the paper defines effective trading sessions using volume spikes and then tests whether returns early in the trading day predict returns later in the same day.
Our summary: this is the closest academic match for sub-daily BTC trend-following questions. The paper shows that intraday momentum in Bitcoin is not just a stylized-fact curiosity; it has practical trading content and seems strongest when market activity is elevated. Equally important, the authors test competing mechanisms and conclude the effect is more consistent with liquidity provision, disposition effects, and aversion to overnight risk than with a simple late-information story.
Key metrics: the first half-hour of Bitcoin trading significantly predicts the last half-hour in both in-sample and out-of-sample tests. The effect is strongest during sessions with the highest trading volume or volatility, and the authors report substantial economic gains from intraday momentum-based trading, especially during Bitcoin downturns.
Value and Momentum Everywhere#
Clifford S. Asness, Tobias J. Moskowitz, and Lasse H. Pedersen (Journal of Finance, 2013) study value and momentum jointly across eight diverse markets and asset classes. Instead of treating value and momentum as stock-specific anomalies, they show that both premia appear broadly across global equities, equity index futures, government bonds, currencies, and commodities, and that their returns exhibit a strong common cross-asset factor structure.
Our summary: this is the canonical paper for understanding momentum as a global style rather than a niche equity effect. The most important finding is the correlation structure: value strategies are positively related to one another across markets, momentum strategies are also positively related across markets, and value and momentum are negatively related to each other both within and across asset classes. That makes the paper foundational for multi-style portfolio construction, because it explains why combining value and momentum is so powerful and why studying them jointly reveals global risks that disappear when each is examined in isolation.
Key metrics: the paper reports consistent value and momentum premia across all eight markets and asset classes studied and documents a diversified global value-plus-momentum portfolio with a high Sharpe ratio. It also shows that value and momentum returns correlate more strongly across asset classes than passive market exposures themselves, supporting the case for common global style factors.
Cryptocurrencies and Momentum#
Klaus Grobys and Niranjan Sapkota (Economics Letters, 2019) test whether the classic momentum anomaly survives in a broad cryptocurrency universe. Using monthly data on 143 cryptocurrencies from 2014 to 2018, they examine both cross-sectional and time-series-style momentum specifications motivated by the traditional asset-pricing literature.
Our summary: this is one of the clean negative-result papers in the crypto momentum literature, which is exactly why it is useful. Earlier crypto studies often worked with small samples, short windows, or the largest coins only, making it easy to overstate the anomaly. Grobys and Sapkota show that once the universe broadens, the classic momentum signal becomes much less convincing and in some specifications even turns negative. That makes this paper an important counterweight to the more bullish crypto-momentum evidence and a useful reminder that sample design matters enormously in digital assets.
Key metrics: across the 2014-2018 sample, the authors report no statistically significant evidence of profitable traditional momentum payoffs. For the broad 143-coin universe, winner-minus-loser returns are close to zero and generally insignificant; some trimmed-sample specifications are negative rather than positive.
A Factor Model for Cryptocurrency Returns#
Daniele Bianchi and Mykola Babiak develop a latent-factor model for cryptocurrency returns using Instrumented Principal Component Analysis (IPCA). Rather than relying only on a small set of hand-crafted observable factors, the model extracts latent risk drivers from a large cross-section of cryptocurrency pairs while allowing factor loadings to vary with observable characteristics such as liquidity, size, reversal, and downside risk.
Our summary: this paper is the natural next step after the early crypto factor literature. Whereas Liu-Tsyvinski-Wu establish that crypto has its own factor structure, Bianchi and Babiak ask whether a more flexible latent-factor model can describe that structure better than fixed observable factors. The answer is yes: crypto returns appear to have a richer, time-varying risk architecture than simple bottom-up factor portfolios can capture. For practitioners, the paper is especially interesting because it frames crypto risk premia as a dynamic latent-state problem rather than a static factor-zoo exercise.
Key metrics: the paper reports total and predictive R^2 of 17.2% and 2.9% for individual daily returns under the IPCA model, versus 9.6% and -0.02% for a benchmark six-factor observable model. The main drivers of expected returns are liquidity, size, reversal, and both market and downside risks. The results remain robust across individual assets, characteristic-sorted portfolios, pre- and post-COVID subsamples, and weekly data.
Dynamic Trading with Predictable Returns and Transaction Costs#
Nicolae B. Gârleanu and Lasse H. Pedersen (NBER Working Paper 15205, 2009; Journal of Finance, 2013) derive a closed-form optimal portfolio strategy when security returns are predictable and trading is costly. The optimal policy rests on two principles: “aim in front of the target” and “trade partially towards the current aim.” The updated portfolio combines existing holdings with an aim portfolio that is itself a weighted average of the current Markowitz-optimal allocation and expected future allocations, with slower-decaying predictors receiving more weight.
Our summary: this is the canonical paper linking alpha signals, transaction costs, and optimal rebalancing in a tractable multi-period framework. Its most important insight is that with trading costs the optimal response to a signal is never to jump to the frictionless target portfolio; instead the investor trades partially toward a moving aim that already looks past the current signal and anticipates where the portfolio should be once the signal decays. That reframes position sizing as a control problem rather than a single-period optimization, and it is the standard reference for any modern execution-aware portfolio construction pipeline.
Key metrics: the authors derive the optimal trading rule in closed form and validate it on commodity futures, showing that the dynamic cost-aware policy produces better risk-adjusted returns than static Markowitz strategies that ignore transaction costs, and also than myopic strategies that account for costs but ignore predictor persistence.
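A stylized single-asset simulation makes the two principles concrete; the trading speed and persistence discount below are illustrative stand-ins for the closed-form coefficients the paper derives.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 1_000
gamma, sigma2 = 5.0, 0.02 ** 2     # risk aversion and per-period return variance
phi = 0.90                         # AR(1) persistence of the alpha signal
speed = 0.15                       # fraction of the gap traded each period (illustrative)

# Simulate a mean-reverting alpha signal (expected next-period excess return).
alpha = np.zeros(T)
for t in range(1, T):
    alpha[t] = phi * alpha[t - 1] + 0.002 * rng.standard_normal()

x = np.zeros(T)                    # position path
for t in range(1, T):
    markowitz = alpha[t] / (gamma * sigma2)   # frictionless single-period target
    # "Aim in front of the target": down-weight fast-decaying signals, since the
    # Markowitz position would be stale by the time it is reached. The paper
    # derives the exact discounting; using phi here is purely illustrative.
    aim = phi * markowitz
    # "Trade partially towards the current aim."
    x[t] = x[t - 1] + speed * (aim - x[t - 1])
```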
Incorporating Signals into Optimal Trading#
Charles-Albert Lehalle and Eyal Neuman (Finance and Stochastics, 2019) extend the optimal execution literature to include exogenous Markovian predictive signals alongside transient market impact. Building on the framework of Gatheral, Schied, and Slynko, they prove existence of an optimal strategy when a trader must liquidate a position while reacting to a signal that itself follows a Markov process, and they derive explicit solutions for the case of an Ornstein-Uhlenbeck signal. The paper also shows that the model reduces to the Cartea-Jaimungal framework in certain limits.
Our summary: this paper is the natural companion to Gârleanu-Pedersen on the execution side. Rather than treating the trading schedule and the signal as separate problems, Lehalle and Neuman show that once you admit a predictive signal plus market-impact costs, the optimal trading rule is inherently continuous and state-dependent: there is no clean cutoff between “execution” and “alpha capture.” It is the standard reference for anyone building signal-aware execution engines, and it explains why simple rules like “trade when signal > threshold” leave money on the table relative to a properly continuous controller.
Key metrics: the authors empirically validate the model on nine months of tick-by-tick data from 13 European stocks and show that order book imbalance has predictive power for future price moves and exhibits mean-reverting dynamics consistent with the Ornstein-Uhlenbeck assumption used in the theoretical solution.
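As a toy illustration of why the optimal rule is continuous rather than threshold-based, the sketch below simulates an Ornstein-Uhlenbeck signal and bends a TWAP-like liquidation schedule in proportion to it; this is a stylized heuristic, not the paper’s closed-form solution, and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
T, q0 = 100, 1.0                     # steps and initial inventory to liquidate
kappa, sigma_s = 0.10, 0.01          # OU mean-reversion speed and noise scale

# Ornstein-Uhlenbeck signal (e.g. order book imbalance) predicting short-term drift.
s = np.zeros(T)
for t in range(1, T):
    s[t] = s[t - 1] - kappa * s[t - 1] + sigma_s * rng.standard_normal()

q = np.empty(T)
q[0] = q0
k_sig = 5.0                          # how strongly the signal bends the schedule
for t in range(1, T):
    base = q[t - 1] / (T - t)        # TWAP-like rate on the remaining inventory
    # Sell slower when the signal predicts rising prices, faster when falling;
    # the response is continuous in the signal, with no entry threshold.
    q[t] = q[t - 1] - base + k_sig * s[t] * q0 / T
```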
On the Effect of Alpha Decay and Transaction Costs on the Multi-period Optimal Trading Strategy#
Chutian Ma and Paul Smith (2025) study portfolio optimization for a single asset with long and short positions where transaction costs make frequent rebalancing unattractive and historical signal values carry predictive content. To capture alpha decay explicitly, they frame the problem as an infinite-horizon Markov Decision Process and contribute a modified value iteration algorithm with a convergence proof, along with first-order approximations and asymptotic expansions for the optimal policy under small transaction costs.
Our summary: this is a recent refinement of the Gârleanu-Pedersen line of work, shifting the frame from closed-form linear-Gaussian solutions to dynamic programming under signal decay. The contribution is less a new economic insight than a set of rigorous tools for computing and approximating the optimal policy when the signal process is explicit. For practitioners this matters because alpha decay is almost always the binding constraint in post-cost portfolio design, and the asymptotic formulas let you reason about how aggressively to trade as a function of the decay rate without simulating the full MDP.
Key metrics: the paper establishes convergence of the modified value iteration scheme and derives first-order approximations valid in the small-cost regime, characterising the optimal trading policy in closed-form asymptotics rather than reporting live backtest numbers.
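A plain value-iteration sketch of this kind of MDP is below; the grids, signal law, and cost numbers are illustrative, and the paper’s modified algorithm and asymptotic expansions are not reproduced.

```python
import numpy as np

signals = np.linspace(-0.02, 0.02, 21)     # discretized alpha signal grid
positions = np.linspace(-1.0, 1.0, 21)     # discretized holdings grid
phi, cost, beta = 0.90, 0.001, 0.99        # signal decay, cost per unit traded, discount

# AR(1) signal transition matrix via Monte Carlo projection onto the grid.
rng = np.random.default_rng(0)
P = np.zeros((len(signals), len(signals)))
for i, s in enumerate(signals):
    nxt = phi * s + rng.normal(0.0, 0.005, size=4000)
    idx = np.abs(nxt[:, None] - signals[None, :]).argmin(axis=1)
    P[i] = np.bincount(idx, minlength=len(signals)) / len(nxt)

# Value iteration on V(signal, position); the action is the new grid position.
V = np.zeros((len(signals), len(positions)))
for _ in range(500):
    EV = P @ V                             # expected continuation value over signals
    V_new = np.empty_like(V)
    for j in range(len(positions)):
        # one-step reward: expected P&L minus proportional cost of the trade
        reward = (signals[:, None] * positions[None, :]
                  - cost * np.abs(positions[None, :] - positions[j]))
        V_new[:, j] = (reward + beta * EV).max(axis=1)
    diff = np.max(np.abs(V_new - V))
    V = V_new
    if diff < 1e-10:
        break
```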
The Kelly Criterion and the Stock Market#
Edward O. Thorp presents a practitioner-oriented introduction to the Kelly criterion and its application to portfolio sizing in the stock market. The paper walks through the derivation of the log-optimal betting fraction, extends it from discrete bets to continuous-time and multi-asset settings, and discusses practical concerns such as finite horizons, drawdown behavior, and the effect of estimation error on the optimal leverage.
Our summary: this is the classic starting point for conviction-based sizing. Thorp’s pedagogical goal is to demystify why maximizing expected log wealth is the right long-run objective under very mild assumptions, and to show how the same logic applies whether you are sizing a blackjack bet or a long-short equity book. The most important lesson for a modern quant reader is not the headline formula but the discussion of fractional Kelly: even Thorp, who built his career on log-optimal sizing, argues that leverage below full Kelly is almost always the right practical choice once you factor in estimation error, non-normal returns, and drawdown tolerance.
Key metrics: the paper develops the Kelly formula and its fractional-Kelly variants analytically rather than from a backtest, and illustrates with worked examples how full-Kelly leverage interacts with expected growth rate, variance, and drawdown for realistic stock-market parameters.
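The core formulas are compact enough to state directly; the inputs below are illustrative, not Thorp’s.

```python
import numpy as np

# 1) Discrete bet: win probability p, odds b-to-1.
p, b = 0.55, 1.0
f_discrete = (p * (b + 1) - 1) / b          # = p - (1 - p)/b = 0.10 here

# 2) Continuous approximation for a stock with excess drift mu, volatility sigma.
mu, sigma = 0.06, 0.20
f_full = mu / sigma ** 2                    # = 1.5x leverage at these inputs

# Fractional Kelly: half-Kelly keeps ~75% of the asymptotic growth rate at half
# the volatility, which is why Thorp recommends betting below full Kelly.
f_half = 0.5 * f_full

# Expected log-growth under the lognormal approximation.
g = lambda f: f * mu - 0.5 * f ** 2 * sigma ** 2
print(f"full: f={f_full:.2f}, g={g(f_full):.4f}; half: f={f_half:.2f}, g={g(f_half):.4f}")
```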
Portfolio Choice and the Bayesian Kelly Criterion#
Sid Browne and Ward Whitt (Advances in Applied Probability, 1996) extend the Kelly log-optimal framework to a Bayesian setting where the investor is uncertain not only about future outcomes but also about the parameters of the return-generating process. They derive the optimal investment policy when the investor updates beliefs about unknown parameters over time and characterize how posterior uncertainty should translate into conservative deviations from the full-information Kelly bet.
Our summary: this paper is the formal bridge between Kelly sizing and parameter uncertainty. Its most important contribution is showing that leverage should reflect what you have actually learned, not just your current point estimate of edge — posterior variance directly dampens the optimal fraction invested. That is exactly the economic intuition behind modern fractional-Kelly practice, and it remains the cleanest reference for anyone who wants to justify under-betting relative to plug-in Kelly without appealing to heuristic safety margins.
Key metrics: the paper is analytical rather than empirical and characterizes the optimal Bayesian Kelly policy in closed form for canonical prior-likelihood pairs, showing that full-information Kelly is recovered only in the limit of infinite learning and that finite-sample optimal bets are strictly smaller.
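A minimal sketch of the mechanism in a normal-normal conjugate setting (a simplification, not Browne-Whitt’s exact policy): posterior uncertainty about the drift inflates the predictive variance, which shrinks the bet until enough data has arrived.

```python
import numpy as np

sigma = 0.20                       # known return volatility
mu0, tau0 = 0.0, 0.10              # prior mean and prior std of the unknown drift

def bayes_kelly(returns):
    n = len(returns)
    # Normal-normal posterior for the unknown mean, known variance sigma^2.
    post_var = 1.0 / (1.0 / tau0 ** 2 + n / sigma ** 2)
    post_mean = post_var * (mu0 / tau0 ** 2 + returns.sum() / sigma ** 2)
    # Bet against the posterior *predictive* variance, not the plug-in variance.
    return post_mean / (sigma ** 2 + post_var)

rng = np.random.default_rng(0)
true_mu = 0.06
for n in (10, 100, 1000):
    r = true_mu + sigma * rng.standard_normal(n)
    print(n, bayes_kelly(r), r.mean() / sigma ** 2)   # Bayesian vs plug-in Kelly
```

As the sample grows, the posterior variance shrinks and the Bayesian fraction converges to the plug-in Kelly bet, mirroring the paper’s infinite-learning limit.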
Leverage and Uncertainty#
Mihail Turlakov (2016) studies how uncertainty beyond conventional risk should shape leverage decisions. The paper derives a fractional Kelly criterion for a single asset when returns have fat tails driven by rare, hard-to-quantify events, and then extends the analysis to multi-asset portfolios, showing how the Kelly lens provides a sharper interpretation of Risk Parity style allocations once fat-tailed asset classes are included.
Our summary: this paper motivates fractional Kelly as the natural response to model uncertainty rather than as an ad hoc safety factor. The key idea is that under fat tails and Knightian uncertainty, the optimal leverage falls well below plug-in Kelly, and the size of the haircut can be linked explicitly to the tail exponent and the degree of model confusion. For practitioners it is a useful complement to Browne-Whitt: Browne-Whitt justifies under-betting via posterior variance in a clean Bayesian world, while Turlakov justifies under-betting via tails and uncertainty in a dirtier, more realistic one.
Key metrics: the paper is primarily analytical and derives fractional-Kelly formulas under fat-tailed distributions and multi-asset Risk Parity-style constructions, rather than reporting empirical Sharpe or drawdown numbers.
Wasserstein-Kelly Portfolios#
Jonathan Yu-Meng Li (2023) proposes a robust version of Kelly portfolio optimization based on Wasserstein distributionally robust optimization. The investor chooses a log-optimal portfolio against the worst-case distribution within a Wasserstein ball around the empirical return distribution, turning the Kelly problem into a tractable convex program that is computationally efficient while explicitly addressing estimation error in the input distribution.
Our summary: this paper is a practical answer to the biggest weakness of Kelly sizing — sensitivity to the input distribution. Rather than asking the user to pick a fractional-Kelly multiplier or a tail exponent by hand, the Wasserstein-Kelly formulation lets the radius of the ambiguity ball do that work automatically, and the resulting program remains convex. It is a good modern reference for anyone who wants to keep the growth-optimal framing of Kelly but cannot accept how aggressively plug-in Kelly bets on noisy estimates.
Key metrics: empirical tests on financial data show that the Wasserstein-Kelly portfolio outperforms the traditional plug-in Kelly portfolio in out-of-sample evaluation across multiple metrics and delivers noticeably more stable allocations over time.
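The sketch below is schematic only: it keeps the empirical log-growth objective and lets a norm penalty stand in for the Wasserstein ambiguity term, mimicking the robustness-shrinks-bets behaviour without reproducing the paper’s exact convex reformulation; `eps` and the long-only constraint are illustrative.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
# Simulated (T, n) return panel standing in for historical data.
R = rng.multivariate_normal([0.006, 0.004, 0.002],
                            np.diag([0.05, 0.04, 0.03]) ** 2, size=500)

eps = 0.5                                  # "radius": larger => more conservative
w = cp.Variable(R.shape[1])
growth = cp.sum(cp.log(1 + R @ w)) / R.shape[0]   # empirical log-growth (Kelly)
prob = cp.Problem(cp.Maximize(growth - eps * cp.norm(w, 2)),
                  [cp.sum(w) <= 1, w >= 0])       # long-only, no leverage
prob.solve()
print(w.value)
```

Setting `eps = 0` recovers the plug-in Kelly portfolio; increasing it shrinks the weights, which is the qualitative behaviour the paper obtains from the Wasserstein ball.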
Volatility-Managed Portfolios#
Alan Moreira and Tyler Muir (Journal of Finance, 2017) show that managed portfolios which scale down exposure when volatility is high and scale up when volatility is low produce large alphas, higher Sharpe ratios, and large utility gains for mean-variance investors. They document the effect across the market factor, value, momentum, profitability, return on equity, investment, and betting-against-beta factors, as well as the currency carry trade. The mechanism is simple: changes in volatility are not offset by proportional changes in expected returns, so volatility timing adds value even without forecasting returns.
Our summary: this is the canonical reference for the idea that cutting risk in high-volatility regimes improves risk-adjusted performance almost universally across equity and macro factors. What makes the result powerful is its breadth — the same volatility-scaling overlay helps nearly every standard factor portfolio. For practitioners, the paper justifies the common CTA and risk-parity practice of volatility targeting as an ex-post optimal behaviour, and it is the natural companion to the Kim-Tse-Wald paper on TSMOM and volatility scaling, which asks how much of trend-following alpha is really the vol-scaling overlay rather than the direction signal.
Key metrics: volatility-managed versions of the market, value, momentum, and other factor portfolios produce statistically significant alphas relative to their unmanaged counterparts, with material improvements in Sharpe ratios. The effect is robust across the full sample (1926–2015 for equity factors) and across international markets and asset classes.
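The overlay itself is short enough to sketch; here it is in pandas, using a rolling daily variance as an approximation of the paper’s previous-month realized variance, with the scaling constant chosen to match unconditional volatilities.

```python
import pandas as pd

def vol_managed(factor: pd.Series, window: int = 21) -> pd.Series:
    """Moreira-Muir style overlay: scale a factor by the inverse of its lagged
    realized variance, normalized so both series share the same full-sample
    volatility (their constant c)."""
    rv = factor.rolling(window).var().shift(1)   # lagged realized variance
    scaled = factor / rv
    c = factor.std() / scaled.std()              # match unconditional volatility
    return c * scaled

# Usage with a hypothetical daily factor return series `f`:
# f_managed = vol_managed(f)
# Alpha test: regress f_managed on f; the intercept is the volatility-timing alpha.
```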
Dynamic Time Series Momentum of Cryptocurrencies#
Oliver Borgards (North American Journal of Economics and Finance, 2021) investigates momentum effects across twenty cryptocurrencies relative to US equities using a dynamic modelling framework that evaluates momentum periods following formation periods at both daily and intraday frequencies. The paper documents that formation periods frequently precede momentum periods across all time scales, and that cryptocurrencies exhibit notably stronger and longer momentum periods than equities.
Our summary: this paper provides strong evidence that time-series momentum in crypto is not only present but qualitatively different from the equity version — stronger, longer-lasting, and more exploitable. The author attributes the amplification to the difficulty of valuing intrinsic worth in crypto assets, which attracts more noise traders and delays mean-reversion. A momentum-based trading strategy outperforms buy-and-hold for both asset classes, but only the cryptocurrency strategy delivers superior risk-adjusted returns and diminished downside exposure. The identification of critical price thresholds where volatility spikes temporarily and triggers directional price movements is a useful practical detail for signal design.
Key metrics: the paper reports that momentum strategies outperform buy-and-hold for both cryptocurrencies and the stock market index, but only crypto momentum strategies generate higher risk-adjusted returns and lower downside exposure than a passive investment. Formation periods are followed by momentum periods at both daily and intraday frequencies across all twenty cryptocurrencies studied.
A Trend Factor for the Cross Section of Cryptocurrency Returns#
Christian Fieberg, Gerrit Liedtke, Thorsten Poddig, Thomas Walker, and Adam Zaremba (Journal of Financial and Quantitative Analysis) propose CTREND, a new trend factor for cryptocurrency returns that aggregates price and volume information across different time horizons. Using data on more than 3,000 coins and machine learning methods to exploit information from various technical indicators, they show that the resulting signal reliably predicts cryptocurrency returns, cannot be subsumed by known factors, and remains robust across different subperiods, market states, and alternative research designs.
Our summary: this is one of the most thorough cross-sectional crypto trend studies to date. The key practical finding is that the signal survives transaction costs and persists in big and liquid coins — answering the perennial concern that crypto momentum results depend on trading micro-caps that cannot actually be traded. By using ML to combine multiple technical indicators across horizons rather than relying on a single lookback, the paper also demonstrates that the crypto trend premium has a richer information structure than simple price momentum. An asset pricing model that incorporates CTREND outperforms competing factor models, making it a strong candidate for any crypto factor framework.
Key metrics: the CTREND factor generates statistically significant returns that survive transaction costs and remain robust in large and liquid coins. The effect persists across different subperiods and market states, and an asset pricing model incorporating CTREND provides a superior explanation of cryptocurrency returns compared to competing factor models.
Quantitative Evaluation of Volatility-Adaptive Trend-Following Models in Cryptocurrency Markets#
Ioannis Karassavidis, Lampros Kateris, and Maximos Ioannidis (SSRN, 2025) present a trend-following framework that adapts to volatility in high-frequency cryptocurrency markets, focusing on Bitcoin and Ethereum. The proposed system uses multi-horizon moving averages to detect trends, employs a multi-timeframe RSI filter to confirm momentum, and uses ATR scaling to manage risk based on current volatility, with trade exits combining ATR-based trailing stops and EMA slope reversal detection.
Our summary: this paper is notable for how closely its architecture maps to the kind of volatility-adaptive trend system that practitioners actually build. The combination of moving-average trend detection, RSI momentum confirmation, and ATR-based position sizing and exits is a design pattern that shows up repeatedly in production crypto trend-following systems. The paper evaluates these components quantitatively on BTC and ETH, addressing the extreme volatility, heavy-tailed return distributions, and structural inefficiencies that make crypto a natural but challenging test bed for momentum-based strategies.
Key metrics: the paper provides quantitative evaluation of the volatility-adaptive trend-following model on BTC and ETH, reporting performance across different configurations of the moving-average, RSI, and ATR components under realistic market conditions.
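A compact pandas sketch of that design pattern, with illustrative parameters; the ATR trailing-stop and EMA slope-reversal exit logic is omitted for brevity.

```python
import pandas as pd

def atr(df: pd.DataFrame, n: int = 14) -> pd.Series:
    """Average true range from OHLC bars."""
    hl = df["high"] - df["low"]
    hc = (df["high"] - df["close"].shift()).abs()
    lc = (df["low"] - df["close"].shift()).abs()
    tr = pd.concat([hl, hc, lc], axis=1).max(axis=1)
    return tr.ewm(span=n, adjust=False).mean()

def rsi(close: pd.Series, n: int = 14) -> pd.Series:
    d = close.diff()
    up = d.clip(lower=0).ewm(span=n, adjust=False).mean()
    dn = (-d.clip(upper=0)).ewm(span=n, adjust=False).mean()
    return 100 - 100 / (1 + up / dn)

def signal(df: pd.DataFrame, target_vol: float = 0.02) -> pd.Series:
    c = df["close"]
    # Multi-horizon moving-average trend detection.
    trend = c.ewm(span=20, adjust=False).mean() > c.ewm(span=100, adjust=False).mean()
    confirm = rsi(c) > 50                            # momentum confirmation
    size = (target_vol * c / atr(df)).clip(upper=3)  # ATR-based risk scaling, capped
    return trend.astype(float) * confirm.astype(float) * size
```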
Demystifying Managed Futures#
Brian Hurst, Yao Hua Ooi, and Lasse Heje Pedersen decompose managed-futures returns to show that the category’s long-run performance can be explained to a surprisingly large degree by a simple, systematic time-series momentum rule applied across major asset classes. The paper reframes managed futures not as an opaque collection of discretionary macro bets, but as a transparent and largely replicable exposure to trend following plus a small amount of implementation-specific variation.
Our summary: this is one of the most useful bridge papers between academic TSMOM and practitioner CTA portfolios. Its importance is not that it proposes a new signal, but that it explains what managed futures funds have historically been doing in a way that can be tested, replicated, and compared with academic trend strategies. If you want to connect the “trend following works” literature to actual fund returns, this is one of the key references.
Key metrics: the paper reports that a simple trend-following strategy across 58 liquid futures and forwards explains a large share of managed-futures industry returns and reproduces much of the category’s diversification and crisis-performance characteristics. The replication is especially strong during major equity drawdowns, supporting the claim that trend following is a core driver of managed-futures convexity.
Rethinking Trend Following: Optimal Regime-Dependent Allocation#
Valeriy Zakamulin proposes a theoretical and empirical framework for trend following in which detected market regimes are treated as conditioning information and regime-specific exposures are chosen by directly maximizing the unconditional Sharpe ratio. Instead of taking standard time-series momentum and dynamic-speed momentum rules as fixed portfolio policies, the paper separates regime detection from position sizing and studies whether Sharpe-optimal allocation across Bull, Bear, Correction, and Rebound states improves out-of-sample performance. The empirical tests cover U.S. equity portfolios, 14 international equity markets, and 18 diversified portfolio datasets from Kenneth French’s data library.
Our summary: the key contribution here is not a new momentum signal but a clean argument that much of the literature hard-codes the economically important part of the strategy: how much to hold once a regime is identified. Zakamulin shows that canonical trend-following rules often overcommit to symmetric long/short exposures, especially full short positions in bear markets, and that much of the performance improvement comes from letting the exposure map itself be estimated from data. That makes this paper especially relevant for practitioners who already have regime labels or momentum signals and want a principled way to translate them into portfolio weights.
Key metrics: the paper reports annualized out-of-sample Sharpe ratios rather than profit, drawdown, or win-rate statistics. On the total U.S. equity market, the two-regime optimal-allocation strategy improves Sharpe from 0.494 to 0.727 over 2004-2025, while the four-regime version improves Sharpe from 0.507 to 0.735 versus Dynamic Speed Momentum. Across 14 international markets, average Sharpe rises from 0.054 to 0.295 in the two-regime comparison and from 0.192 to 0.319 in the four-regime comparison. Across 18 diversified Kenneth French portfolio datasets, average Sharpe improves from 0.208 to 0.506 in the two-regime case and from 0.496 to 0.628 in the four-regime case, with estimated bear-regime weights typically close to zero rather than -1.
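The estimation idea reduces to a small optimization once regime labels exist; a minimal sketch with scipy, assuming labels and excess returns are given and using illustrative exposure bounds.

```python
import numpy as np
from scipy.optimize import minimize

def fit_regime_weights(returns, regimes, n_regimes):
    """returns: (T,) market excess returns; regimes: (T,) int labels in [0, n_regimes).

    Chooses one exposure per regime to maximize the *unconditional* Sharpe ratio
    of the regime-switched strategy, as in Zakamulin's framing."""
    def neg_sharpe(w):
        strat = w[regimes] * returns          # exposure depends on the detected regime
        return -strat.mean() / strat.std()
    res = minimize(neg_sharpe, x0=np.ones(n_regimes),
                   bounds=[(-1.0, 2.0)] * n_regimes, method="L-BFGS-B")
    return res.x

# With two regimes (Bull=0, Bear=1), the paper finds estimated bear-regime
# weights near 0 rather than the -1 hard-coded by symmetric long/short rules.
```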
Realized Volatility Forecasting with Neural Networks#
Andrea Bucci compares feed-forward and recurrent neural-network architectures for realized-volatility forecasting, with particular emphasis on LSTM and NARX models against more traditional econometric baselines. The paper argues that volatility is a natural fit for neural methods because the target exhibits both long memory and nonlinear dependence, and evaluates whether those properties can be captured more effectively by recurrent architectures than by classical linear or semiparametric alternatives.
Our summary: this is one of the cleaner early references showing that neural nets can be genuinely useful for volatility forecasting rather than just fashionable. The important point is not simply that an LSTM beats a benchmark once, but that architectures designed to retain long-range temporal information consistently help when the volatility process becomes more unstable. If you want a compact bridge from the HAR/GARCH world into sequence models, this is a good starting point.
Key metrics: the paper reports that recurrent neural networks outperform the traditional econometric competitors in realized-volatility forecasting, with LSTM and NARX delivering the strongest results. The advantage is especially visible in highly volatile periods, where modelling long-range dependence improves forecast accuracy further.
A Machine Learning Approach to Volatility Forecasting#
Kim Christensen, Mathias Siggaard, and Bezirgen Veliyev study realized-variance forecasting for Dow Jones Industrial Average constituents using a broad set of machine-learning methods, including regularized regressions, tree models, and neural networks, benchmarked against the HAR family. A key design choice is that the ML models are not given an unfair advantage through exhaustive tuning: the authors deliberately keep implementation simple and then ask whether generic ML still improves materially on the standard realized-volatility toolkit.
Our summary: the paper is useful because it shows that the gains from ML do not have to come from exotic feature sets or aggressive optimisation. Even with only daily, weekly, and monthly realized-variance lags, ML competes strongly and often wins, which suggests the main edge comes from learning a richer persistence structure than linear HAR models can express. The variable-importance discussion is also valuable because it pushes the paper beyond black-box forecasting into interpretation.
Key metrics: the authors report that ML beats the HAR lineage even when the predictor set is restricted to daily, weekly, and monthly realized-variance lags, and that the gains become larger at longer forecast horizons. They also find that ML is better at extracting incremental predictive information when richer predictor sets are introduced.
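The HAR baseline the paper benchmarks against fits in a few lines, and the ML comparison reuses exactly the same three predictors; a sketch with statsmodels.

```python
import pandas as pd
import statsmodels.api as sm

def har_forecast(rv: pd.Series):
    """Fit HAR-RV: regress next-day realized variance on daily, weekly (5d),
    and monthly (22d) rolling averages of past realized variance."""
    X = pd.DataFrame({
        "rv_d": rv,
        "rv_w": rv.rolling(5).mean(),
        "rv_m": rv.rolling(22).mean(),
    })
    data = pd.concat([rv.shift(-1).rename("y"), X], axis=1).dropna()
    return sm.OLS(data["y"], sm.add_constant(data[["rv_d", "rv_w", "rv_m"]])).fit()

# The ML comparison feeds the same three lags to, e.g., a gradient-boosted tree,
# which can pick up nonlinear persistence that the linear HAR cannot express:
# from sklearn.ensemble import GradientBoostingRegressor
# GradientBoostingRegressor().fit(data[["rv_d", "rv_w", "rv_m"]], data["y"])
```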
Volatility Forecasting with Machine Learning and Intraday Commonality#
Chao Zhang, Yihuang Zhang, Mihai Cucuringu, and Zhongmin Qian forecast intraday realized volatility by pooling information across stocks and explicitly exploiting commonality in intraday volatility together with a market-volatility proxy. The paper moves beyond single-series forecasting and tests whether cross-sectional structure can be used to train models that generalize not just within a stock, but across previously unseen names as well.
Our summary: this is one of the more practically interesting volatility papers because it treats volatility as a partially shared phenomenon rather than a completely separate process for each asset. That framing matters in production settings where you want a single model to transfer across a universe instead of maintaining one bespoke forecaster per stock. The additional intraday-to-daily setup is also useful if you care about pre-trade risk or transaction-cost analysis rather than only end-of-day portfolio reporting.
Key metrics: the paper reports that neural networks outperform linear regressions and tree-based models for intraday realized-volatility forecasting and that the gains remain robust when applied to stocks absent from the training set. The proposed methodology also beats strong out-of-sample baselines that rely only on past daily realized volatility.
HARNet: A Convolutional Neural Network for Realized Volatility Forecasting#
Rafael Reisenhofer, Xandro Bayer, and Nikolaus Hautsch introduce HARNet, a convolutional architecture designed as a structured deep-learning extension of the classic HAR model. The model uses a hierarchy of dilated convolutions so the receptive field grows quickly without a comparable blow-up in parameters, and it is initialized so that before training it reproduces the corresponding HAR forecast exactly, making the transition from econometric baseline to neural refinement unusually transparent.
Our summary: HARNet is compelling because it does not ask practitioners to abandon the HAR intuition that has worked well for realized-volatility forecasting. Instead it starts from HAR, then adds learnable nonlinear filters on top of the same multi-horizon logic. That makes it one of the better examples of a finance-aware deep architecture: the gains come from extending a sensible domain prior rather than replacing it with an unconstrained network.
Key metrics: across three stock-market indexes, the paper shows that HARNet improves meaningfully on its HAR baselines in realized-volatility forecasting. The authors also report that training with a QLIKE loss materially stabilizes optimisation, and their filter analysis finds that yesterday’s volatility contributes far more than older observations, with relevance decaying roughly linearly across prior weeks.
Code and data: the authors provide an official TensorFlow implementation at GitHub, including preset experiment configurations and a sample dataset (MAN_data.csv) sourced from the Oxford-Man Institute, so the paper is one of the easier volatility studies here to reproduce or extend.
DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions#
Fernando Moreno-Pino and Stefan Zohren propose DeepVol, a dilated-causal-convolution model that forecasts next-day volatility directly from raw high-frequency intraday data rather than relying on precomputed realized measures alone. The paper is motivated by the idea that once the data are compressed into handcrafted realized-volatility features, some predictive information may already have been discarded, especially in the shape and timing of intraday volatility bursts.
Our summary: this is the closest match to the “raw input” version of volatility forecasting. The main contribution is not just a different architecture, but the end-to-end stance that the model should learn directly from intraday sequences instead of from manually aggregated volatility proxies. That makes DeepVol especially relevant if you want to test whether representation learning can substitute for feature engineering in volatility work.
Key metrics: using two years of NASDAQ-100 intraday data, the paper reports that DeepVol delivers more accurate day-ahead volatility forecasts and more accurate risk measures than traditional benchmarks. The authors also find that a one-day receptive field with 5-minute sampling gives the best overall forecasting accuracy in their setup, while longer receptive fields make forecasts more conservative.
Comparing Deep Learning Models for the Task of Volatility Prediction Using Multivariate Data#
Wenbo Ge, Pooia Lalbakhsh, Leigh Isai, Artem Lensky, and Hanna Suominen compare several deep-learning forecasters for multivariate volatility prediction across five assets: the S&P 500, NASDAQ 100, gold, silver, and oil. The paper benchmarks multilayer perceptrons, recurrent networks, temporal convolutional networks, and the Temporal Fusion Transformer against GARCH-style baselines in a common forecasting setup, with the goal of understanding whether newer sequence architectures actually justify their added complexity.
Our summary: the value of this paper is comparative rather than methodological. It is useful when you already accept that deep models can forecast volatility and instead want to know which class of architecture tends to work best once you move from univariate to multivariate inputs. In that sense it serves as a practical model-selection paper, and it aligns with a broader recent pattern in time-series forecasting where TFT and TCN-style models often dominate plain recurrent baselines.
Key metrics: the study compares volatility forecasts for five major financial assets and reports that the Temporal Fusion Transformer delivers the strongest overall performance among the tested deep-learning models, with temporal convolutional architectures typically next best. The paper frames these gains in terms of improved forecast accuracy relative to both GARCH baselines and simpler neural alternatives.
Introducing NBEATSx to realized volatility forecasting#
Hugo Gobato Souto and Amir Moradi apply NBEATSx, a neural basis expansion model with exogenous variables, to daily stock realized-volatility forecasting over multiple horizons. They compare the architecture with LSTM, TCN, HAR, GARCH, and GJR-GARCH models across six stock indexes, three error measures, four statistical tests, and several robustness checks, making the paper unusually systematic for a single-model introduction.
Our summary: this is one of the stronger “modern forecasting architecture” papers in the volatility literature because the evaluation is broad and the results are reported in both accuracy and robustness terms. NBEATSx is attractive here not simply because it is newer than LSTM or TCN, but because its basis-expansion structure appears to work well when volatility persistence and exogenous effects both matter. The caveat from the paper is also useful: the gains are clearer in developed-market indexes than in developing-market ones.
Key metrics: across six stock indexes, NBEATSx produces statistically more accurate and more robust forecasts than the competing models. On average, the paper reports 13% and 8% better accuracy for medium- and long-term forecasts, plus robustness improvements of 43%, 60%, and 59% for short-, medium-, and long-term horizons respectively.
The Hybrid Forecast of S&P 500 Volatility ensembled from VIX, GARCH and LSTM models#
Natalia Roszyk and Robert Ślepaczuk compare four approaches to forecasting S&P 500 volatility: a standalone GARCH model, a standalone LSTM, a hybrid LSTM-GARCH specification, and a hybrid model that also incorporates the VIX. The paper uses daily S&P 500 and VIX data from January 3, 2000 through December 21, 2023 and asks whether combining a classical volatility model with a neural sequence model and an implied-volatility signal produces a more useful risk forecast.
Our summary: the paper is a good example of the hybrid direction in current volatility research. Rather than framing ML and econometrics as mutually exclusive choices, it treats them as complementary components: GARCH supplies a disciplined volatility structure, LSTM adds nonlinear sequence modelling, and VIX contributes a market-implied state variable. For practitioners, that is often a more realistic setup than trying to replace the entire econometric stack with a pure neural model.
Key metrics: over the 2000-2023 sample, the authors report that the hybrid LSTM models outperform the standalone GARCH benchmark, and that adding VIX further improves forecasting performance beyond the plain LSTM-GARCH combination. The comparison is based on one-step-ahead daily volatility forecasts over a long sample containing both calm and stressed market regimes.
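A sketch of the hybrid wiring, assuming the `arch` package for the GARCH leg and PyTorch for the LSTM; the layer sizes and feature set here are illustrative choices, not the paper’s exact specification.

```python
import numpy as np
import torch
import torch.nn as nn
from arch import arch_model

def garch_feature(returns_pct: np.ndarray) -> np.ndarray:
    """Fit GARCH(1,1) on percent returns; return the conditional volatility path."""
    res = arch_model(returns_pct, vol="GARCH", p=1, q=1).fit(disp="off")
    return np.asarray(res.conditional_volatility)

class VolLSTM(nn.Module):
    def __init__(self, n_features: int = 3, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):              # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # forecast from the last hidden state

# Daily feature vector: [return, GARCH conditional vol, VIX]. Rolling windows
# (e.g. 22 days) are stacked into (batch, 22, 3) tensors, and the network is
# trained with MSE against next-day realized volatility.
```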
Forecasting realized volatility in turbulent times using temporal fusion transformers#
Johannes Frank studies Temporal Fusion Transformers for realized-volatility forecasting on S&P 500 stocks during volatile market periods. The paper compares TFT forecasts with long short-term memory networks, random forests, and GARCH-style models, using weekly and monthly data, multiple feature sets, and different pooling approaches including sectoral pooling.
Our summary: this is one of the cleanest finance-specific TFT volatility papers because it asks the model-selection question directly: does TFT add value over established volatility models and simpler machine-learning alternatives? The most useful design detail is pooling. Instead of training each stock entirely separately, the paper shows that shared information across groups, especially sectors, can improve the volatility forecasts produced by ML models.
Key metrics: the paper finds that TFT performs very well for financial volatility forecasting and outperforms LSTM and random forest models when pooling methods are used. Sectoral pooling substantially improves predictive performance across the machine-learning approaches. The paper focuses on forecast accuracy rather than a trading strategy, so it does not report Sharpe ratio, annualized return, or maximum drawdown.
Code and data: the paper is available as FAU Discussion Paper in Economics 03/2023 through RePEc and EconStor. A public code package was not identified during this pass.
A GARCH-temporal fusion transformer model for the volatility prediction of exchange traded funds#
Lorenzo Petrosino, Luca Bacco, Giuliano Salvati, Mario Merone, and Marco Papi propose a hybrid volatility model that combines traditional GARCH econometrics with a Temporal Fusion Transformer. The study forecasts volatility for exchange traded funds composed of S&P 500 assets across sectors, using historical volatility and Garman-Klass volatility as target proxies.
Our summary: this is the strongest “econometrics plus TFT” volatility reference in the current batch. Instead of treating deep learning and GARCH as substitutes, it uses GARCH-derived structure alongside TFT’s multi-horizon covariate modeling. That makes it especially relevant for practitioners who already trust GARCH-style volatility pipelines but want to test whether attention-based neural forecasting can improve nonlinear and cross-feature behavior.
Key metrics: the paper reports that the hybrid GARCH-TFT model significantly outperforms alternative models when forecasting the Garman-Klass proxy and performs comparably to standalone TFT for historical volatility. The dataset covers January 1, 2002 to October 31, 2023 and is sourced from Yahoo Finance. The article evaluates forecast accuracy rather than a trading strategy, so it does not report Sharpe ratio, annualized return, or maximum drawdown.
Code and data: the authors cite a public companion repository at GitHub. The article page states that the market data can be obtained using Yahoo Finance.
Bimodality Everywhere: International Evidence of Deep Momentum#
Chulwoo Han and Chang Qin examine the prevalence of bimodality in momentum returns across 45 countries and evaluate Deep Momentum, a machine learning strategy designed to exploit the bimodal structure of momentum return distributions. The study finds that momentum portfolios do not produce normally distributed returns: past winners either soar or crash, with very few outcomes in between. This bimodal pattern is documented across a wide range of equity markets, is persistent over time, and is negatively correlated with momentum profits, individualism, and disposition effects, while positively correlated with turnover and volatility.
Our summary: this paper connects the well-known momentum crash literature to a concrete distributional explanation. Bimodality is not just a curiosity—it is a systematic property of momentum returns that varies across countries, and that variation predicts where traditional momentum will struggle. The Deep Momentum strategy, which conditions on the shape of the return distribution rather than just past returns, consistently outperforms both standard momentum and naïve machine learning approaches. The finding that a global model pooled across all 45 countries outperforms country-specific models is also practically important: it suggests that sharing information across markets is worth the complexity, particularly for countries with shorter data histories.
Data and code: the study covers equity markets across 45 countries. No public code repository was identified at the time of writing. The paper is available on SSRN.
Key metrics: Deep Momentum outperforms traditional momentum and naïve machine learning strategies consistently across the 45-country sample. Outperformance is strongest in countries with higher bimodality, larger sample sizes, and higher volatility. The global pooled model beats country-specific models for both global and country-level portfolio construction. Results are presented country-by-country rather than as a single aggregate Sharpe ratio or annualized return figure.
Mentioned by Ivan Blanco (@iblanco_finance) in this discussion: “Momentum stocks don’t just win or lose, they polarize. The distribution shape tells you more than the mean.”
The role of implied volatility in forecasting future realized volatility and jumps in foreign exchange, stock, and bond markets#
Thomas Busch, Bent Jesper Christensen, and Morten Ørregaard Nielsen examine whether model-free implied volatility derived from options prices adds forecasting information beyond the realized continuous and jump variation components already used in HAR-type models. The study covers three major asset classes—foreign exchange (EUR/USD, JPY/USD, GBP/USD), equity (S&P 500), and fixed income (30-year US Treasury bond)—and uses high-frequency intraday data to construct realized volatility, realized bipower variation, and the jump component.
Our summary: this is the canonical reference for HAR-X models that incorporate implied volatility. The central result is that implied volatility contains information about future realized volatility that is incremental to the continuous and jump components of past realized volatility, and this holds consistently across FX, equity, and bond markets. The paper also shows that implied volatility is informative about future jump activity, not just total realized variance. For practitioners building HAR-based volatility forecasters, this paper provides the direct empirical justification for including an IV term alongside standard HAR lags.
Data and code: the study uses high-frequency tick data for three FX rates, the S&P 500, and 30-year Treasury futures. No public code repository was identified. Published in Journal of Econometrics, Volume 160, Issue 1, 2011, pages 48–57.
Key metrics: implied volatility adds statistically significant incremental forecasting power for realized volatility across all five assets. The gains are strongest for the continuous variation component and for future jump activity. Results are reported in terms of out-of-sample R-squared and Mincer-Zarnowitz regression coefficients rather than trading performance metrics.
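The realized-measure decomposition behind this HAR-RV-CJ setup is easy to compute from intraday returns; a minimal numpy sketch.

```python
import numpy as np

def rv_bv_jump(intraday_returns: np.ndarray):
    """Decompose one day's realized variance into continuous and jump parts.

    intraday_returns : high-frequency log returns for a single day.
    Returns (RV, BV, J): realized variance, bipower variation (consistent for
    the continuous component), and the non-negative jump component.
    """
    r = np.asarray(intraday_returns)
    rv = np.sum(r ** 2)
    # mu_1 = E|Z| = sqrt(2/pi) for standard normal Z, so the scale is pi/2.
    bv = (np.pi / 2) * np.sum(np.abs(r[1:]) * np.abs(r[:-1]))
    jump = max(rv - bv, 0.0)
    return rv, bv, jump

# A HAR-RV-CJ regression then uses lagged continuous (BV) and jump components
# and, following Busch et al., adds implied volatility as an extra regressor.
```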
Advances in forecasting realized volatility: a review of methodologies#
Radmir Mishelevich Leushuis and Nicolai Petkov survey the models used to forecast realized volatility in the academic literature from 2000 through the first half of 2024. The review covers classical linear models (HAR and extensions), machine learning approaches, and deep learning architectures, comparing their empirical performance across markets and evaluation metrics.
Our summary: this is the most comprehensive recent survey of the realized volatility forecasting literature and a useful orientation for practitioners deciding which model class to test. The headline finding is that a hybrid CNN-LSTM tops the out-of-sample accuracy leaderboard, but the survey is careful to note that deep models do not universally dominate—HAR and its extensions remain strong baselines. The paper catalogs HAR extensions that have been shown to improve forecasts in specific contexts: signed and negative semivariance, jumps, implied volatility, overnight returns, realized volatility-of-volatility, and asymmetry and leverage effects. Results vary significantly by market, evaluation horizon, and loss function, and no single extension wins uniformly.
Data and code: the survey covers published academic results across multiple asset classes and markets. No single dataset or code repository; individual papers cited within should be consulted for reproduction details. Published in Financial Innovation, 2025.
Key metrics: the top-performing hybrid CNN-LSTM model outperforms all standalone linear, ML, and deep learning benchmarks in out-of-sample forecast accuracy as measured in the reviewed studies. HAR extensions involving semivariance, jumps, and implied volatility show the most consistent gains; overnight return and vol-of-vol extensions show more mixed results.
Transaction activity and bitcoin realized volatility#
Konstantinos Gkillas, Maria Tantoula, and Manolis Tzagarakis investigate whether Bitcoin transaction activity improves out-of-sample realized volatility forecasts beyond standard HAR and random forest models. The paper decomposes Bitcoin’s high-frequency realized volatility into its continuous and jump variation components and tests whether on-chain transaction flow data carries additional predictive information.
Our summary: this is a short but directly relevant paper for anyone building HAR-type Bitcoin volatility models with exogenous inputs. The main result is that transaction activity, combined with the jump component, improves out-of-sample forecasts in both the HAR and random forest frameworks. On-chain activity is a natural candidate because Bitcoin trading volume and transaction counts are publicly available at high frequency and capture information about market participation that is not embedded in price-based realized volatility measures alone.
Data and code: Bitcoin high-frequency price data and on-chain transaction data. Published in Operations Research Letters, Volume 49, Issue 5, 2021, pages 715–719 (DOI: 10.1016/j.orl.2021.06.016).
Key metrics: incorporating transaction activity and jump variation improves out-of-sample forecast accuracy for Bitcoin realized volatility relative to baseline HAR and RF models. Results are reported using standard forecast evaluation metrics; no Sharpe ratio or drawdown metrics are reported as the paper focuses on volatility forecasting accuracy rather than a trading strategy.
Forecasting Bitcoin realized volatility by measuring the spillover effect among cryptocurrencies#
Yue Qiu, Yifan Wang, and Tian Xie study whether cross-cryptocurrency spillover variables improve HAR-type realized volatility forecasts for Bitcoin. The paper constructs spillover measures from Ethereum, Litecoin, and other major cryptocurrencies and embeds them as exogenous regressors in an augmented HAR framework.
Our summary: the paper addresses a feature that is specific to crypto markets: Bitcoin’s realized volatility is not independent of the broader crypto ecosystem. When major altcoins move sharply, Bitcoin volatility often follows, and the authors show that this spillover information is statistically significant and improves out-of-sample forecasts. For practitioners running Bitcoin volatility models, this suggests that cross-asset realized volatility from correlated cryptocurrencies is a more natural exogenous input than general macro or equity spillovers, because the timing and magnitude of crypto co-movements are tightly coupled.
Data and code: high-frequency price data for Bitcoin and multiple altcoins. Published in Economics Letters, Volume 208, November 2021 (DOI: 10.1016/j.econlet.2021.110092).
Key metrics: the HAR model augmented with cross-cryptocurrency spillover variables outperforms the standard HAR benchmark in out-of-sample Bitcoin realized volatility forecasting. Results are reported using mean squared error and related accuracy metrics; no trading strategy performance metrics are reported.
Assessing the Risk of Bitcoin Futures Market: New Evidence#
Anupam Dutta proposes an augmented HAR model for forecasting the realized volatility of Bitcoin futures that incorporates time-varying jump intensity estimated through a GARCH-jump process. The paper tests whether jump-induced volatility carries information about future realized variance beyond what is already captured by the continuous path component and the Bitcoin implied volatility index.
Our summary: the paper is notable for two reasons. First, it provides one of the cleaner demonstrations that jump volatility and leverage effects add predictive information to HAR-RV for Bitcoin futures specifically—a crypto context where jumps are frequent and large. Second, the novel finding that jump-induced volatility is informative incrementally to Bitcoin’s own IV index is practically important: it means that even after conditioning on the market’s forward-looking volatility expectation, high-frequency jump realizations still contain residual information. For HAR practitioners building Bitcoin volatility models, this supports including a jump component in addition to IV.
Data and code: Bitcoin futures high-frequency price data and the Bitcoin implied volatility index. Published in Annals of Data Science, 2024 (DOI: 10.1007/s40745-024-00517-4).
Key metrics: the augmented HAR model with GARCH-jump-derived volatility and leverage effects outperforms standard HAR-type models in both in-sample and out-of-sample analyses. Jump-induced volatility provides incremental out-of-sample predictive power relative to the Bitcoin IV index. Results are reported in forecast accuracy terms; no annualized return or Sharpe metrics are reported.
GNAR-HARX Models for Realised Volatility: Incorporating Exogenous Predictors and Network Effects#
Tom Ó Nualláin introduces a hybrid modelling framework that combines Generalised Network Autoregressive (GNAR) structure with Heterogeneous Autoregressive (HAR) dynamics and exogenous predictors. The model forecasts realized volatility across ten international stock indices using approximately 16 years of daily data from 2005 to 2020, evaluating performance under both QLIKE loss and mean squared error.
Our summary: this paper is a direct empirical warning about the metric-dependency of exogenous variable results in HAR-type models—a pattern that maps closely onto practitioner experience. The best QLIKE model is a local GNAR-HAR without any exogenous variables, while the best MSE model is a GNAR-HARX that includes implied volatility. Exogenous variables help on mean squared error while hurting or not improving on QLIKE, which is exactly the split seen when broader feature packs are added to HAR-type models in crypto and equity contexts. The network structure finding—that fully connected cross-market graphs outperform dynamically estimated graphical lasso networks—is also useful for anyone attempting cross-asset spillover modelling.
Data and code: daily realized volatility for ten international stock indices, 2005–2020. The paper is available on arXiv (arXiv:2510.24443). Code availability not confirmed at time of writing.
Key metrics: the local GNAR-HAR without exogenous variables achieves the lowest QLIKE score. The GNAR-HARX with implied volatility achieves the lowest MSE. GNAR-HAR(X) specifications outperform univariate HAR(X) benchmarks across both metrics. Fully connected networks consistently outperform graphical lasso networks. No trading strategy performance metrics are reported.
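Because the headline result is metric dependence, it helps to be precise about the two losses; below is one common parametrization of QLIKE alongside MSE.

```python
import numpy as np

def qlike(rv_true: np.ndarray, rv_pred: np.ndarray) -> float:
    """QLIKE loss: penalizes under-prediction of variance far more heavily
    than MSE, which is why model rankings can flip between the two metrics."""
    ratio = rv_true / rv_pred
    return float(np.mean(ratio - np.log(ratio) - 1))

def mse(rv_true: np.ndarray, rv_pred: np.ndarray) -> float:
    return float(np.mean((rv_true - rv_pred) ** 2))
```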
Is Return Seasonality Due to Risk or Mispricing? Evidence from Anomaly Seasonality#
Jiaqi Wang reexamines return seasonality in anomaly portfolios using 125 anomaly long-short portfolios and 1,238 corresponding raw portfolios constructed from U.S. equities. The central question is whether the well-documented seasonal patterns in anomaly returns—where certain factors tend to perform well in the same calendar months year after year—reflect genuine time-varying risk premiums or are mechanically inherited from stock-level seasonality through characteristic-based portfolio construction.
Our summary: the paper’s main contribution is demonstrating an asymmetric spanning relationship that resolves the interpretation of anomaly seasonality. A seasonality factor constructed directly from individual stocks fully spans the seasonality present in anomaly portfolios, but anomaly-based seasonality factors cannot price seasonality strategies in individual stocks. This means the causation runs from stocks to anomalies, not the other way around: when you sort stocks into portfolios based on characteristics like value or momentum, the seasonal patterns of the individual stocks carry through into the portfolio returns, creating the illusion that the anomaly itself has seasonal risk premiums. For practitioners, this is an important diagnostic: if a factor strategy shows strong calendar-month patterns, those patterns may not represent a tradeable seasonal edge in the factor itself but rather a mechanical artefact of which stocks happen to load on that factor. Before adding a seasonal timing overlay to a factor strategy, one should check whether the seasonality survives controls for stock-level seasonal effects.
Data and code: U.S. equity data covering 125 anomaly long-short portfolios and 1,238 raw portfolios. The paper is 72 pages and available on SSRN. No public code repository was identified at time of writing.
Key metrics: the raw seasonality strategy generates 1.45% monthly alpha. After controlling for anomaly-based factors, alpha drops to 0.86–1.23%. After controlling for a proper stock-level seasonality factor, alpha collapses to 0.27%, indicating that most of what appears to be anomaly seasonality is subsumed by individual stock seasonality. Significant seasonality at annual lags is found across anomaly returns, but the spanning tests show this is inherited rather than intrinsic.
Mentioned by Ivan Blanco (@iblanco_finance) in this discussion: “Most anomalies are just seasonality in disguise. A lot of what the literature calls anomaly seems to be the same calendar effect wearing different costumes.”
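For readers who want to run the same diagnostic on their own factors, the core spanning test is a simple alpha regression; a minimal sketch with statsmodels, assuming `strategy_returns` and `factor_returns` are numpy arrays and noting that the paper’s actual controls are richer.

```python
import statsmodels.api as sm

def spanning_alpha(strategy_returns, factor_returns):
    """Regress a seasonality strategy on a candidate spanning factor and
    report the intercept (alpha) with a HAC-robust t-statistic."""
    X = sm.add_constant(factor_returns)
    res = sm.OLS(strategy_returns, X).fit(cov_type="HAC", cov_kwds={"maxlags": 12})
    return res.params[0], res.tvalues[0]

# Run it in both directions: the paper's asymmetry result is that the
# stock-level factor drives the anomaly-level alpha toward zero (0.27%),
# while anomaly-based factors fail to span stock-level seasonality.
```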
Can Safety and Profitability Coexist? Performance Analysis of Pairs Trading among S&P 500 Stocks#
This Master’s thesis from the University of Gothenburg (Spring 2025) examines cointegration-based pairs trading on S&P 500 constituents from 2005 to 2024, with a particular focus on how trading volume drives strategy performance. The methodology uses the Johansen cointegration test to identify statistically related pairs, generates z-score-based entry signals (entry at |z| > 2, exit at |z| < 0.5), and evaluates 270 distinct parameter combinations across lookback periods, holding periods, entry/exit thresholds, and stop-loss settings. The universe is segmented into volume deciles to isolate the effect of liquidity on pairs trading profitability, and performance is validated with a proper train/test split (2005-2014 training, 2015-2024 testing).
The central finding is that trading volume is the dominant driver of pairs trading success: high-volume pairs consistently outperform low-volume pairs across all tested configurations, all weighting methods, and both time periods. The optimised high-volume strategy uses a 2-year lookback, a 1-month maximum holding period, an entry z-score of 1, an exit z-score of 0.1, and a tight -2.5% stop-loss. This configuration achieves a Sharpe ratio of 2.73 in the out-of-sample testing period (2015-2024), with an 18.52% annualised return, an 80.83% win rate, and a maximum drawdown of only 1.24%.

The CAPM analysis reveals a statistically significant annualised alpha of 5.28% with a negative market beta (-0.18), confirming the strategy’s countercyclical nature: it generates positive returns during market downturns, including the 2008 financial crisis and the 2020 COVID crash. The Fama-French five-factor model confirms a significant alpha of approximately 5% annually even after controlling for all five factors. Contrary to efficient market expectations, strategy performance improved in the recent period (2015-2024) relative to the earlier period (2005-2014), suggesting that structural market changes or institutional constraints have preserved the strategy’s edge despite widespread documentation.

Transaction cost analysis shows the strategy remains profitable under institutional-level costs (0.3% borrowing, 0.2% commissions) with a Sharpe ratio of 2.67; performance degrades under retail-level costs (3% borrowing, 2% commissions), but the Sharpe still stays above 2.0, at 2.12. Trading commissions hurt performance disproportionately more than borrowing costs because of the high trade frequency. Five weighting methodologies (equal, value, risk parity, half-life, volume) were tested, and all produced Sharpe ratios above 2.0 for high-volume pairs, with risk parity marginally outperforming and equal weighting offering the best balance of simplicity and effectiveness. Notably, stop-loss exits constituted 60-67% of all trades, yet the strategy remained highly profitable because the winning trades (target convergence and end-of-period exits) generated sufficiently high returns to compensate.
Data and code: daily stock price data from CRSP database for S&P 500 constituents, with market benchmarking data from Kenneth French’s Data Library. Survivorship bias is mitigated by including stocks only during their S&P 500 membership periods. No public code repository was identified. The thesis is 52 pages including appendices and is freely available from the University of Gothenburg’s institutional repository (GUPEA).
Key metrics (optimised high-volume pairs, out-of-sample 2015-2024): annualised return 18.52%, annualised standard deviation 6.79%, Sharpe ratio 2.73, win rate 80.83%, maximum drawdown 1.24%, cumulative return 515%. Under low transaction costs: annualised return 18.10%, Sharpe 2.67. Under high transaction costs: annualised return 14.37%, Sharpe 2.12. Base case (unoptimised, full universe): annualised return 4.11%, Sharpe 0.69, max drawdown 11.27%. CAPM alpha 5.28% (statistically significant), market beta -0.18.
Mentioned by Nam Nguyen, Ph.D. (@namnguyento) in this discussion: “The paper revisits the age-old strategy of pairs trading and incorporates volume analysis into the framework. Trading volume is a dominant performance driver, with high-volume pairs consistently outperforming low-volume counterparts.”
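As a closing illustration, here is a minimal sketch of the thesis’s entry/exit logic with the optimised parameters reported above (2-year lookback ≈ 504 trading days, entry |z| > 1, exit |z| < 0.1); for brevity the hedge ratio comes from a simple OLS regression rather than the Johansen procedure, and the stop-loss and maximum-holding-period rules are omitted.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def pair_signal(p1: pd.Series, p2: pd.Series, lookback: int = 504,
                z_entry: float = 1.0, z_exit: float = 0.1) -> pd.Series:
    """Position in the spread p1 - beta*p2: +1 long-spread, -1 short-spread, 0 flat."""
    # Hedge ratio from an initial window (OLS stand-in for Johansen).
    beta = sm.OLS(p1.iloc[:lookback],
                  sm.add_constant(p2.iloc[:lookback])).fit().params.iloc[-1]
    spread = p1 - beta * p2
    z = (spread - spread.rolling(lookback).mean()) / spread.rolling(lookback).std()

    pos = pd.Series(0.0, index=p1.index)
    state = 0.0
    for t, zt in zip(z.index, z.values):
        if state == 0.0 and abs(zt) > z_entry:
            state = -np.sign(zt)          # short the rich leg, long the cheap leg
        elif state != 0.0 and abs(zt) < z_exit:
            state = 0.0                   # convergence exit
        pos.loc[t] = state
    return pos
```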