Research note

Does factor similarity predict returns? We ran the leak-free test.

We tested our own dataset the way a skeptic would — and we're publishing the result whether it flatters the product or not. The short version: factor similarity does not forecast stock returns. It does something else, something real, and that distinction is the whole point of this note.

The findings, up front

1 · What's in the dataset

Factor Weave is a point-in-time factor library for US-listed equities and ETFs — roughly 10,800 tickers, daily, back to 2005 (about 28 million ticker-days). Every row is built to be joined to your own research, not consumed as a verdict.

28 daily factors
Trailing returns (1/5/20/60d), realized vol (20/60d), ATR%, RSI, 52-week z-score, momentum, mean-reversion, gap%, beta vs SPY, a composite score, plus cross-sectional ranks and quantiles.
32-D regime-aware embeddings
A vector per ticker-day summarising its factor state, with four similarity methods built on top — cosine, DTW, regime-filtered, and a return-projected (PLS) variant.
Leak-free forward-return labels
fwd_ret_1d / 5d / 20d and binary targets — each a strict price[t+h]/price[t]−1, the realised future. They are a backtest target, never an input.
Market-regime classification
Each date carries a SPY-volatility regime (low / mid / high) so any study can be split by market state.
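
To make the label definition concrete, here is a minimal sketch of how a price[t+h]/price[t]−1 label can be computed with pandas. The input layout (columns ticker, date, close) and the helper name add_forward_returns are assumptions for illustration, not the library's actual schema or code; only the fwd_ret_* column names come from this note. It also assumes one row per ticker per trading day, so that h rows ahead means h trading days ahead.

```python
import numpy as np
import pandas as pd

def add_forward_returns(prices: pd.DataFrame, horizons=(1, 5, 20)) -> pd.DataFrame:
    """Attach fwd_ret_{h}d = price[t+h] / price[t] - 1 per ticker.

    `prices` is assumed to have columns ticker, date, close (hypothetical layout).
    """
    out = prices.sort_values(["ticker", "date"]).copy()
    by_ticker = out.groupby("ticker")["close"]
    for h in horizons:
        # shift(-h) pulls the close h rows ahead within the same ticker;
        # the last h rows of each ticker have no future price and stay NaN.
        out[f"fwd_ret_{h}d"] = by_ticker.shift(-h) / out["close"] - 1.0
    return out

# Tiny synthetic example: two tickers, four trading days each.
demo = pd.DataFrame({
    "ticker": ["AAA"] * 4 + ["BBB"] * 4,
    "date": list(pd.date_range("2024-01-02", periods=4, freq="B")) * 2,
    "close": [100.0, 110.0, 121.0, 133.1, 50.0, 49.0, 50.0, 51.0],
})
labelled = add_forward_returns(demo, horizons=(1,))
print(labelled)
```

Because the label at date t is built from a price after t, it belongs on the target side of a backtest only. The NaN rows at the end of each ticker are the dates whose future has not yet been realised.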

2 · The test, and why it's leak-free

The question: if you find the historical setups whose factor profile most resembles a ticker today, does the average forward return of those analogues predict the ticker's forward return? We tested it with discipline:

As a sanity check, every test includes a random-analogue baseline: K analogues picked at random instead of by similarity. A correct measurement puts the random baseline at zero, within noise. Ours does (IC +0.003, t-stat +0.5).
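
One way to picture an analogue search with a random baseline is the sketch below. It is an illustration under stated assumptions, not the code in our probes: the function name pick_analogues, the integer trading-day index, and the specific leak rule (an analogue is usable only if its date is at least h trading days before the query, so its forward return is already known) are all hypothetical choices made for the example.

```python
import numpy as np

def pick_analogues(emb, day, query, h=20, k=5, rng=None):
    """Return indices of K analogues for row `query`, from usable history only.

    emb : (n, d) array of embeddings, one row per ticker-day
    day : (n,) integer trading-day index of each row
    A row is usable only if day <= day[query] - h, so that its h-day forward
    return had fully realised by the query date (assumed leak rule).
    If `rng` is given, pick K usable rows at random (the baseline);
    otherwise pick the K with the highest cosine similarity to the query.
    """
    usable = np.flatnonzero(day <= day[query] - h)
    if usable.size < k:
        return np.array([], dtype=int)  # not enough history yet
    if rng is not None:
        return rng.choice(usable, size=k, replace=False)
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # assumes no zero vectors
    sims = unit[usable] @ unit[query]
    return usable[np.argsort(sims)[-k:]]

# Synthetic demo: 100 days x 5 tickers, 32-D embeddings, pure-noise returns.
rng = np.random.default_rng(0)
day = np.repeat(np.arange(100), 5)
emb = rng.normal(size=(day.size, 32))
fwd_ret_20d = rng.normal(scale=0.05, size=day.size)

q = day.size - 1                                  # a query on the last day
nearest = pick_analogues(emb, day, q)
random_k = pick_analogues(emb, day, q, rng=rng)
cosine_forecast = fwd_ret_20d[nearest].mean()     # analogue-average forward return
random_forecast = fwd_ret_20d[random_k].mean()    # baseline forecast
```

The two forecasts go through exactly the same scoring afterwards. If the random one scores meaningfully away from zero, the measurement itself is leaking.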

3 · Result — returns are not predictable

Method            | What it is                           | Cross-sec. IC | t-stat | Verdict
Random baseline   | random analogues                     | +0.003        | +0.5   | ≈ 0 ✓
Cosine similarity | nearest factor-profile analogues     | −0.005        | −0.8   | no signal
Supervised (PLS)  | return-projected similarity          | +0.005        | +0.8   | no signal
Gradient boosting | nonlinear, 26 factors, walk-forward  | −0.000        | −0.0   | no signal

Three different methods — a linear similarity, a return-supervised projection, and a nonlinear model with the full factor set — and the same answer each time: a cross-sectional IC indistinguishable from zero, and from random. The return-projected variant's mildly rising outcome buckets (a +0.70% top-vs-bottom spread) do not survive the t-test; on 237 dates it is noise.
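
For readers who want to see what a cross-sectional IC and its t-stat involve, here is a minimal sketch. It assumes a rank (Spearman-style) correlation computed separately on each date, and a t-stat equal to the mean daily IC divided by its standard error across dates, which treats the per-date ICs as roughly independent. The column names date and pred and the function name cross_sectional_ic are hypothetical, and the data is synthetic noise, not Factor Weave output.

```python
import numpy as np
import pandas as pd

def cross_sectional_ic(df, pred="pred", target="fwd_ret_20d"):
    """Mean per-date rank IC, its t-stat across dates, and the number of dates."""
    ics = []
    for _, g in df.groupby("date"):
        # Pearson correlation of ranks is the Spearman rank correlation.
        ics.append(g[pred].rank().corr(g[target].rank()))
    ic = pd.Series(ics).dropna()
    mean_ic = ic.mean()
    t_stat = mean_ic / (ic.std(ddof=1) / np.sqrt(len(ic)))
    return mean_ic, t_stat, len(ic)

# Synthetic demo: 237 dates x 50 tickers.
rng = np.random.default_rng(0)
n_dates, n_tickers = 237, 50
demo = pd.DataFrame({
    "date": np.repeat(np.arange(n_dates), n_tickers),
    "fwd_ret_20d": rng.normal(size=n_dates * n_tickers),
})

demo["pred"] = rng.normal(size=len(demo))            # predictor with no information
noise_ic, noise_t, n = cross_sectional_ic(demo)

demo["pred"] = demo["fwd_ret_20d"] + 0.1 * rng.normal(size=len(demo))  # leaky predictor
signal_ic, signal_t, _ = cross_sectional_ic(demo)

print(f"noise:  IC={noise_ic:+.3f}  t={noise_t:+.1f}  dates={n}")
print(f"signal: IC={signal_ic:+.3f}  t={signal_t:+.1f}")
```

The second predictor is built from the target itself, which is what a leak looks like in this kind of test: an IC near one and an enormous t-stat. The first is the shape of an honest null.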

This is not a surprise, and that's important. These are commodity technical factors on liquid US equities — the most heavily arbitraged corner of global markets. The academic and practitioner consensus is that such signals are competed flat. Our data agrees. A vendor who tells you otherwise is showing you a backtest with a leak in it.

4 · Result — but similarity is risk-coherent

Returns are a coin flip. Volatility is not: it clusters and persists. So we asked a second, fairer question: do tickers the engine calls "similar" share a forward risk profile? We measured whether the forward 20-day realized volatility of a ticker's cosine analogues predicts the ticker's own forward 20-day realized volatility, using the same leak-free design.

Predictor of forward 20-day volatility | Cross-sec. IC | t-stat
Random analogues' forward vol          | +0.005        | +0.8
Cosine analogues' forward vol          | +0.062        | +8.3
The ticker's own trailing vol          | +0.830        | +191.6

Cosine analogues carry real, strongly significant forward-risk information: an IC roughly twelve times the random baseline's (+0.062 vs +0.005), with a t-stat of +8.3. The embeddings genuinely organise the universe by risk regime: "similar" means "similar risk profile."

We'll be just as straight about the limit. A ticker's own trailing volatility predicts its forward volatility far better (IC +0.83). So this is not a volatility-forecasting product — if you want one stock's vol, use its own history. What the result establishes is subtler and more useful: the similarity engine is coherent. It is not noise. When it groups setups, it is grouping them by a property that genuinely carries forward — which is exactly what makes it a sound tool for screening, peer sets, and regime-aware research, and exactly why it is not a return oracle.
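
The two volatility quantities compared in this section can be sketched as follows. This is an assumed construction for illustration: realized volatility is taken as the sample standard deviation of daily simple returns over a 20-row window, left un-annualized (a constant scaling does not change a rank IC), and the helper name vol_labels is hypothetical. It handles a single ticker's close series; for a full panel it would be applied per ticker.

```python
import numpy as np
import pandas as pd

def vol_labels(close: pd.Series, window: int = 20) -> pd.DataFrame:
    """Trailing and forward realized vol for one ticker's close series."""
    ret = close.pct_change()
    # Trailing vol at t uses returns t-window+1 .. t: known at time t.
    trailing = ret.rolling(window).std()
    # Forward vol at t is the same statistic over returns t+1 .. t+window,
    # i.e. the trailing value observed `window` rows later. Target only.
    forward = trailing.shift(-window)
    return pd.DataFrame({"trailing_vol": trailing, "fwd_vol": forward})

# Synthetic price path for the demo.
rng = np.random.default_rng(0)
close = pd.Series(100.0 * np.exp(np.cumsum(rng.normal(scale=0.01, size=100))))
labels = vol_labels(close)
print(labels.iloc[18:24])
```

The trailing window ends at t and the forward window starts at t+1, so the two never share a return. That separation is what lets trailing vol serve as a legitimate predictor of forward vol.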

5 · What to use it for

A good fit
  • Screening — find tickers in a given factor or risk state
  • Peer sets & substitutes — "what else looks like this?"
  • Regime-aware research — split any study by market state
  • Assembling clean, leak-free backtest datasets to test your own signal
  • Conversational research via the MCP server
Not what it does
  • Predict which stocks go up — the data does not contain that
  • Replace your alpha model — it is the substrate, not the signal
  • Forecast a single stock's volatility — its own history wins

Reproduce it

The probes behind every number here are in the repository under scripts/diagnostics/signal_probe_extended.py (similarity vs returns), signal_probe_gbm.py (the gradient-boosted model), and signal_probe_vol.py (the risk-coherence test). Leak-free by construction; run them yourself.

Honest data, honestly described

A free account gives you all 28 factors, 252 days of point-in-time history, four similarity methods, forward-return labels, the REST API and the MCP server.
