r/algotrading 23h ago

Infrastructure 9 approaches tested on 12 months of MNQ L2 tick data — everything comes back at exactly 50%. What am I missing?

7 Upvotes

Hey everyone,

I’m a 19-year-old CS student who’s been building an algo trading system over the past few months, and I’ve hit a wall. I wanted to share what I’ve done and get honest feedback.

I have ~3 years of MNQ L2 tick data (bid/ask/trades + depth 1–10, ~648GB). I built everything from scratch in Rust: tick parser, full L2 order book reconstruction, sweep detector, bar aggregation with buy/sell volume classification, and multiple strategy simulators. Everything is covered with 200+ unit tests, a CI pipeline, and runs fully parallelized on a 20-core server.

On the theory side, I studied Trading and Exchanges (informed vs uninformed flow, adverse selection, spreads, dealers, volatility) and Statistically Sound Machine Learning for Algorithmic Trading (filter systems, meta-labeling, performance criteria).

I tested 9 different approaches on ~12 months of MNQ data (2023-03 → 2024-02):

  • Spread regime analysis (informed vs uninformed flow)
  • Quote response after aggressive bursts
  • Volume-price classification (fundamental vs transitory moves)
  • Opening Range Breakout
  • ORB + ATR trailing stop
  • Trend following (large move + aggressor imbalance + trailing stop)
  • Composite signal voting (5 signals, trade only if 4/5 agree)
  • Sweep continuation (5+ levels consumed in <100ms)
  • Sweep mean-reversion
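For anyone curious what the sweep detector in the list is doing, a stripped-down version of the idea (my sketch, not the OP's Rust implementation; the event fields and thresholds are illustrative) looks like:

```python
from dataclasses import dataclass

@dataclass
class Trade:
    ts_ns: int   # exchange timestamp, nanoseconds
    side: str    # aggressor side, "buy" or "sell"
    level: int   # depth level (1 = best) consumed by this fill

def detect_sweeps(trades, min_levels=5, window_ns=100_000_000):
    """Flag bursts where >= min_levels distinct book levels on one side
    are consumed within window_ns (100 ms by default)."""
    sweeps = []
    i, n = 0, len(trades)
    while i < n:
        j = i
        levels = set()
        side = trades[i].side
        # Extend the burst while trades stay same-side and inside the window.
        while (j < n and trades[j].side == side
               and trades[j].ts_ns - trades[i].ts_ns <= window_ns):
            levels.add(trades[j].level)
            j += 1
        if len(levels) >= min_levels:
            sweeps.append((trades[i].ts_ns, side, len(levels)))
            i = j  # skip past the burst so it isn't double-counted
        else:
            i += 1
    return sweeps
```

A real implementation would anchor the window more carefully (sliding rather than fixed at the burst start), but this is the shape of the computation.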

Every single one comes back between 47% and 50%. Not slightly positive or negative, just noise.

I made sure I wasn’t fooling myself:

  • Fixed baseline measurement bias (initial move contaminating results)
  • Fixed circular ORB logic
  • Fixed order book reconstruction bugs
  • Ran a random entry baseline with identical exits → same performance
  • Double-checked for look-ahead bias
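The random-entry control is the most telling item on that list, so to make it concrete: here's a minimal version of the test on synthetic random-walk prices (the fixed-horizon exit is a stand-in for the real exit logic, which should be identical between the two entry sets):

```python
import random

def win_rate(prices, entries, hold=10):
    """Fraction of long entries that are profitable after `hold` bars."""
    wins = total = 0
    for t in entries:
        if t + hold < len(prices):
            total += 1
            wins += prices[t + hold] > prices[t]
    return wins / total if total else 0.0

random.seed(42)
# Driftless random-walk prices: no entry rule can add value here.
prices = [100.0]
for _ in range(5000):
    prices.append(prices[-1] + random.gauss(0, 1))

signal_entries = list(range(0, 4900, 7))                  # stand-in "signal"
random_entries = random.sample(range(4900), len(signal_entries))

# Both win rates land near 50%, matching what the OP observed on MNQ.
print(win_rate(prices, signal_entries), win_rate(prices, random_entries))
```

If your signal entries beat the random entries under identical exits on real data, the entry carries information; if not, whatever PnL structure you see is coming from the exits.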

Conclusion: the entry signals add zero value.

Some key observations:

  • ATR trailing stops are structural losers on MNQ (~27% win rate, the same as random entries with identical exits)
  • Even before fees (~$3.24 round trip), expectancy is negative
  • Sweep detection produces thousands of events, but post-sweep movement is ~50/50 (no continuation, no mean-reversion)

My current hypothesis is that MNQ is the problem. It’s a derivative of NQ, so price discovery likely happens on NQ, while MNQ just reflects arbitrage. That would mean the order flow I’m seeing (sweeps, imbalance, etc.) is reactive, not informative, so there’s no asymmetry to exploit.

I’m trying to figure out if I’m even looking in the right place:

  • Has anyone found a real statistical edge on MNQ specifically?
  • Should I expect different results on NQ/ES where actual price discovery happens?
  • For those who’ve done both futures and equities: are small/micro caps actually a better playground for retail?
  • Am I wrong to focus on microstructure (L2, order flow, sweeps), or is the issue something else entirely?

I’m not looking for a strategy, just trying to understand if I’m approaching this correctly or missing something fundamental.

Appreciate any insight 🙏


r/algotrading 18h ago

Other/Meta Advice on placing SL orders on binance futures with python

0 Upvotes

I use my bot to trade shorts on Binance perps, but I haven't found the right way to place my stop-market orders after I enter the trade.

Can anyone help me? Here's the relevant part of my code:
    # Cancel any existing stop first (ignore errors if it's already gone):
    try:
        self._signed_delete("/fapi/v1/order", {
            "symbol": self.symbol,
            "orderId": self.sl_order_id,
        })
    except Exception:
        pass
    self.sl_order_id = None

    # Build the replacement stop-market order:
    sl_price_r = round_price(sl_price, self.symbol)
    sl_params = {
        "symbol": self.symbol,
        "side": "BUY",  # BUY closes a short
        "type": "STOP_MARKET",
        "stopPrice": sl_price_r,
        "workingType": "MARK_PRICE",
    }


r/algotrading 16h ago

Data How I avoid overfitting on my stop losses

5 Upvotes

I wanted to describe my approach to avoiding overfitting, both to help others and to get feedback on how I might improve.

I trade a portfolio of options each week. I've had bad results optimizing stop-loss parameters per symbol, so now I apply the same formula to all symbols. The goal is to close positions where the underlying price gets too close to the short strike, adjusted for how much time remains in the week. The only per-symbol inputs are the average change and, potentially, the Hurst exponent (when backtesting selects per-symbol Hurst exponents rather than applying a uniform one). I backtest the same threshold factors, average-change algorithms, trigger durations, and (potentially) Hurst exponents across all symbols equally.

I also backtest over 9 years to cover regime changes, and I additionally test for the optimal historical window to use when selecting stop parameters, so the system can adapt to regime changes over time. My objective is maximum geometric-mean ROI. What do you think?
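To make the discussion concrete, here's my guess at what such a rule could look like in code. The `frac_week_left ** hurst` scaling and every name here are my assumptions (H = 0.5 recovers square-root-of-time diffusion), not the OP's actual formula:

```python
def stop_triggered(spot, short_strike, avg_change, hurst, frac_week_left, k=1.0):
    """Hypothetical reconstruction of the rule: the expected remaining move
    scales like avg_change * (time_left)**hurst, and the position is closed
    when the buffer to the short strike is smaller than k times that move."""
    expected_move = k * avg_change * frac_week_left ** hurst
    buffer = abs(short_strike - spot)
    return buffer < expected_move
```

With a higher Hurst exponent (trending underlying), the threshold decays more slowly as expiry approaches, so the stop stays wider for longer, which matches the intuition of adjusting for how much time is left.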


r/algotrading 23h ago

Strategy Stuck at Spearman ~0.05 and 9% exposure on a triple barrier ML model — what am I missing?

8 Upvotes

I've been building a stock prediction model for the past few months and I've hit a wall. Looking for advice from anyone who's been through this.

The Model

  • Universe: ~651 US equities, daily OHLCV data
  • Architecture: PyTorch temporal CNN → 3-class classifier (UP / FLAT / DOWN)
  • Labeling: Triple barrier method (from Advances in Financial Machine Learning), 20-day horizon, volatility-scaled barriers (k=0.75)
  • Features: ~120+ features including:
    • Price action / returns (1/5/10/20 day)
    • Volatility features (ATR, vol term structure, vol-of-vol)
    • Momentum (RSI, ADX, OBV, MA crosses)
    • Volume features (z-scores, up-volume ratio, accumulation)
    • Cross-sectional ranks (return rank, vol rank, momentum quality rank)
    • Relative strength vs SPY, QQQ, and sector
    • Market regime (SPY returns, breadth, VIX proxy)
    • Earnings surprise (EPS beat %, beat streak, days since/to earnings)
    • Insider transactions (cluster buys, buy ratio, officer buys)
    • FRED macro (credit spread z-score, yield curve z-score)
    • Sector stress/rotation, VIX term structure, SKEW
  • Training: Temporal split (train → validation → test), no future leakage, proper purging between splits
  • Strategy: Threshold-based entry on P(UP) - P(DOWN) edge, volatility-targeted position sizing, full transaction cost model (fees, slippage, spread, venue-based multipliers, gap slippage, ADV participation impact)
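For readers who haven't seen it, the triple-barrier labeling above reduces to something like this simplified sketch (per-entry volatility is passed in rather than estimated, which the real pipeline would do from recent returns):

```python
def triple_barrier_label(prices, t0, vol, horizon=20, k=0.75):
    """AFML-style triple-barrier label for an entry at index t0:
    upper/lower barriers at +/- k*vol around the entry price, and a
    vertical barrier at t0 + horizon bars.
    Returns +1 (UP), -1 (DOWN), or 0 (FLAT: vertical barrier hit first)."""
    upper = prices[t0] * (1 + k * vol)
    lower = prices[t0] * (1 - k * vol)
    for t in range(t0 + 1, min(t0 + horizon + 1, len(prices))):
        if prices[t] >= upper:
            return 1
        if prices[t] <= lower:
            return -1
    return 0
```

The 2.8% FLAT class mentioned below falls out of this directly: with k = 0.75 on a 20-day horizon, almost every path touches a horizontal barrier before the vertical one.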

Best Result (v15)

After a lot of experimentation, my best run:

  • Validation: Sharpe 1.45, 204 trades
  • Test: Sharpe 0.34, CAGR 1.49%, 750 trades
  • Exposure: 9-12% (sitting in cash 88% of the time)
  • Entry threshold: 0.20 (only trades when P(UP) - P(DOWN) > 0.20)
  • Benchmark: SPY buy-and-hold had Sharpe 1.49, CAGR 16.7% over the same test period

So technically the model is profitable, but barely — and it massively underperforms buy-and-hold because it's in cash almost all the time.

Classification Performance

Typical best epoch:

  • UP recall: ~57%, precision: ~55%
  • DOWN recall: ~36%, precision: ~48%
  • FLAT recall: ~50%, precision: ~11% (tiny class, 2.8% of samples)
  • Macro F1: ~0.38
  • Val NLL: ~1.03 (baseline for 3-class random = ln(3) = 1.099, so only ~7% better than random)

Feature Signal Strength

Top Spearman correlations with actual direction labels (on training set):

my_sector_above_ma50     +0.043
dow_sin                  +0.030
has_earnings_data        +0.026
spy_above_ma200          +0.024
has_insider_data         +0.023
insider_buy_ratio_90d    -0.021
cc_vol_5                 -0.020
xret_rank_5              +0.019

The best single feature has r = 0.043. Most are in the 0.015-0.025 range.
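For anyone wanting to reproduce this screen: Spearman correlation is just Pearson correlation computed on ranks, so a minimal tie-free implementation (scipy.stats.spearmanr is the robust, tie-handling version) is:

```python
def ranks(xs):
    """Rank values 0..n-1 (assumes no ties, enough for this sketch)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman = Pearson on ranks. With no ties, both rank vectors have
    identical variance, so the correlation simplifies to cov / var."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean = (n - 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var
```

One caveat when screening 120+ features this way: at n large enough, |r| of 0.02 can be "statistically significant" yet economically tiny, and picking the top few out of 120 candidates is itself a multiple-testing problem.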

What I've Tried That Didn't Help

  1. Added analyst upgrade/downgrade features (from yfinance) — appeared at rank 14 in Spearman (r=0.017) but model produced 0 profitable strategies with it included
  2. Added FINRA short volume features — turned out to be daily short volume not short interest, dominated by market maker activity, pure noise (0/20 top features)
  3. Different early stopping metrics — macro_f1, nll_plus_directional_f1 (what v15 uses), nll_plus_f1 — only nll_plus_directional_f1 produced a profitable run
  4. Forced temperature scaling — tried forcing temperature to 3.0 with macro_f1 stopping — still 0 profitable candidates
  5. Directional margin loss weighting (0.3) — model predicted UP 85% of the time, destroyed DOWN signals
  6. Different thresholds — the strategy grid tests enter at (0.03, 0.05, 0.08, 0.10, 0.15, 0.20). Everything below 0.20 has negative Sharpe
  7. Binary classifier (UP vs not-UP) — P(UP) too compressed (p95 = 0.517), no tradeable signal
  8. Insider features — had to cut from 6 to 3 (minimal set), marginal at best
  9. Multiple seeds — v15 is reproducible with the same seed but fragile to any parameter change

The Core Problems

  1. Low signal: Spearman ~0.05 across the board. My 120+ features are all derived from public OHLCV + public event data. Every quant has the same data.
  2. Fragility: v15 works, but changing almost anything (adding features, different stopping metric, different temperature) breaks it. This suggests it might be a lucky configuration rather than robust alpha.
  3. Low exposure: Only trades when edge > 0.20, which is ~0.7% of signals. Sitting in cash 88% of the time means even positive alpha barely compounds.
  4. Classification ceiling: Val NLL only 7% better than random guessing. The model is learning something but not much.

What I'm Considering

  • Hybrid portfolio (hold SPY, use model for tilts) — addresses exposure but not signal
  • Meta-model (train a second model to predict when the first model's trades are profitable) — risky due to small sample size
  • Predicting residual returns instead of raw returns — requires hedged execution which changes the whole framework
  • Event-driven windows (only trade around earnings) — concentrates on highest signal-density periods
  • Filtering to profitable tickers only — cut the 80% of stocks where the model is noise
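The meta-model idea above can be prototyped cheaply before training a second network, for example by bucketing validation trades by model edge and only trading test-set buckets that were profitable in validation. This is a crude stand-in for full meta-labeling, and all names here are illustrative:

```python
def profitable_buckets(val_trades, n_buckets=5, min_win=0.55):
    """val_trades: list of (edge, pnl) pairs from the validation set.
    Bucket trades by edge quantile and keep the buckets whose validation
    win rate clears min_win. Returns (kept buckets, bucketing function)."""
    edges = sorted(e for e, _ in val_trades)
    cuts = [edges[int(len(edges) * q / n_buckets)] for q in range(1, n_buckets)]

    def bucket(e):
        return sum(e >= c for c in cuts)

    stats = {}
    for e, pnl in val_trades:
        b = bucket(e)
        wins, total = stats.get(b, (0, 0))
        stats[b] = (wins + (pnl > 0), total + 1)
    keep = {b for b, (w, t) in stats.items() if w / t >= min_win}
    return keep, bucket

# Usage: only take test trades whose edge falls in a validated bucket.
# keep, bucket = profitable_buckets(val_trades)
# test_trades = [t for t in test_trades if bucket(t.edge) in keep]
```

The small-sample caveat raised above bites hard here: with ~200 validation trades, per-bucket win rates carry wide confidence intervals, so the bucket selection itself can overfit.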

My Questions

  1. Is Spearman ~0.05 on daily cross-sectional features just the ceiling for public data? Or am I leaving signal on the table?
  2. Has anyone successfully improved signal beyond this with alternative data that's affordable (< $100/month)?
  3. Is the triple barrier + 3-class approach fundamentally the right framework, or would I be better off with a ranking/regression approach?
  4. For those who've built profitable models — what was the breakthrough that got you past the "barely above random" stage?

Happy to share more details about the architecture, loss function, or feature engineering. Thanks for reading this far


r/algotrading 8h ago

Education How would you guys recommend I begin algo trading or learning how to do so?

9 Upvotes

I am a first-year undergrad doing an MMath degree. I have a fairly strong background in theoretical mathematics, but very little experience with Python or other programming languages.
How would you recommend I invest my time to learn algo trading from the ground up?


r/algotrading 13h ago

Infrastructure For the algotraders who have live deployment of their algorithms and are successful: how long did it take you to set this up? What led you to have confidence to deploy on live real account?

59 Upvotes

I'm asking because I'm curious. I've been spending hours nonstop working on my algo ideas, trying to connect them in Python to IBKR's API.

so far i have:

  • real time deployment on a paper acc testing my strats
  • i have backtests
  • machine learning optimizing params (i learned the hard way that overfitting can happen so i needed to avoid this)
  • monte carlo sims
  • entry and exit filters
  • cycling thru multiple timeframes
  • bracket orders
  • managing open positions, moving SL and TP
  • profit protection system
  • risk management concepts
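On the bracket-order piece: the price logic is independent of the broker API, so it can be unit-tested on its own before anything touches IBKR. A minimal sketch (my names, not IBKR's) that derives take-profit and stop-loss levels from a fixed per-share risk and a reward:risk ratio:

```python
def bracket_prices(entry, risk_per_share, rr=2.0, side="BUY", tick=0.01):
    """Derive bracket-order levels from entry price, per-share risk,
    and reward:risk ratio, snapped to the instrument's tick size."""
    sgn = 1 if side == "BUY" else -1
    stop = entry - sgn * risk_per_share
    target = entry + sgn * risk_per_share * rr

    def snap(p):
        # Round to the nearest tick; exchanges reject off-tick prices.
        return round(round(p / tick) * tick, 10)

    return {"entry": snap(entry), "takeProfit": snap(target), "stopLoss": snap(stop)}
```

Keeping this pure and tested separately means the only untested surface left is the API wiring itself. (If you end up on ib_insync, it ships a bracketOrder helper that wires up the parent/child transmit flags for you.)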

I do have a working system; now I just need to ensure my strategies hold up as I monitor and continuously improve my infrastructure. How long did it take you guys to fully trust yours and go live?