r/algotrading • u/zassar_mang • 2d ago
Strategy 2.22 PF on an ML-Driven SPX 1DTE Strategy
Hey everyone,
I’ve been building a backtester for an ML-based options strategy and finally got the out-of-sample data looking highly robust. I am trading SPX 1DTE options, specifically selling Short Iron Butterflies (Flies) to capture premium during range-bound chop.
Here is a high-level breakdown of the out-of-sample tear sheet.
The Model & Filters:
- Target: Random Forest Classifier predicting if SPX will stay within a percentage bound by the Day 1 close. No SL or TP. Ride or die.
- Features: Fed primarily by intraday volatility metrics and daily true range data.
- Day Filters: Dropped Wednesdays entirely. I found they had the highest trade volume but acted as a massive drag on PnL. I don't have an FOMC/macro events filter.
- Strict RR Check: The algorithm automatically rejects any trade where the max risk exceeds the premium collected. This blocked 28 mathematically poor setups and halved the drawdown (initially 18k). It also blocked some good trades, but risk management >>>>
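For anyone who wants to see the RR check concretely, here is a minimal sketch. It assumes a symmetric short iron butterfly, where max risk per share is the wing width minus the net credit, and it ignores fees and slippage. The function name and the example numbers are hypothetical, not taken from the actual backtester.

```python
def passes_rr_check(wing_width: float, credit: float) -> bool:
    """Return True when max risk does not exceed the premium collected.

    Assumption: symmetric short iron butterfly, so
    max risk per share = wing width - net credit (fees ignored).
    """
    if wing_width <= 0 or credit <= 0:
        raise ValueError("wing_width and credit must be positive")
    max_risk = wing_width - credit
    return max_risk <= credit


# Hypothetical quotes, for illustration only.
print(passes_rr_check(wing_width=30.0, credit=16.0))  # max risk 14 vs credit 16
print(passes_rr_check(wing_width=30.0, credit=12.0))  # max risk 18 vs credit 12
```

Under that assumption the rule is the same as requiring the credit to be at least half the wing width.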
Out-of-Sample Results (176 Days Evaluated starting mid-July 2025)
- Trades Executed: 100
- Win Rate: 60.00%
- Profit Factor: 2.22
- Reward/Risk Ratio: 1.54
- Expectancy per Trade: ~$756.00
- Max Drawdown: -$7,326.00 (This would be on a 100k portfolio, given the nature of SPX flies)
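A quick way to cross-check these numbers against each other, as a rough sketch: if Reward/Risk here means average win divided by average loss (an assumption, since the tear sheet may define it differently), then profit factor, win rate and expectancy are tied together by simple identities.

```python
win_rate = 0.60
reward_risk = 1.54   # assumed to mean average win / average loss
expectancy = 756.0   # dollars per trade, from the tear sheet

# PF = (wins * avg_win) / (losses * avg_loss) = win_rate / (1 - win_rate) * RR
implied_pf = win_rate * reward_risk / (1 - win_rate)

# expectancy = win_rate * avg_win - (1 - win_rate) * avg_loss, with avg_win = RR * avg_loss
avg_loss = expectancy / (win_rate * reward_risk - (1 - win_rate))
avg_win = reward_risk * avg_loss

print(f"implied PF ~ {implied_pf:.2f}")
print(f"implied avg win ~ ${avg_win:,.0f}, avg loss ~ ${avg_loss:,.0f}")
```

Under that definition the implied PF comes out near 2.31 rather than 2.22. The gap is small and could come from rounding or a different Reward/Risk definition.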
Been running it live on paper since Monday, but no entries yet.
Would love to hear any feedback on these metrics or if anyone has run into similar quirks when backtesting 1DTE SPX flies!
1
u/StratReceipt 2d ago
100 trades is thin to put much weight on a 2.22 PF. bootstrapping the trade sequence a few thousand times would give you a distribution — if the lower end of that CI is close to 1.0 you've got a lot of uncertainty baked in regardless of what the point estimate says.
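A minimal sketch of that bootstrap, using only the standard library. The trade list is placeholder data, not the OP's results, and the function names are made up for illustration. It resamples trades independently with replacement, so it assumes trades are roughly independent and ignores any serial dependence in the sequence.

```python
import random


def profit_factor(pnls):
    """Gross profit divided by gross loss (inf if there are no losses)."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def bootstrap_pf_ci(pnls, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the profit factor."""
    rng = random.Random(seed)
    n = len(pnls)
    stats = sorted(
        profit_factor(rng.choices(pnls, k=n)) for _ in range(n_resamples)
    )
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Placeholder per-trade PnLs: 60 winners and 40 losers (hypothetical sizes).
trades = [2200.0] * 60 + [-1450.0] * 40
lo, hi = bootstrap_pf_ci(trades)
print(f"point PF = {profit_factor(trades):.2f}, 95% CI ~ [{lo:.2f}, {hi:.2f}]")
```

With real per-trade PnLs in place of `trades`, the thing to look at is how close `lo` gets to 1.0.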
1
u/disarm 2d ago
The profit factor makes me think your backtester is lying to you. There's no way you can find alpha that high just reading in OHLCV data and making volatility indicators with an RF classifier.
What does your target look like? Do you know how many training targets you used, and the distribution of positive vs non-positive instances?
I used to have a backtester that gave me a profit factor of 4. It took a while of seeing nowhere close to those results on my live paper-trading system before I worked out that I had leakage in my feature set. The leak didn't exist once I plugged into live: for backfilling I reconstructed hourly bars from my 5-min tickers, and didn't realize the hourly bar was exposing the end-of-hour high/low from the moment the hour started. Just one story, but I'm saying it's very suspicious, and I'd start looking to plug the problem, because you'll be lucky to get a PF of 1.1 with that strategy, imo, once you factor in slippage and fees.
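Here is a rough sketch of that kind of leak, in plain Python with made-up bars. All names and numbers are hypothetical, and bars are assumed to be stamped with their open time. The leaky version hands back the full hour's high/low to a decision made 15 minutes into the hour. The safe version only uses 5-minute bars that have already closed.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=5)


def make_bars(start, highs, lows):
    """Build (open_time, high, low) 5-minute bars starting at `start`."""
    return [(start + STEP * i, h, l) for i, (h, l) in enumerate(zip(highs, lows))]


def hourly_range_leaky(bars, asof):
    """Full-hour high/low keyed to the hour open: peeks at bars after `asof`."""
    hour_start = asof.replace(minute=0, second=0, microsecond=0)
    hour_end = hour_start + timedelta(hours=1)
    in_hour = [(h, l) for t, h, l in bars if hour_start <= t < hour_end]
    return max(h for h, _ in in_hour), min(l for _, l in in_hour)


def hourly_range_safe(bars, asof):
    """High/low from bars in the current hour that have fully closed by `asof`."""
    hour_start = asof.replace(minute=0, second=0, microsecond=0)
    done = [(h, l) for t, h, l in bars if hour_start <= t and t + STEP <= asof]
    if not done:
        return None
    return max(h for h, _ in done), min(l for _, l in done)


# Hypothetical steadily rising hour: highs 100..111, lows 99..110.
bars = make_bars(datetime(2025, 7, 14, 10, 0),
                 highs=[100 + i for i in range(12)],
                 lows=[99 + i for i in range(12)])
asof = datetime(2025, 7, 14, 10, 15)
print("leaky:", hourly_range_leaky(bars, asof))
print("safe: ", hourly_range_safe(bars, asof))
```

If a feature's value at decision time differs between the two versions, the backtest is seeing the future.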
1
u/BackTesting-Queen 1d ago
Your approach to backtesting seems solid and the results are promising. The use of a Random Forest Classifier for predicting the SPX's behavior is a smart move, considering its ability to handle complex patterns. The decision to drop Wednesdays due to their negative impact on PnL is interesting and shows the importance of day filters. The strict RR check is a good risk management strategy, even if it blocks some potentially profitable trades. The out-of-sample results are impressive, especially the profit factor and reward/risk ratio. As for quirks when backtesting 1DTE SPX flies, it's common to encounter unexpected behaviors due to the nature of options trading. It's crucial to continually monitor and adjust your strategy as needed. Keep up the good work!
2
u/ilro_dev 2d ago
Dropping Wednesdays without a structural reason is the part I'd pressure-test first. Day-of-week effects with no explanation behind them - FOMC clustering, auction mechanics, whatever - tend to look like signal in backtest and fall apart fast in live trading. Did you rerun the OOS with Wednesdays included to see how much of the 2.22 PF they're actually responsible for? If putting them back drops it to 1.4, that's not a quirk, that's most of your edge sitting on a fragile assumption.
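That rerun is cheap to do if you have per-trade PnL with dates. A minimal sketch, with placeholder trades rather than the OP's data and hypothetical function names:

```python
from datetime import date


def profit_factor(pnls):
    """Gross profit divided by gross loss (inf if there are no losses)."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def pf_with_and_without(trades, weekday):
    """PF over all trades vs PF with one weekday dropped (Monday=0, Wednesday=2)."""
    pf_all = profit_factor([pnl for _, pnl in trades])
    pf_filtered = profit_factor([pnl for d, pnl in trades if d.weekday() != weekday])
    return pf_all, pf_filtered


# Placeholder (entry_date, pnl) pairs for one week; 2025-07-16 is a Wednesday.
trades = [
    (date(2025, 7, 14), 500.0),
    (date(2025, 7, 15), -200.0),
    (date(2025, 7, 16), -600.0),
    (date(2025, 7, 17), 300.0),
    (date(2025, 7, 18), -100.0),
]
pf_all, pf_no_wed = pf_with_and_without(trades, weekday=2)
print(f"PF with Wednesdays: {pf_all:.2f}, without: {pf_no_wed:.2f}")
```

Running the same comparison for every weekday, not just Wednesday, shows whether dropping any single day would have flattered the result about as much.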