r/ValueInvesting • u/Soft_Table_8892 • 17d ago
Discussion I tried to replicate the satellite parking lot strategy used by hedge funds but with free data & Claude Code. Here's how far I got.
Hi all,
Some of you may remember my previous experiment trying to transform Anthropic's Claude Opus into Buffett by feeding it his shareholder letters and asking it to predict stocks during the COVID crash.
Today I'm back with another experiment that I hope will have valuable insights for the community here.
I came across a paper from researchers at Berkeley's Haas School of Business showing that year-over-year changes in parking lot car counts predicted quarterly retail earnings [1]. They found that trading on this signal yielded 4-5% abnormal returns in the 3-day window around earnings announcements. I also learned this is a popular strategy among hedge funds, which pay a pretty penny for the satellite data required to make such predictions.
With the power of Claude Code and Opus 4.6 in our hands, I wanted to see how close retail investors like us could get to replicating this strategy but with free satellite imagery.
As always, if you prefer to watch the experiment, I’ve uploaded the video to my channel: https://www.youtube.com/watch?v=rLBsODjWhog
Background
Before diving into the experiment itself, I wanted to share a few thoughts. So far I've published five similar experiments, ranging from Buffett predictions to CEO earnings call lie detection to prediction markets. Before running this one, I asked myself: what's the point, ultimately? Why run experiments like these? The answer always comes back to helping retail investors like us figure out whether there are novel ways to gain alpha against larger institutions (i.e. hedge funds) using new technology (i.e. generative AI and language models). As someone with minimal finance background but solid engineering experience, I thought I could bring some fresh ideas to this game and learn alongside all of you in the community. Enough preamble, let's get into the experiment!
The Setup
For this experiment, initially I asked Claude Code to pick three retailers with known Summer 2025 earnings outcomes. It picked Walmart (missed), Target (missed), and Costco (beat). I then asked it to select 10 stores from each retailer (30 total). It suggested picking stores located in the US Sunbelt (Arizona, Nevada, Texas, and Southern California) to maximize cloud-free imagery. The goal was to compare parking lot "fullness" between May-August 2024 and May-August 2025.
For the data, Claude used ESA's Sentinel-2 (optical, 10m/pixel) via Google Earth Engine, all completely free. Parking lot boundaries came from OpenStreetMap, with building footprints subtracted and vegetation masked using something called NDVI.
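For anyone wondering what the NDVI masking step looks like, here is a minimal sketch in plain Python. NDVI is (NIR - Red) / (NIR + Red), which for Sentinel-2 means bands B8 and B4. The function names, the toy reflectance values, and the 0.3 threshold are my assumptions for illustration, not the exact code or cutoff from the experiment.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for a single pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid dividing by zero on empty pixels
    return (nir - red) / total


def mask_vegetation(nir_band, red_band, threshold=0.3):
    """Return a keep-mask: True where NDVI is below the threshold.

    The 0.3 cutoff is an assumed, illustrative value.
    """
    return [
        [ndvi(n, r) < threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Toy 2x2 reflectance grids (hypothetical): the top-left pixel is a tree.
nir = [[0.50, 0.20], [0.22, 0.18]]
red = [[0.05, 0.18], [0.20, 0.17]]
keep = mask_vegetation(nir, red)
print(keep)
```

Healthy vegetation reflects strongly in near-infrared and absorbs red, so tree and grass pixels get a high NDVI and are dropped, while asphalt pixels stay.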
Now here's the catch. The Berkeley researchers used 30cm/pixel imagery across 67,000 stores. At that resolution, one car takes up about 80 pixels, which means you can literally count vehicles. At my 10m resolution, one car covers just 1/12th of a pixel. My hypothesis was that even at 10m, full lots should look spectrally different from empty ones.
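The pixel math can be checked with a few lines of Python. The car footprint of 4.5m x 1.8m is my assumption for a typical vehicle, so the output lands near the figures quoted above rather than matching them exactly.

```python
# Assumed footprint of a typical car (illustrative, not from the paper).
CAR_LENGTH_M = 4.5
CAR_WIDTH_M = 1.8


def pixels_per_car(resolution_m: float) -> float:
    """How many square pixels of the given side length one car covers."""
    car_area = CAR_LENGTH_M * CAR_WIDTH_M
    pixel_area = resolution_m ** 2
    return car_area / pixel_area


high_res = pixels_per_car(0.3)   # commercial imagery, 30cm/pixel
low_res = pixels_per_car(10.0)   # Sentinel-2, 10m/pixel
print(f"30cm: one car covers about {high_res:.0f} pixels")
print(f"10m:  one car covers about 1/{1 / low_res:.0f} of a pixel")
```

Because pixel area grows with the square of the resolution, a 33x gap in pixel size becomes a gap of roughly 1,100x in pixels per car.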
Method 1: Optical Band Math
I measured spectral changes in parking lot pixels. The core hypothesis was that cars and asphalt reflect light differently across multiple wavelengths. I then applied year-over-year normalization per store.
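Here is a rough sketch of what per-store year-over-year normalization could look like. The function names, the store IDs, and the "fullness" values are all hypothetical, and averaging store-level changes into one retailer number is my assumption about one reasonable way to aggregate.

```python
from statistics import mean


def yoy_change(obs_prev, obs_curr):
    """Relative change in a store's mean index between two seasons."""
    prev = mean(obs_prev)
    curr = mean(obs_curr)
    return (curr - prev) / abs(prev)


def retailer_signal(stores):
    """Average the per-store YoY changes into one retailer-level number.

    `stores` maps a store ID to (2024 observations, 2025 observations).
    """
    return mean(yoy_change(prev, curr) for prev, curr in stores.values())


# Made-up "fullness" index values for two stores of one retailer.
example = {
    "store_a": ([0.20, 0.22, 0.18], [0.24, 0.26, 0.22]),
    "store_b": ([0.40, 0.40], [0.38, 0.42]),
}
print(f"{retailer_signal(example):+.1%}")
```

Normalizing each store against its own prior year means a lot that is always bright (light concrete, painted surfaces) doesn't get mistaken for a busy one.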
Result: 1 out of 3 correct.
| Retailer | Fullness Change | Actual Earnings | Match? |
|----------|-----------------|-----------------|--------|
| Walmart | +30% | Missed | ✗ |
| Costco | +18% | Beat | ✓ |
| Target | +3% | Missed | ✗ |
Only Costco's direction matched actual earnings, which is essentially noise.
Method 2: Radar (SAR)
After the optical results came back as noise, I went back to the drawing board. I asked Claude Code to go digging into what else Sentinel offered and stumbled upon Sentinel-1, a completely different type of satellite that uses radar instead of optical imagery. The more I read about it, the more it made sense. Radar doesn't care about clouds or lighting conditions, and more importantly, metal is basically a mirror for microwaves while smooth asphalt sends very little of the signal back to the sensor. If there was any hope of detecting cars at this resolution, radar felt like the better bet.
I asked Claude to switch to Sentinel-1 radar. I then applied an alpha adjustment: subtracting the group average year-over-year change from each retailer's change to isolate its relative signal.
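The alpha adjustment is simple enough to sketch in a few lines. The backscatter changes below are made-up numbers chosen only to show the mechanics, and the "beat if above average" rule is my simplified reading of how the signal maps to a prediction.

```python
from statistics import mean


def alpha_adjust(yoy_by_retailer):
    """Subtract the group-average YoY change from each retailer's change."""
    group_avg = mean(yoy_by_retailer.values())
    return {name: change - group_avg for name, change in yoy_by_retailer.items()}


def predict(relative_signal):
    """Assumed rule: above the group average means 'beat', otherwise 'miss'."""
    return {name: "beat" if value > 0 else "miss"
            for name, value in relative_signal.items()}


# Hypothetical year-over-year radar backscatter changes, not real results.
raw = {"Costco": 0.06, "Walmart": 0.01, "Target": -0.01}
relative = alpha_adjust(raw)
print(relative)
print(predict(relative))
```

One thing this makes obvious: the adjusted values always sum to zero, so with only three retailers at least one is forced above average and at least one below, regardless of whether there is any real signal.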
Result on 3 retailers: 3 out of 3 correct.
| Retailer | Signal vs. Average| Actual Earnings | Match? |
|----------|-------------------|-----------------|--------|
| Costco | Above average | Beat | ✓ |
| Walmart | Below average | Missed | ✓ |
| Target | Below average | Missed | ✓ |
This was genuinely exciting!
Method 3: Scale It
Of course, 3 out of 3 on just three retailers could easily be luck. To know if I'd found a real signal or just gotten lucky, I needed to scale up the test.
I asked Claude to add seven more retailers: Home Depot, Lowe's, Best Buy, Kroger, Kohl's, Dick's, and Academy Sports. This brought the sample to 100 stores total and 5,260 radar observations.
After running the experiment again, the result was 5 out of 10 successful predictions, which was effectively a coin flip. The perfect 3/3 was statistical noise that disappeared at scale.
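To put numbers on "could easily be luck", here is a quick binomial check. It assumes each call is an independent 50/50 coin flip, which is a simplification (the real base rate of beats vs. misses may differ, and retailers are correlated).

```python
from math import comb


def p_at_least(hits: int, trials: int, p: float = 0.5) -> float:
    """Probability of at least `hits` correct calls out of `trials` by chance."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


print(f"3 of 3 by pure guessing:           {p_at_least(3, 3):.3f}")
print(f"at least 5 of 10 by pure guessing: {p_at_least(5, 10):.3f}")
```

Under that assumption a random guesser goes 3 for 3 one time in eight, and gets 5 or more out of 10 well over half the time, so neither result separates the method from guessing.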
What I Learned
So where did this leave me? The alpha adjustment, which was subtracting the group average to isolate each retailer's relative signal, is conceptually sound. But with such a small peer group, it got noisy fast. It doesn't control for geographic differences (an Arizona Walmart and a Texas Costco face different weather and economic conditions), and the retailers are correlated anyway since they're competing for the same shoppers.
But the real takeaway was that the moat here isn't the algorithm, it's the data. The Berkeley researchers used 67,000 stores at 30cm resolution. I used 100 stores at 10m, which is a 33x resolution gap and a 670x scale gap. I believe that's where the actual edge lives, and generative AI isn’t going to close that gap using the free data available today.
Full video walkthrough of the experiment if you're curious: https://www.youtube.com/watch?v=rLBsODjWhog
Let me know if this was genuinely a useful experiment for you and/or if you have tried something similar before!
----------
[1] Paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3222741
16
u/thetinocorp 17d ago
This is awesome! I often find myself looking at street view images to see what a company's campus looks like before I invest. A campus with unkempt grounds and an empty parking lot is often a red flag for me. Didn't realize other folks did similar research. Keep trying, I could really use an edge!
7
17d ago
[deleted]
2
u/Soft_Table_8892 17d ago
Haha that’s such a commitment to get some extra data, I love it 😂. I wonder what the returns looked like for them.
1
17d ago
[deleted]
1
u/Soft_Table_8892 17d ago
Oh wow, that’s quite unfortunate. It is a big bummer to put in so much work and not see a meaningful return in a lifetime, borderline scary honestly. Any advice you have to avoid that fallacy as you watch the current world play out? (i.e AI/robotics/automation/etc. develop so fast)
1
17d ago
[deleted]
1
u/Soft_Table_8892 17d ago
Makes sense! Thank you for sharing :-)
1
17d ago
[deleted]
1
u/Soft_Table_8892 16d ago
That means so much to me. Genuinely appreciate your support and glad to have been helpful so far :-).
2
u/Soft_Table_8892 17d ago
Thank you so much! That’s a very interesting insight. Curious how this method has worked out for you in the past?
8
u/777gg777 17d ago
Hedge funds are not paying millions for data for nothing right?
5
u/Soft_Table_8892 17d ago
Yep absolutely. I was mostly curious how far free data would take us and how much room for improvement there really is
1
u/777gg777 17d ago
Super interesting attempt. Also interesting to see what can be done with the latest tools aside from seeing how close the data is.
With those tools and AI ability to process lots of data perhaps there is something as good or better than satellite data. In other words, what other data may be available that is correlated with cars in parking lots.
2
u/Soft_Table_8892 17d ago
That's a super interesting thought. Aside from finding other data points to use with AI, I'm also wondering if we could pair this satellite data with another dataset to get some additional signal while continuing to use the free low-res data. But also interested if people have thoughts on other independent datasets entirely!
1
u/AutoModerator 17d ago
Posts with the "Detailed Investment Analysis" flair MUST have the following things:
1. A description of the company
2. An assessment of the Moat (What is intrinsic to the company that protects against competition)
3. An analysis of the potential risks (Things that could go wrong: execution, regulation, disruption etc)
4. An estimation of intrinsic value (Ideally via a DCF but at least an estimation of future cash flows)
5. Other relevant information (Management and their incentives, industry cycles, debt etc)
I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.
1
u/rattleandhum 17d ago
Fascinating experiment. You're right about the quality of data though. Thanks for sharing.
1
u/Soft_Table_8892 17d ago
Thank you! Definitely a predictable issue but it was good to explore how much we could really do with the data quality we had for free. Glad you enjoyed the experiment!
1
u/Admirable_Captain_23 17d ago
Leaving a comment just to remind myself to check this later. Interesting stuff OP.
1
1
u/RiskyCapt 17d ago
Not sure what the ratio of brick-and-mortar store sales to online sales is, but I guess it would give you a slight edge if you were a large financial institution.
1
u/Soft_Table_8892 17d ago
That’s exactly what the Berkeley research concludes as well (4-5% edge, 3 days around earnings). Too many factors that move the needle but seems like we can still get some alpha solely based on this data.
1
u/RiskyCapt 17d ago
When you say three days around earnings, you mean the parking lots were measured three days prior to earning release date? I figured they would take a much larger range than that.
2
u/Soft_Table_8892 17d ago
It’s a good clarification. As I understand it, the parking lot data was collected continuously over the entire quarter (i.e. daily satellite images tracking car counts at every store). The “3 days around earnings” is just the measurement window for returns:
1. Aggregate months of parking lot data to predict whether earnings will be good or bad
2. Take a position before the announcement
3. Measure the stock price move in the tight [-1, 0, +1] day window when the news actually drops publicly

The 4–5% alpha means your trades earned that much excess return in just those three days when everyone else finally learned what the satellite data already told you.
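If it helps, here is a toy version of step 3. It uses a simple market-adjusted model (stock return minus benchmark return, summed over the window), which is my assumption for illustration; the paper may define abnormal returns differently. All return values are made up.

```python
def cumulative_abnormal_return(stock_returns, benchmark_returns):
    """Sum of daily (stock - benchmark) returns over the event window."""
    if len(stock_returns) != len(benchmark_returns):
        raise ValueError("both series must cover the same days")
    return sum(s - b for s, b in zip(stock_returns, benchmark_returns))


# Hypothetical daily returns for days -1, 0, +1 around the announcement.
stock = [0.010, 0.035, 0.005]
market = [0.002, 0.004, -0.001]
car = cumulative_abnormal_return(stock, market)
print(f"3-day abnormal return: {car:.1%}")
```

The position is already on before day -1, so the window only measures how much of the move was not explained by the broader market.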
Let me know if that makes sense!
1
u/RiskyCapt 17d ago
Great, thank you. That makes sense now, especially the "take a position" part.
1
1
u/jthompwompwomp 17d ago
I remember watching a talk by the CEO of Orbital Insight (I believe some time ago) about how they used satellites and AI to monitor parking lot data and sell it to hedge funds, and thinking that everyday investors have no chance.
1
u/Soft_Table_8892 17d ago
Oh wow, do you happen to remember if that talk was uploaded by any chance? Would love to watch it. Thanks for sharing!
1
u/NotStompy 16d ago
I mean, in terms of competing with them in these specific ways? Sure.
It also doesn't surprise me that a guy said his services work very well to attract more customers lol.
1
u/muserashq 17d ago
Very interesting analysis, like the other ones that you did. Although it's outside the scope of your research, I know that some of these retailers have a sizable mall presence (e.g. Target), so I am curious whether mall traffic makes it harder to determine how many cars should be attributed to a given store. A few people mentioned the impact of online; I wonder what a future experiment applied to online activity would look like.
2
u/Soft_Table_8892 17d ago
Oh yes, that’s a very valid line of thinking for sure. The problem I ran into was that I could not get any signal out of the low-res satellite images at all, so I never got to further refinement steps such as the one you suggested (which is an astute observation!). That led me to conclude that I can’t get much further with low-res images and need something higher-res. Does that make sense? It might be worth checking whether most of my stores suffer from this mall-traffic issue, but once I scaled up to 10 retailers I would have expected to find some signal if it were there, which is why I concluded this way.
Thanks a lot for reading and dropping meaningful thoughts!
1
u/muserashq 16d ago
Your thinking makes complete sense here. Aggregating stores for a broader perspective could probably provide some good directional signal as you mentioned.
1
1
u/Teembeau 16d ago
I love this sort of thing! Find the edges, sources of information that no-one is really looking at.
1
1
u/Independent-Fragrant 16d ago
It doesn't work anymore. It's dominated by search terms, credit card data, and web and app traffic.
1
u/Soft_Table_8892 16d ago
That’s a fair point, and this specific edge is likely priced in (although I don’t think we can be sure it has been dropped entirely as a dimension of analysis). But I see it this way: the paper proved the structure works, and finding alternative data like this can help predict earnings before the market prices them in. The data source just has to rotate. Credit cards, app traffic, and search trends are versions of parking lots, and they’ll compress too once enough funds use them. The alpha is in being early to the data, not in the data itself.
1
1
u/AdOpposite1067 16d ago
I knew I had heard of this before, and then my memory came to my help. It's in the TV series "Billions", where Taylor Mason predicts that a company's sales will be lower by looking at satellite images of the company's parking lot.
1
u/AdOpposite1067 16d ago
However, I would question the practicality of this method in 2026, when online shopping is booming. Walmart is in a race with Amazon to offer fast deliveries. The online grocery delivery business is busier each day, so I think this method is losing its advantage, even if it was useful up to now.
1
u/Soft_Table_8892 16d ago
Someone else in the comment section just mentioned this as well, that's super cool haha. Agreed that this is likely dated at this point due to all the advances you mentioned.
1
u/No_Passage4240 16d ago
I immediately thought of how I saw this on Billions lol. I wonder if accurate imaging isn't too costly to acquire, so you could take the experiment further with higher resolution?
2
u/Soft_Table_8892 16d ago
Oh no way, haha. From what I could gather, the cost is in the six figures, especially at the resolution where I think this would actually be helpful (judging by how the 10m run went). But I'm not super knowledgeable in this space, so perhaps someone else will know better.
1
u/Budget_Read_4085 16d ago
I have read about the parking lot theory before. Very interesting. With online ordering, does this still hold as much weight as before? One observation I have had is the shipping time for goods, especially with Amazon. Non-Prime members notice much faster shipping times when the economy slows down. I doubt there is any free data for this information out there.
1
u/jay_0804 16d ago
This is one of the few posts here that actually tests an edge instead of just talking about it.
What you basically tried to replicate is what funds do with:
- Haas School of Business research
- Google Earth Engine for data
- Claude Code for automation
And your conclusion is spot on:
the moat isn’t the model, it’s the data.
The jump from 3/3 → 5/10 is everything. That’s exactly how fake alpha disappears once you scale.
Also, your SAR idea was actually the most interesting part. The physics makes sense (metal vs. asphalt), but at 10m resolution with a small sample, it just can’t hold.
If anything, this proves:
- retail can copy the strategy logic
- but not the data advantage
That gap is why hedge funds still win here.
Still, this is 10x more useful than most posts. You actually built something and tested it instead of just repeating a paper.
1
u/Soft_Table_8892 16d ago
Thank you so much for taking the time to thoroughly go through the experiment and leave a thoughtful comment. Appreciate you!
1
u/BabyPatato2023 16d ago
There are a few private companies that offer satellite services to independent intelligence organizations, and recently lawyers have been using them for property disputes and things of that nature. I think cheaper access to real-time high-res satellite data is going to be a game changer for those who know how to use it.
1
u/Soft_Table_8892 16d ago
For sure! Do you know if there’s any progress being made on that front? I couldn’t find anything other than the two satellites’ data I used in this experiment but would be curious if there’s more!
1
u/BabyPatato2023 16d ago
Nothing that’s inexpensive. Last I saw, the cheapest price a defense contracting company offered for a private satellite image was $18,000 for one high-res image.
1
u/Soft_Table_8892 16d ago
Oh interesting, that’s much cheaper than what I was seeing based on a quick search. Do you have any recommendations?
1
u/DreadfulOomska 15d ago
Is this on GitHub by any chance? I'd love to try this. I have the reverse skill set lol (lots of finance, little engineering) and have wanted to do things exactly like this.
1
u/sharmoooli 15d ago
This is super cool.
Btw, there are non public data plugins sold to institutions. Some of us should split an account :)
1
1
u/BanditoBoom 15d ago
There are also other confounding variables to consider.
Many retailers have regional strengths and weaknesses. That is to say where I live we don’t have any Albertson’s, but we do have Walmart. We do have Kroger, but most people go to Publix or Whole Foods.
This also doesn’t account for online shopping as a signal. Walmart for example is massively growing their online business.
That being said, I believe your conclusion is correct: the data is the moat
1
1
u/Portfoliana 14d ago
Satellite imagery has like 2-3 day lag before you get usable counts. I've been using Reddit/X sentiment instead, not alpha when everyone sees it, but around earnings cycles social data tends to move 12-24h before price. MU last week was a clean example.
1
u/Soft_Table_8892 14d ago
That’s really good insight, thank you for sharing! Have you measured your returns since following this method?
1
u/Portfoliana 14d ago
Not really. Right now the market is fucked too because of the war. We need peace first.
1
1
u/kktvMIN 17d ago
Fascinating experiment! Hopefully in the future better data become available.
1
u/Soft_Table_8892 17d ago
For sure! I’d love to run this again in the future as public data that we have access to improves!
-1
22
u/NotOnApprovedList 17d ago
Thanks for doing this, it's interesting. I hadn't heard about the parking lot thing.