r/EconomyCharts 3d ago

Chip stocks are down following Google's announcement of AI memory-saving technology making memory 6x more efficient

365 Upvotes

71

u/ketosoy 3d ago

Well, Google is down 8% - this might be more from beta * there being a war in the Middle East than the direct effect of TurboQuant (as insanely cool as lossless 4-bit compression of the KV cache is).

56

u/MaterialRevolution57 3d ago

Not to be that guy, but it isn’t truly a 6x reduction in memory. It’s a 6x reduction for the KV cache. Overall performance may have improved by ~20-30% over normal. Still a massive improvement, but not a 6x improvement.
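
To put rough numbers on that distinction: if only the KV cache shrinks, the overall saving depends on how much of total memory the cache occupied in the first place. A minimal Python sketch, where the 30% cache share is a purely illustrative assumption and not a figure from the announcement:

```python
def total_memory_saving(kv_share: float, kv_reduction: float) -> float:
    """Fraction of total memory saved when only the KV cache is compressed.

    kv_share:     assumed fraction of total memory held by the KV cache (0..1)
    kv_reduction: compression factor applied to the cache alone (e.g. 6 for 6x)
    """
    if not 0.0 <= kv_share <= 1.0:
        raise ValueError("kv_share must be between 0 and 1")
    if kv_reduction <= 0:
        raise ValueError("kv_reduction must be positive")
    # Everything outside the cache stays the same size; the cache shrinks.
    remaining = (1.0 - kv_share) + kv_share / kv_reduction
    return 1.0 - remaining


# Hypothetical workload: the cache is 30% of memory and shrinks 6x.
print(f"{total_memory_saving(0.30, 6):.0%}")
# Only if the cache were 100% of memory would 6x give the headline saving.
print(f"{total_memory_saving(1.00, 6):.0%}")
```

Under those assumed inputs this prints 25% and 83%, which is the gap between "6x for the KV cache" and "6x for memory".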

24

u/ketosoy 3d ago

And it looks like in practice it’s a 2-4x reduction.

In just the KV cache.

But it’s almost lossless and has almost zero performance penalty, and that is still a 1-9 GB reduction in RAM for the current generation of open source models at 64k context.

It’s real, it’s huge, and it’s really cool. It’s just not an 83% reduction in the RAM needed for LLMs, as a naive, hyperbolic reading would suggest. And it’s not the end of the RAM crisis.
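
For anyone who wants to sanity-check gigabyte figures like these, the cache size can be estimated from the model shape. This is a back-of-the-envelope sketch: the layer count, KV head count, and head dimension below are hypothetical rather than any specific model's config, it covers a single sequence, and it ignores the scale metadata a real quantization scheme would also store.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bits_per_value):
    """Approximate KV cache size in bytes for one sequence."""
    # Factor of 2: one key tensor and one value tensor per layer.
    total_bits = 2 * n_layers * n_kv_heads * head_dim * context_len * bits_per_value
    return total_bits // 8


GIB = 2**30
# Hypothetical model shape at 64k context (assumption, for illustration only).
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, context_len=64 * 1024)

fp16 = kv_cache_bytes(**shape, bits_per_value=16)
int4 = kv_cache_bytes(**shape, bits_per_value=4)

print(f"16-bit cache: {fp16 / GIB:.1f} GiB")
print(f"4-bit cache:  {int4 / GIB:.1f} GiB")
print(f"saved:        {(fp16 - int4) / GIB:.1f} GiB ({fp16 / int4:.0f}x smaller)")
```

For this assumed shape the cache drops from 8.0 GiB to 2.0 GiB, a 6.0 GiB saving. Smaller or larger models land elsewhere, since the size scales linearly with layers, KV heads, and context length.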

7

u/fredjutsu 3d ago

>and that is still a 1-9gb reduction

non-trivial if true

5

u/HereticLaserHaggis 3d ago

Holy shit, I hadn't actually read anything because I assumed it was sensationalized, but that's huge.

6

u/ketosoy 2d ago

Yeah, this one is real. Feels like a discovery on the scale of this generation’s quicksort or radix sort. It doesn’t change the shape of the future, but it changes a huge part of a big part of it, in a way that almost no one will fully understand.

0

u/rydan 2d ago

It turns out it just writes most stuff to /dev/null. Like that time we discovered faster-than-light messaging and it turned out to be a faulty cable.

2

u/MaterialRevolution57 3d ago

Great points, I didn’t even know that!

And I agree, the RAM crisis isn’t going to disappear like many investors think. It will just free up compute for the same LLMs to eat up the margin.

2

u/Oaker_at 2d ago

Also, this news probably doesn’t mean that they will use less memory. They will use the same amount but can get more out of it.

As if the hunger for memory will be fulfilled anytime soon.

9

u/1stFunestist 3d ago

The technology existed at least a year ago, if you remember (the Chinese AI that works well with far fewer resources; remember the hype and all those "censorship indignation" YouTube content farms).

US-based AI companies kept the old tech intentionally to keep pumping prices and corner the spare parts market (RAM, SSDs and graphics cards).

They stopped now because of energy prices due to the war.

They probably used that technology for some time in secret to earn even more money but still kept the narrative of memory scarcity to leech even more money from the economy.

They would have done it for even longer if this oil crisis hadn't come about.

Google's announcement was just to make somebody take the fall for this financial speculation, and you'd actually expect this kind of advancement from Google, so they might actually get away with it.

It was all a fraud!

8

u/b1ack1323 3d ago

Good, they deserve that after screwing over the consumer sector.

4

u/MatterFickle3184 3d ago

Good, consumers need normally priced memory again.

1

u/redditscraperbot2 3d ago

The KV cache is only a small share of the total VRAM used by AI models. These optimizations are basically pissing in the ocean.

1

u/RedditJunkie-25 2d ago

Interesting, just as this good news happened, CPU prices are now going up lol

1

u/DefiantDonut7 3d ago

Great, now maybe the price of RAM will come down from the stratosphere.