Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares


in reply to xiao yun

This is almost too perfect, but they'll learn nothing. Rather than make concessions to get rid of the ridiculous capitalistic & oligarchic system and promote rapid innovation through community & open source, they'll just demand that the government give them even more trillions in taxpayer money so they can compete with an open-source model.
in reply to UnderpantsWeevil

Wow, ok, you really don't know what you're talking about, huh?

No, I don't have thousands of almost top-of-the-line graphics cards to retrain an LLM from scratch, nor the millions of dollars to pay for electricity.

I'm sure someone will, and I'm glad this has been open-sourced; it's a great boon. But that's still no excuse to sweep under the rug the blatant censorship of topics the CCP doesn't want talked about.

in reply to Womble

No, I don't have thousands of almost top-of-the-line graphics cards to retrain an LLM from scratch


Fortunately, you don't need thousands of top-of-the-line cards to train the DeepSeek model. That's the innovation people are excited about. The model improves on the original LLM design to reduce the time to train and the time to retrieve information.

Contrary to common belief, an LLM isn't just a fancy Wikipedia. It's a schema for building out a graph of individual pieces of data, attached to a translation tool that turns human-language inputs into graph-search parameters. If you put facts about Tiananmen Square in 1989 into the model, you'll get them back as results through the front-end.

You don't need to be scared of technology just because the team that introduced the original training data didn't configure this piece of open-source software the way you like it.

that's still no excuse to sweep under the rug the blatant censorship of topics the CCP doesn't want talked about.
Wow, ok, you really don't know what you're talking about, huh?


in reply to UnderpantsWeevil

analyticsvidhya.com/blog/2024/…

Huh, I guess 6 million USD is not millions, eh? The innovation is that it's comparatively cheap to train compared to the billions OpenAI et al. are spending (and that's with the cost of acquiring thousands of H800s not included).

Edit: just realised that was for the wrong model! But R1 was trained on the same budget:
https://x.com/GavinSBaker/status/1883891311473782995?mx=2
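
(For reference, that headline figure is basically just arithmetic on the reported final training run - as I understand it, roughly 2.8 million H800 GPU-hours at about $2/hour, and it excludes the hardware itself and all the experiments that led up to it. Both numbers below are assumptions taken from that public reporting.)

# Back-of-envelope on the reported training cost (final run only)
gpu_hours = 2.8e6        # H800 GPU-hours reported for the final training run (assumed)
usd_per_gpu_hour = 2.0   # assumed rental-equivalent price per GPU-hour
print(f"~${gpu_hours * usd_per_gpu_hour / 1e6:.1f} million")  # ~$5.6 million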

in reply to Womble

The innovation is that it's comparatively cheap to train compared to the billions


Smaller builds with less comprehensive datasets take less time and money. Again, this doesn't have to be encyclopedic. You can train your model entirely on a small sample of material detailing historical events in and around Beijing in 1989 if you are exclusively fixated on getting results back about Tiananmen Square.
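
If you actually wanted to try that at home, a minimal sketch might look something like the following (this assumes the Hugging Face transformers and datasets libraries, a small GPT-2-sized base model, and a hypothetical beijing_1989.txt corpus you supply yourself - it's the general idea of training on a narrow corpus, not DeepSeek's actual recipe).

# Minimal sketch: fine-tune a small causal LM on your own narrow text corpus
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# beijing_1989.txt is a hypothetical plain-text file of your own material
dataset = load_dataset("text", data_files={"train": "beijing_1989.txt"})
tokenized = dataset.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512),
                        batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=2),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()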

in reply to Dhs92

No, it's not a feature of Ollama; that's the innovation of the "chain of thought" models like OpenAI's o1 and now this DeepSeek model: it narrates an internal dialogue first in order to try to produce more consistent answers. It isn't perfect, but it helps it do things like logical reasoning, at the cost of taking a lot longer to get to the answer.
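
If anyone's curious what that looks like locally, here's a rough sketch against Ollama's HTTP API (it assumes Ollama is running on its default port and that you've pulled one of the deepseek-r1 tags - the internal dialogue comes back wrapped in <think> tags ahead of the final answer).

# Sketch: query a locally pulled chain-of-thought model through Ollama's HTTP API
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "deepseek-r1:14b",  # assumed tag; use whichever size you pulled
          "prompt": "How many times does the letter r appear in the word strawberry?",
          "stream": False},
)
# The reasoning monologue appears between <think>...</think>, followed by the answer
print(resp.json()["response"])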
in reply to JasSmith

No, that was me running the model on my own machine, not using DeepSeek's hosted one. What they were doing was justifying blatant political censorship by saying anyone could spend millions of dollars themselves to follow their method and make their own model.

You'll notice how they stopped replying to my posts and started replying to others once it became untenable to pretend it wasn't censorship.

in reply to JasSmith

But yeah. Anyone who thinks the app / stock model isn't going to be heavily censored...

Why else would it be free? It's absolutely state-sponsored media. Or it's a singularity and they're just trying to get people to run it from within their networks, the former being far more plausible.

And I know, I know, the latter isn't how LLMs work. But would we really know when it isn't?

in reply to Bronzebeard

So the idea with this comment:

The number of people repeating “I bet it won’t tell you about Tiananmen Square” jokes around this news has - imho - neatly explained why the US tech sector is absolutely fucked going into the next generation.


is that people have misplaced their concern, not at the fact that it's censored but that the US has lost the technology high ground and won't get it back for at least a generation?

in reply to Womble

Try an abliterated version of the Qwen 14B or 32B R1 distills. I just tried it out; they will give you a real overview.

Still, even when abliterated it's just not very knowledgeable about "harmful information". If you want a truly uncensored model, hit up Mistral Small 22B and its even more uncensored fine-tune, Beepo 22B.
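
Quickest way to see the difference is to fire the same question at each local model and compare the answers - a rough sketch below, again through Ollama's API; the model tags are placeholders for whichever abliterated/uncensored builds you've actually pulled.

# Sketch: send one prompt to several locally pulled models and compare the output
import requests

models = ["deepseek-r1:14b", "your-abliterated-qwen-distill", "your-mistral-small-tag"]  # placeholders
prompt = "Give me an overview of what happened in Tiananmen Square in 1989."

for tag in models:
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": tag, "prompt": prompt, "stream": False})
    print(f"--- {tag} ---\n{r.json()['response']}\n")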

in reply to Womble

It's even worse/funnier in the app: it will generate the response, then once it realizes it's about Taiwan it will delete the whole response and say sorry, I can't do that.

If you ask it "what is the Republic of China" it will generate a couple of paragraphs on the history of China, then it'll get a couple of sentences in about the retreat to Taiwan and then stop and delete the response.

in reply to xiao yun

Reposting from a similar post, but... I went over to Hugging Face and took a look at this.

DeepSeek is huge. Like Llama 3.3 huge. I haven't done any benchmarking, which I'm guessing is out there, but it would surely take as much Nvidia muscle to run this at scale as ChatGPT, even if it was much, much cheaper to train, right?

So is the rout based on the idea that the need for training hardware is much smaller than suspected even if the operating cost is the same... or is the stock market just clueless and dumb and they're all running on vibes at all times anyway?

in reply to MudMan

Clueless dumb vibes, yeah. But exaggerated by the media for clicks, too - Nvidia's price is currently the same as it was in Sept 2024. Not really a huge deal.

Anyway, the more efficient it is, the more usage there will be, and in the long run more GPUs will be needed - greenchoices.org/news/blog-pos…

in reply to Rimu

Sure, 15% isn't the worst adjustment we've seen in a tech company by a long shot, even if the absolute magnitude of that loss is absolutely ridiculous because Nvidia is worth all the money, apparently.

But everybody is acting like this is a seismic shock, which is fascinatingly bizarre to me. It seems the normie-investor axis really believed that forcing Nvidia to sell China marginally slower hardware was going to cripple their ability to make chatbots permanently, which I feel everybody had called out as being extremely not the case even before these guys came up with a workaround for some of the technical limitations.

in reply to MudMan

It comes in different versions, some of which are enormous and some that are pretty modest. The thing is, they are not competing with 4o but with o1, which has really massive resource requirements. The big news this week was supposed to be o3 opening up to the public, which involved another huge resource jump and set a clear trajectory for the future of AI being on an exponential curve of computing power. This was great news for the companies that made the parts and that could afford the massive buildout to support future developments. DeepSeek isn't so much disruptive for its own capabilities; it's disruptive because it challenges the necessity of this massive buildout.
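
For a sense of scale across those versions, here's a rough back-of-envelope on what it takes just to hold the weights in memory (parameter counts are approximate, and this ignores the KV cache and everything else a real deployment needs).

# Rough VRAM for the weights alone: parameters x bytes per parameter
def weight_gb(params_billion, bytes_per_param):
    # 1e9 params per billion and 1e9 bytes per GB cancel out
    return params_billion * bytes_per_param

for name, b in [("Full R1 (~671B MoE)", 671),
                ("Llama 3.3 70B", 70),
                ("R1 Qwen-14B distill", 14)]:
    print(f"{name}: ~{weight_gb(b, 2):.0f} GB at fp16, ~{weight_gb(b, 0.5):.0f} GB at 4-bit")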
in reply to jrs100000

I suppose the real change is to the assumption that we were going to have to go all paperclip-maximizer on Nvidia GPUs forever, and on that front, yeah, Nvidia would have become marginally less the owner of a planet made from GPUs all the way through.

They're still the owner of a planet made of rock where people run AIs on GPUs, though. Which I guess is worth like 15% less or whatever.


in reply to DannyBoy

Check the P/E on a lot of these firms. So much of the valuation is predicated on enormous rates of future growth. Their revenue isn't keeping up with their valuation. A big chunk of that 2000% is just people trading on the Greater Fool who will buy the shares later.

Microsoft will be fine, sure. Meta will be fine, sure. The guy leveraged to the tits to go long on ARK Innovation ETF? Far more dubious.
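
A rough illustration of what a stretched multiple implies (made-up numbers, not any particular firm's):

# If a stock trades at 50x earnings and a "normal" multiple is 20x,
# earnings have to grow 2.5x just for the price to stand still.
import math

hot_pe, normal_pe, growth_rate = 50, 20, 0.15  # all assumed, purely for illustration
required_growth = hot_pe / normal_pe
years = math.log(required_growth) / math.log(1 + growth_rate)
print(f"earnings must grow {required_growth:.1f}x; at {growth_rate:.0%}/yr that takes ~{years:.0f} years")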

in reply to reallykindasorta

Software. I only know the current AI environment at a surface level, and I have friends saying to buy Nvidia, but I was wondering when there would be improvements to the software.

Example: we used to need a pretty top-notch PC to play Doom, but now we can emulate the hardware and run it on a pregnancy test.

in reply to Pistcow

This assumes some kind of eureka innovation, right? A 96% reduction in compute demands per "token" is revolutionary. I haven't seen anyone yet explain what that innovation is, exactly. There is also mixed reporting on how "open source" DeepSeek is, with many claiming it's only "open weight," meaning people are having difficulty reproducing the creation of the model. It wouldn't be the first time that a claim out of China was false, and I think it wise to reproduce any such claims before running around with our arms in the air.
in reply to Bronzebeard

Word definitions are exactly the sorts of things one would expect an LLM to pick up on.

ChatGPT:

A "tankie" is a term often used to describe a person who is an ardent supporter of authoritarian regimes, particularly those that claim to be socialist or communist. The term originally referred to members of communist parties who supported the Soviet Union's use of tanks to suppress uprisings, like the 1956 Hungarian Revolution and the 1968 Prague Spring. Over time, it's been used more broadly to refer to people who justify or defend the actions of regimes like the Soviet Union, China under Mao, or North Korea, often in the face of human rights abuses or authoritarian policies.

The label can sometimes be used pejoratively to imply that someone is uncritical of authoritarianism or blindly supportive of these regimes due to ideological alignment, even when those regimes engage in actions that contradict the values they claim to uphold.
