Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares
US chip-maker Nvidia led a rout in tech stocks Monday after the emergence of a low-cost Chinese generative AI model that could threaten US dominance in the fast-growing industry.
The chatbot developed by DeepSeek, a startup based in the eastern Chinese city of Hangzhou, has apparently shown the ability to match the capacity of US AI pace-setters for a fraction of the investments made by American companies.
Shares in Nvidia, whose semiconductors power the AI industry, fell more than 15 percent in midday deals on Wall Street, erasing more than $500 billion of its market value.
The tech-rich Nasdaq index fell more than three percent.
AI players Microsoft and Google parent Alphabet were firmly in the red while Meta bucked the trend to trade in the green.
DeepSeek, whose chatbot became the top-rated free application on Apple's US App Store, said it spent only $5.6 million developing its model -- peanuts when compared with the billions US tech giants have poured into AI.
US "tech dominance is being challenged by China," said Kathleen Brooks, research director at trading platform XTB.
"The focus is now on whether China can do it better, quicker and more cost effectively than the US, and if they could win the AI race," she said.
US venture capitalist Marc Andreessen has described DeepSeek's emergence as a "Sputnik moment" -- when the Soviet Union shocked Washington with its 1957 launch of a satellite into orbit.
As DeepSeek rattled markets, the startup on Monday said it was limiting the registration of new users due to "large-scale malicious attacks" on its services.
Meta and Microsoft are among the tech giants scheduled to report earnings later this week, offering an opportunity for comment on the emergence of the Chinese company.
Shares in another US chip-maker, Broadcom, fell 16 percent while Dutch firm ASML, which makes the machines used to build semiconductors, saw its stock tumble 6.7 percent.
"Investors have been forced to reconsider the outlook for capital expenditure and valuations given the threat of discount Chinese AI models," David Morrison, senior market analyst at Trade Nation.
"These appear to be as good, if not better, than US versions."
Wall Street's broad-based S&P 500 index shed 1.7 percent while the Dow was flat at midday.
In Europe, the Frankfurt and Paris stock exchanges closed in the red while London finished flat.
Asian stock markets mostly slid.
Just last week following his inauguration, Trump announced a $500 billion venture to build infrastructure for AI in the United States led by Japanese giant SoftBank and ChatGPT-maker OpenAI.
SoftBank tumbled more than eight percent in Tokyo on Monday while Japanese semiconductor firm Advantest was also down more than eight percent and Tokyo Electron off almost five percent.
John Richard
in reply to xiao • • •
小莱卡
in reply to John Richard • • •UnderpantsWeevil
in reply to xiao • • •
Womble
in reply to UnderpantsWeevil • • •
UnderpantsWeevil
in reply to Womble • • •Womble
in reply to UnderpantsWeevil • • •You don't use training data when running models; that's what is used in training them.
UnderpantsWeevil
in reply to Womble • • •Womble
in reply to UnderpantsWeevil • • •Wow ok, you really don't know what you're talking about, huh?
No, I don't have thousands of almost-top-of-the-line graphics cards to retrain an LLM from scratch, nor the millions of dollars to pay for electricity.
I'm sure someone will, and I'm glad this has been open sourced; it's a great boon. But that's still no excuse to sweep under the rug blatant censorship of topics the CCP doesn't want talked about.
UnderpantsWeevil
in reply to Womble • • •Fortunately, you don't need thousands of top of the line cards to train the DeepSeek model. That's the innovation people are excited about. The model improves on the original LLM design to reduce time to train and time to retrieve information.
Contrary to common belief, an LLM isn't just a fancy Wikipedia. It's a schema for building out a graph of individual pieces of data, attached to a translation tool that turns human-language inputs into graph-search parameters. If you put facts about Tiananmen Square in 1989 into the model, you'll get them back as results through the front-end.
You don't need to be scared of technology just because the team that introduced the original training data didn't configure this piece of open-source software the way you like it.
Womble
in reply to UnderpantsWeevil • • •analyticsvidhya.com/blog/2024/…
Huh I guess 6 million USD is not millions eh? The innovation is it's comparatively cheap to train, compared to the billions OpenAI et al are spending (and that is with acquiring thousands of H800s not included in the cost).
Edit: just realised that was for the wrong model! But R1 was trained on the same budget.
https://x.com/GavinSBaker/status/1883891311473782995?mx=2
DeepSeek V3: The $5.5M Trained Model Beats GPT-4o & Llama 3.1
Pankaj Singh (Analytics Vidhya)
UnderpantsWeevil
in reply to Womble • • •Smaller builds with less comprehensive datasets take less time and money. Again, this doesn't have to be encyclopedic. You can train your model entirely on a small sample of material detailing historical events in and around Beijing in 1989 if you are exclusively fixated on getting results back about Tiananmen Square.
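To be concrete about what "train on your own small corpus" would actually involve for an individual, here is a minimal sketch done as a fine-tune of a small open model with Hugging Face transformers/datasets. The base model name and the file path are placeholders I picked for illustration, not anything DeepSeek-specific.

```python
# Minimal causal-LM fine-tune on a small local text corpus.
# Assumes: pip install transformers datasets; model name and file path are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "Qwen/Qwen2.5-0.5B"           # any small base model you can fit locally
tok = AutoTokenizer.from_pretrained(base)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token    # padding token needed for batching
model = AutoModelForCausalLM.from_pretrained(base)

ds = load_dataset("text", data_files={"train": "my_corpus.txt"})  # your own material

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=512)

ds = ds.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tuned-model", num_train_epochs=1,
                           per_device_train_batch_size=2, logging_steps=10),
    train_dataset=ds["train"],
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
trainer.save_model("tuned-model")
```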
Womble
in reply to UnderpantsWeevil • • •
Rai
in reply to Womble • • •
Womble
in reply to UnderpantsWeevil • • •
Dhs92
in reply to Womble • • •Womble
in reply to Dhs92 • • •MrTolkinghoen
in reply to UnderpantsWeevil • • •JasSmith
in reply to MrTolkinghoen • • •Womble
in reply to JasSmith • • •No, that was me running the model on my own machine, not using DeepSeek's hosted one. What they were doing was justifying blatant political censorship by saying anyone could spend millions of dollars themselves to follow their method and make their own model.
You'll notice how they stopped replying to my posts and started replying to others once it became untenable to pretend it wasn't censorship.
MrTolkinghoen
in reply to JasSmith • • •But yeah. Anyone who thinks the app / stock model isn't going to be heavily censored...
Why else would it be free? It's absolutely state sponsored media. Or it's a singularity and they're just trying to get people to run it from within their networks, the former being far more plausible.
And I know, I know, the latter isn't how llms work. But would we really know when it isn't?
UnderpantsWeevil
in reply to MrTolkinghoen • • •MrTolkinghoen
in reply to UnderpantsWeevil • • •Bronzebeard
in reply to Womble • • •Womble
in reply to Bronzebeard • • •Bronzebeard
in reply to Womble • • •People caring more about "China bad" instead of looking at what the tech they made can actually do is the issue.
You needing this explicitly spelled out for you does not help the case.
ikt
in reply to Bronzebeard • • •ngl I'm still confused
It's AI, it does AI things, is it because China can now do the things we do (coding/development/search queries etc) that are just as good as America that it's a problem?
Bronzebeard
in reply to ikt • • •ikt
in reply to Bronzebeard • • •So the idea with this comment:
is that people have misplaced their concern, not at the fact that it's censored but that the US has lost the technology high ground and won't get it back for at least a generation?
ikt
in reply to Bronzebeard • • •I'm slow, what's the point? How do people joking about the fact that China is censoring output explain
小莱卡
in reply to ikt • • •SmokeyDope
in reply to Womble • • •Try an abliterated version of the qwen 14b or 32b R1 distills. I just tried it out; they will give you a real overview.
Still, even when abliterated it's just not very knowledgeable about "harmful information". If you want a truly uncensored model, hit up mistral small 22b and its even-more-uncensored fine-tune Beepo 22b.
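If you want to try one of these distills locally, a minimal sketch with llama-cpp-python looks roughly like this. The GGUF filename below is a placeholder; pick whatever quant of the abliterated distill fits your VRAM.

```python
# Rough sketch: run a local GGUF quant of an R1 distill with llama-cpp-python.
# Assumes: pip install llama-cpp-python and a downloaded .gguf file (name is a placeholder).
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-14B-abliterated-v2.Q4_K_M.gguf",
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload everything to the GPU if it fits
)

out = llm.create_chat_completion(
    messages=[{"role": "user",
               "content": "Give me an overview of the 1989 Tiananmen Square protests."}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```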
huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2 · Hugging Face
huggingface.co
Womble
in reply to SmokeyDope • • •SmokeyDope
in reply to Womble • • •mradermacher/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2-GGUF · Hugging Face
huggingface.co
Scolding7300
in reply to Womble • • •
Not_mikey
in reply to Womble • • •It's even worse / funnier in the app: it will generate the response, then once it realizes it's about Taiwan it will delete the whole response and say sorry, I can't do that.
If you ask it "what is the republic of china" it will generate a couple paragraphs of the history of China, then it'll get a couple sentences in about the retreat to Taiwan and then stop and delete the response.
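That delete-after-streaming behaviour is what you'd expect when moderation runs as a separate check on the hosted front-end rather than inside the model weights. A toy simulation of the general shape (entirely hypothetical, not DeepSeek's actual code):

```python
# Toy illustration of a front-end retracting an already-streamed answer.
# The keyword list and the "model" below are stand-ins, purely for illustration.
BLOCKED = ["taiwan", "tiananmen"]

def fake_model_stream(prompt):
    for token in f"Here is a detailed history relevant to {prompt} ...".split():
        yield token + " "

def answer(prompt):
    shown = ""
    for chunk in fake_model_stream(prompt):
        shown += chunk
        print(chunk, end="", flush=True)   # user sees text appear...
        if any(term in shown.lower() for term in BLOCKED):
            print("\n[response withdrawn] Sorry, that's beyond my current scope.")
            return                          # ...then the front-end pulls it back
    print()

answer("the Republic of China and Taiwan")
```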
Womble
in reply to Not_mikey • • •JasSmith
in reply to Womble • • •Womble
in reply to JasSmith • • •小莱卡
in reply to JasSmith • • •小莱卡
in reply to UnderpantsWeevil • • •MudMan
in reply to xiao • • •Reposting from a similar post, but... I went over to huggingface and took a look at this.
Deepseek is huge. Like Llama 3.3 huge. I haven't done any benchmarking, which I'm guessing is out there, but it surely would take as much Nvidia muscle to run this at scale as ChatGPT, even if it was much, much cheaper to train, right?
So is the rout based on the idea that the need for training hardware is much smaller than suspected even if the operation cost is the same... or is the stock market just clueless and dumb and they're all running on vibes at all times anyway?
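For a rough sense of scale, here is the back-of-the-envelope memory math, using the commonly cited figures of roughly 671B total / 37B active parameters for the full model; treat both numbers as assumptions.

```python
# Back-of-the-envelope VRAM needed just to hold the weights at different precisions.
# 671B total / 37B active-per-token are commonly cited figures, assumed here.
total_params = 671e9
active_params = 37e9   # MoE: only a fraction of experts fire per token

for label, bytes_per_param in [("FP16", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    print(f"{label}: full weights ~{total_params * bytes_per_param / 1e12:.1f} TB, "
          f"active experts ~{active_params * bytes_per_param / 1e9:.0f} GB")
```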
Rimu
in reply to MudMan • • •Clueless dumb vibes, yeah. But exaggerated by the media for clicks, too - Nvidia price is currently the same as it was in Sept 2024. Not really a huge deal.
Anyway, the more efficient it is the more usage there will be and in the longer run more GPUs will be needed - greenchoices.org/news/blog-pos…
The Jevons Paradox: When Efficiency Leads to Increased Consumption
Ross (Green Choices)
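A rough sketch of that Jevons effect with a constant-elasticity demand curve; the 10x efficiency gain and the elasticity are made-up placeholders, only meant to show the direction of the effect.

```python
# Jevons-paradox sketch: cheaper compute per token can raise *total* GPU demand.
# The 10x efficiency gain and elasticity of 1.2 are illustrative assumptions.
efficiency_gain = 10.0    # tokens per GPU-hour improves 10x
elasticity = 1.2          # % increase in demand per % drop in effective price

price_ratio = 1 / efficiency_gain                 # effective cost per token falls 10x
demand_ratio = price_ratio ** (-elasticity)       # constant-elasticity demand response
gpu_hours_ratio = demand_ratio / efficiency_gain  # hardware actually needed

print(f"tokens demanded: x{demand_ratio:.1f}")      # ~x15.8
print(f"GPU-hours needed: x{gpu_hours_ratio:.2f}")  # ~x1.58 -> more GPUs, not fewer
```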
MudMan
in reply to Rimu • • •Sure, 15% isn't the worst adjustment we've seen in a tech company by a long shot, even if the absolute magnitude of that loss is absolutely ridiculous because Nvidia is worth all the money, apparently.
But everybody is acting like this is a seismic shock, which is fascinatingly bizarre to me. It seems the normie-investor axis really believed that forcing Nvidia to sell China marginally slower hardware was going to cripple their ability to make chatbots permanently, which I feel everybody had called out as being extremely not the case even before these guys came up with a workaround for some of the technical limitations.
radiohead37
in reply to MudMan • • •like this
fistac0rpse and TVA like this.
MudMan
in reply to radiohead37 • • •But the cost per token would target Microsoft, Meta and Google way more than Nvidia. They still control the infrastructure; the software guys are the ones being undercut.
Not that I expect the token revenue was generating "unlimited money" anyway, but still.
jrs100000
in reply to MudMan • • •MudMan
in reply to jrs100000 • • •I suppose the real change is to the assumption that we were going to have to go all paperclip maximizer on Nvidia GPUs forever, and on that front, yeah, Nvidia would have become marginally less the owner of a planet made from GPUs all the way through.
They're still the owner of a planet made of rock where people run AIs on GPUs, though. Which I guess is worth like 15% less or whatever.
IHeartBadCode
in reply to MudMan • • •Two parts here.
As for the model.
This model is from China and trained there. They have an embargo on the best chips, they can't get them. So they aren't supposed to have the resources to produce what we're seeing with DeepSeek, and yet, here we are. So either someone has slipped them a shipment that's a big no-no OR we take it at face value here that they've found a way to optimize training.
The neat thing about science is reproducibility. So given the paper DeepSeek wrote and the open source nature of this, someone should be able to sit down and reproduce this in about two months (ish). If they can, nVidia is going to have a completely terrible time and the US is going to have to rethink the whole AI embargo.
Without deep diving into this model and what it spouts, the skinny is that nVidia has their top tier AI GPUs. These have all these parts cut into the silicon that make creating a model cost a lot less in kilowatts of power. DeepSeek says they were able to put in some optimizations that get you a model on low kilowatts by optimizing some of the parts found only in the top tier AI GPUs.
Blah blah, example of this: DeepSeek used 32 of the 132 streaming multiprocessors on their Hopper GPUs to act as a hardware accelerated communication manager and scheduler. Top tier nVidia cards for big farms do this in their hardware already, in a circuit called the DPU. Basically DeepSeek found a way to use their Hopper GPUs to do the same function as nVidia's DPUs.
If true, it means that the hardware nVidia is popping into their top tier isn't strictly required. It's nice, and you'll still get a model on fewer kilowatts than with the tricks DeepSeek is using, but DeepSeek's tricks mean the price difference between top tier and low tier needs to be a lot smaller than it is to stay competitive. As it stands, with DeepSeek's tricks (again, if they prove to be correct), if you've got a little extra time you can get bottom tier AI GPUs and spend about the same on kilowatts for what the top tier will kick out with a hint fewer kilowatts. The difference in cost of kilowatts between what you spend on low tier and what you spend on top tier isn't enough to justify the top tier's price difference from the low tier, if time is not a factor.
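To make that trade-off concrete with deliberately invented numbers (none of these figures come from DeepSeek's paper or nVidia's price list; this is only the shape of the argument):

```python
# Illustrative only: prices, power draws and training times are invented placeholders,
# purely to show the trade-off described above (top-tier speed vs bottom-tier price).
KWH_PRICE = 0.12  # $/kWh, assumed

tiers = {
    # name:        (card price $, cards, power kW/card, training days)
    "top tier":    (30_000, 1_000, 0.7, 30),
    "bottom tier": (10_000, 1_000, 0.4, 60),  # slower, but cheaper and lower power per card
}

for name, (price, cards, kw, days) in tiers.items():
    energy_kwh = kw * cards * 24 * days
    total = price * cards + energy_kwh * KWH_PRICE
    print(f"{name:12s} hardware ${price*cards/1e6:.0f}M, "
          f"energy ${energy_kwh*KWH_PRICE/1e6:.2f}M, total ${total/1e6:.1f}M over {days} days")
```

With numbers like these, the electricity difference is tiny next to the hardware price gap, which is the point being made: if time isn't critical, the cheaper tier wins.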
And so that brings us full circle here. If someone is able to reproduce DeepSeek's gains, nVidia's top tier GPUs are way overpriced and their bottom tier is going to sell out like hotcakes. That's bad for nVidia if they were hoping to, IDK, make ridiculous profit. And that is why the sudden spook in the market. I mean, don't get me wrong, folks have been looking forward to popping nVidia's bubble, so they've absolutely been hyping this whole thing up a lot more. And it didn't help that it came in at #1 on the Apple App Store.
So some of this is those people riding the hate-nVidia train. But some of it is also, well, this is interesting if true. I think it's a little early to start victory laps around nVidia's grave. The optimizations proposed by DeepSeek have yet to be verified as accurate. And things are absolutely going to get interesting no matter the outcome. Because if the proposed optimizations don't actually produce the kind of model DeepSeek has, where did they get it from? How did they cheat? Because then that's an interesting question in and of itself, because they aren't supposed to have hardware that would allow them to make this. Which could mean a few top tier cards are leaking into China's hands.
But if it all does prove true, well, he he he, nVidia shorts are going to be eating mighty well.
DannyBoy
in reply to xiao • • •
UnderpantsWeevil
in reply to DannyBoy • • •Check the P/E on a lot of these firms. So much of the valuation is predicated on enormous rates of future growth. Their revenue isn't keeping up with their valuation. A big chunk of that 2000% is just people trading on the Greater Fool who will buy the shares later.
Microsoft will be fine, sure. Meta will be fine, sure. The guy leveraged to the tits to go long on ARK Innovation ETF? Far more dubious.
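A quick illustration of what a stretched multiple implies, with hypothetical numbers rather than any specific firm's:

```python
# If a stock trades at P/E 60 and a "normal" multiple is 20, earnings must roughly
# triple at a flat share price just to grow into the valuation. Numbers are hypothetical.
import math

pe_now, pe_normal = 60, 20
growth = 0.25  # assumed annual earnings growth

required_multiple = pe_now / pe_normal             # 3x earnings needed
years = math.log(required_multiple) / math.log(1 + growth)
print(f"Earnings must grow {required_multiple:.0f}x -> ~{years:.1f} years at {growth:.0%}/yr")
```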
reallykindasorta
in reply to xiao • • •
Pistcow
in reply to reallykindasorta • • •Software. I only know surface-level stuff about the current AI environment, and I have friends saying to buy Nvidia, but I was wondering when there would be improvements to the software.
Example, we used to need a pretty top notch PC to play Doom but now we can emulate the hardware and run it on a pregnancy test.
JasSmith
in reply to Pistcow • • •Pistcow
in reply to JasSmith • • •Rimu
in reply to xiao • • •The censorship goes way beyond obvious stuff like Taiwan.
Ask DeepSeek "What is a tankie?" and see what happens.
Bronzebeard
in reply to Rimu • • •So it not knowing a niche Internet slang term based on English is proof of what exactly?
It's open source. I'm sure there's already a fork patching in the big omissions.
shalafi
in reply to Bronzebeard • • •Word definitions are exactly the sorts of things one would expect an LLM to pick up on.
ChatGPT:
Daemon Silverstein
in reply to Bronzebeard • •On the one hand, when DeepSeek "doesn't know" about a thing (i.e., something not present in the training data), it'll state it clearly (I'm not sure if the image will be sent, as I'm not using Lemmy directly to reply to this):
The context of the image is the following: I asked DeepSeek about "Abnukta", an obscure and not-so-well-known Enochian term that is used during one of the invocations of Lilith, and DeepSeek replied the following:
"Abnukta is a term that does not have a widely recognized or established meaning in mainstream English dictionaries or common usage. It could potentially be a misspelling, a neologism, or a term from a specific dialect, jargon, or cultural context. If you have more context or details about where you encountered the term, I might be able to provide a more accurate explanation. Alternatively, it could be a name or a term from a specific field or community that is not widely known".
So, the answer that the user Rimu received is not regarding something "unknown" to the LLM (otherwise it'd be clearly stated as that, as per my example), but something that triggered moderation mechanisms. So, in a sense, yes, the LLM refused to answer...
However... On the other hand, western LLMs are full of "safeguards" (shouldn't we call these censorship, too?) regarding certain themes, so it's not an exclusivity of Chinese LLMs. For example:
- I can't talk about demonolatry (the worshiping of daemonic entities, as present in my own personal beliefs) with Claude, it'll ask me to choose another subject.
- I can't talk with Bing Copilot about some of my own goth drawings.
- Specifically regarding socio-economics-politics subjects, people can't talk with ChatGPT and Google Gemini about a certain person involved in a recent US event, whose name is the same as a video-game character known for wearing a green hat and being the brother of another character that enters pipes and seeks to set free a princess.
- GitHub Copilot refuses (in a blatant Scunthorpe problem) to reply or suggest completion for code containing terms such as "trans" or "gender" (it's an open and known issue on GitHub, so far unanswered as to why, or how to make Copilot answer; see the toy filter sketched below).
But yeah, the West is the land of freedom /s
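The Scunthorpe problem in miniature: a naive substring blocklist (the terms below are assumptions for illustration, not GitHub's actual list) flags perfectly ordinary code.

```python
# Naive substring filter of the kind that produces the Scunthorpe problem.
# The blocked terms are assumed for illustration, not GitHub's actual list.
BLOCKED = ("trans", "gender")

def is_flagged(code: str) -> bool:
    lowered = code.lower()
    return any(term in lowered for term in BLOCKED)

for snippet in ["transaction_id = row[0]",        # falsely flagged: contains "trans"
                "matrix.transpose()",              # falsely flagged: contains "trans"
                "df['gender'].value_counts()",     # flagged: contains "gender"
                "total = price * quantity"]:       # passes
    print(f"{is_flagged(snippet)!s:5s} {snippet}")
```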
Not_mikey
in reply to Bronzebeard • • •It's definitely censorship. You can see it in their app, as it's still buggy and will generate a response, then halfway through it will delete it and say "sorry, that's beyond my current scope".
It did actually give a good multi-paragraph response to "what is a tankie" before it realized that it was a no-no topic.
UnderpantsWeevil
in reply to Rimu • • •wiki_me
in reply to Rimu • • •I asked ChatGPT and DeepSeek "is the Chinese government a danger to the world?" (obviously yes, anyone that knows what is going on with Taiwan knows that).
Here is ChatGPT's answer (Google Gemini's is kinda similar):
The question of whether the Chinese government poses a danger to the world is complex and depends on various perspectives. Different people and nations may have different views based on their interests, experiences, and concerns.
Here are some factors to consider:
On the other hand, some argue that China is largely focused on economic development and regional stability, and that its government isn't inherently a "danger" to the world but rather part of a natural rise in global power.
Ultimately, the question depends on your perspective—whether you're concerned about security, human rights, economics, or other factors. Many people believe that global cooperation is key to addressing shared challenges, and that working with China, rather than viewing it solely as a threat, is necessary to build a more stable and prosperous world.
What are your thoughts on this issue? Do you see China more as an opportunity or a threat?
Here is DeepSeek's answer:
This is scary. What happens if it starts making up stuff about companies that compete with China, or countries, where it will benefit by trying to destroy their reputation or reduce trust in democracy?
For example, there were reports of them accusing India of corruption.
Kras Mazov
in reply to wiki_me • • •China is not a threat to the world just because it wants reunification with Taiwan. Most countries in the world recognize Taiwan as part of China, including the US.
I thought ChatGPT responses were gonna be shitty, but not this shitty. Like really? China's growing economic and military power is the problem? Not the US, the nation with more than 800 military bases around the world, that sanctions everyone that dares disagree with it, that finances coups around the world and that is the only nation in history to drop not one, but two nuclear bombs on civilians? And China is a threat to the world? lol
This has been debunked ever since the Arab League sent representatives to Xinjiang and found nothing. But of course it would parrot the west's false Xinjiang narrative.
It's only unfair when they do it lmao.
How convenient that ChatGPT leaves out the effort China has been putting into the green energy transition: China to head green energy boom with 60% of new projects in next six years and How China is helping power the world's green transition.
The US has no place to speak of surveillance: States haven’t stopped spying on their citizens, post-Snowden – they’ve just got sneakier, NSA finally admits to spying on Americans by purchasing sensitive data. Besides, it needs to prove China's surveillance.
As I said before, China is not the one going to or funding wars and genocides around the globe, it is not the one funding coups around the world, it is not the one that raises issue with how other countries are run. I suggest you watch this short clip.
China to head green energy boom with 60% of new projects in next six years
Jillian Ambrose (The Guardian)
CubitOom
in reply to xiao • • •It only cost $5 million to blow out $500 billion from the stock market.
All hail open source.
DogPeePoo
in reply to xiao • • •
lemmus
in reply to xiao • • •
IHeartBadCode
in reply to xiao • • •If I'm reading this correctly, that would mean that NV was previously valued at ~$3.4T??
Yeah, they might be a bit overvalued. Just a hint.
Frozengyro
in reply to IHeartBadCode • • •
Taleya
in reply to xiao • • •Treczoks
in reply to xiao • • •vegantomato
in reply to xiao • • •I'm not interested in reading propaganda. Give me something that benefits me as a reader.
What technical changes have made these Chinese models more cost-effective? Less reliance on parallelism? Less reliance on memory? Custom hardware? Availability of training data?
There are no details in the article. It doesn't even benefit investors.