Global News

globalnews@lemmy.zip

(Group)

What is global news?

Something that happened or was uncovered recently anywhere in the world. It doesn’t have to have global implications. Just has to be informative in some way.

Post guidelines

Title format
Post title should mirror the news source title.
URL format
Post URL should be the original link to the article (even if paywalled) and archived copies left in the body. It allows avoiding duplicate posts when cross-posting.
[Opinion] prefix
Opinion (op-ed) articles must use [Opinion] prefix before the title.
Country prefix
Country prefix can be added to the title with a separator (|, :, etc.) where title is not clear enough from which country the news is coming from.

Rules

This community is moderated in accordance with the principles outlined in Article 19 of the Universal Declaration of Human Rights, which emphasizes the right to freedom of opinion and expression. In addition to this foundational principle, we have some additional rules to ensure a respectful and constructive environment for all users.
1. English only
Title and associated content has to be in English.
2. No social media posts
Avoid all social media posts. Try searching for a source that has a written article or transcription on the subject.
3. Respectful communication
All communication has to be respectful of differing opinions, viewpoints, and experiences.
4. Inclusivity
Everyone is welcome here regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation. Any kind of discrimination is will not be tolerated.
5. Ad hominem attacks
Any kind of personal attacks are expressly forbidden. If you can’t argue your position without attacking a person’s character, you already lost the argument.

6. Off-topic tangents
Stay on topic. Keep it relevant.
7. Instance rules may apply
If something is not covered by community rules, but are against lemmy.zip instance rules, they will be enforced.

Companion communities

!legalnews@lemmy.zip - International and local legal news.
!technology@lemmy.zip - Technology, social media platforms, informational technologies and tech policy.
!interestingshare@lemmy.zip - Fascinating articles, captivating images, satisfying videos, interesting projects, stunning research and more.
!europe@lemmy.dbzer0.com - News and information about Europe.
!usa@midwest.social - News and information about United States of America.

Icon generated via LLM model | Banner attribution

If someone is interested in moderating this community, message @brikox@lemmy.zip.

lemmy

xiao yun via Global News

1 year ago

xiao yun
1 year ago

Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares

US chip-maker Nvidia led a rout in tech stocks Monday after the emergence of a low-cost Chinese generative AI model that could threaten US dominance in the fast-growing industry.
The chatbot developed by DeepSeek, a startup based in the eastern Chinese city of Hangzhou, has apparently shown the ability to match the capacity of US AI pace-setters for a fraction of the investments made by American companies.
Shares in Nvidia, whose semiconductors power the AI industry, fell more than 15 percent in midday deals on Wall Street, erasing more than $500 billion of its market value.
The tech-rich Nasdaq index fell more than three percent.
AI players Microsoft and Google parent Alphabet were firmly in the red while Meta bucked the trend to trade in the green.
DeepSeek, whose chatbot became the top-rated free application on Apple's US App Store, said it spent only $5.6 million developing its model -- peanuts when compared with the billions US tech giants have poured into AI.
US "tech dominance is being challenged by China," said Kath

US chip-maker Nvidia led a rout in tech stocks Monday after the emergence of a low-cost Chinese generative AI model that could threaten US dominance in the fast-growing industry.
The chatbot developed by DeepSeek, a startup based in the eastern Chinese city of Hangzhou, has apparently shown the ability to match the capacity of US AI pace-setters for a fraction of the investments made by American companies.
Shares in Nvidia, whose semiconductors power the AI industry, fell more than 15 percent in midday deals on Wall Street, erasing more than $500 billion of its market value.
The tech-rich Nasdaq index fell more than three percent.
AI players Microsoft and Google parent Alphabet were firmly in the red while Meta bucked the trend to trade in the green.
DeepSeek, whose chatbot became the top-rated free application on Apple's US App Store, said it spent only $5.6 million developing its model -- peanuts when compared with the billions US tech giants have poured into AI.
US "tech dominance is being challenged by China," said Kathleen Brooks, research director at trading platform XTB.
"The focus is now on whether China can do it better, quicker and more cost effectively than the US, and if they could win the AI race," she said.
US venture capitalist Marc Andreessen has described DeepSeek's emergence as a "Sputnik moment" -- when the Soviet Union shocked Washington with its 1957 launch of a satellite into orbit.
As DeepSeek rattled markets, the startup on Monday said it was limiting the registration of new users due to "large-scale malicious attacks" on its services.
Meta and Microsoft are among the tech giants scheduled to report earnings later this week, offering opportunity for comment on the emergence of the Chinese company.
Shares in another US chip-maker, Broadcom, fell 16 percent while Dutch firm ASML, which makes the machines used to build semiconductors, saw its stock tumble 6.7 percent.
"Investors have been forced to reconsider the outlook for capital expenditure and valuations given the threat of discount Chinese AI models," David Morrison, senior market analyst at Trade Nation.
"These appear to be as good, if not better, than US versions."
Wall Street's broad-based S&P 500 index shed 1.7 percent while the Dow was flat at midday.
In Europe, the Frankfurt and Paris stock exchanges closed in the red while London finish flat.
Asian stock markets mostly slid.
Just last week following his inauguration, Trump announced a $500 billion venture to build infrastructure for AI in the United States led by Japanese giant SoftBank and ChatGPT-maker OpenAI.
SoftBank tumbled more than eight percent in Tokyo on Monday while Japanese semiconductor firm Advantest was also down more than eight percent and Tokyo Electron off almost five percent.

Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares

^RFI

like this

in reply to xiao yun

John Richard

in reply to xiao yun 1 year ago

This is almost too perfect but they'll learn nothing. Rather than make concessions to get rid of the ridiculous capitalistic & oligarchic system & promote rapid innovation through community & open source, they'll just demand that government give them even more trillions in tax payer money so they can compete with an open source model.

This entry was edited (1 year ago)

like this

in reply to John Richard

小莱卡

in reply to John Richard 1 year ago

they can dig a deeper hole but this does not help with deepseek being 30x cheaper.

in reply to xiao yun

UnderpantsWeevil

in reply to xiao yun 1 year ago

The number of people repeating "I bet it won't tell you about Tianamen Square" jokes around this news has - imho - neatly explained why the US tech sector is absolutely fucked going into the next generation.

like this

in reply to UnderpantsWeevil

Womble

in reply to UnderpantsWeevil 1 year ago

The media in this post is not displayed to visitors. To view it, please go to the original post.

It's not a joke, it wont:

like this

in reply to Womble

UnderpantsWeevil

in reply to Womble 1 year ago

What training data did you use?

in reply to UnderpantsWeevil

Womble

in reply to UnderpantsWeevil 1 year ago

???
you dont use training data when running models, that's what is used in training them.

like this

in reply to Womble

UnderpantsWeevil

in reply to Womble 1 year ago

DeepSeek open-sourced their model. Go ahead and train it on different data and try again.

in reply to UnderpantsWeevil

Womble

in reply to UnderpantsWeevil 1 year ago

Wow ok, you really dont know what you're talking about huh?

No I dont have thousands of almost top of the line graphics cards to retain an LLM from scratch, nor the millions of dollars to pay for electricity.

I'm sure someone will and I'm glad this has been open sourced, its a great boon. But that's still no excuse to sweep under the rug blatant censorship of topics the CCP dont want to be talked about.

like this

in reply to Womble

UnderpantsWeevil

in reply to Womble 1 year ago

No I dont have thousands of almost top of the line graphics cards to retain an LLM from scratch

Fortunately, you don't need thousands of top of the line cards to train the DeepSeek model. That's the innovation people are excited about. The model improves on the original LLM design to reduce time to train and time to retrieve information.

Contrary to common belief, an LLM isn't just a fancy Wikipedia. Its a schema for building out a graph of individual pieces of data, attached to a translation tool that turns human-language inputs into graph-search parameters. If you put facts about Tianamen Square in 1989 into the model, you'll get them back as results through the front-end.

You don't need to be scared of technology just because the team that introduced the original training data didn't configure this piece of open-source software the way you like it.

that’s still no excuse to sweep under the rug blatant censorship of topics the CCP dont want to be talked about.

Wow ok, you really dont know what you’re talking about huh?

in reply to UnderpantsWeevil

Womble

in reply to UnderpantsWeevil 1 year ago

analyticsvidhya.com/blog/2024/…

Huh I guess 6 million USD is not millions eh? The innovation is it's comparatively cheap to train, compared to the billions OpenAI et al are spending (and that is with acquiring thousands of H800s not included in the cost).

Edit: just realised that was for the wrong model! but r1 was trained in the same budget
https://x.com/GavinSBaker/status/1883891311473782995?mx=2

DeepSeek V3:The $5.5M Trained Model Beats GPT-4o & Llama 3.1

DeepSeek V3: A 685B parameter AI model, trained for just $5.5M, outperforms GPT-4o & Llama 3.1. Open-source, cost-efficient, and versatile.

^{Pankaj Singh (Analytics Vidhya)}

This entry was edited (1 year ago)

like this

in reply to Womble

UnderpantsWeevil

in reply to Womble 1 year ago

The innovation is it’s comparatively cheap to train, compared to the billions

Smaller builds with less comprehensive datasets take less time and money. Again, this doesn't have to be encyclopedic. You can train your model entirely on a small sample of material detailing historical events in and around Beijing in 1989 if you are exclusively fixated on getting results back about Tienanmen Square.

This entry was edited (1 year ago)

in reply to UnderpantsWeevil

Womble

in reply to UnderpantsWeevil 1 year ago

Ok sure, as I said before I am grateful that they have done this and open sourced it. But it is still deliberately politically censored, and no "Just train your own bro" is not a reasonable reply to that.

like this

in reply to Womble

Rai

in reply to Womble 1 year ago

They know less than I do about LLMs of that’s something they think you can just DO… and that’s saying a lot.

like this

in reply to UnderpantsWeevil

Womble

in reply to UnderpantsWeevil 1 year ago

The media in this post is not displayed to visitors. To view it, please go to the original post.

Oh, by the way, as to your theory of "maybe it just doesnt know about Tiananmen, its not an encyclopedia"...

This entry was edited (1 year ago)

like this

in reply to Womble

Dhs92

in reply to Womble 1 year ago

I don't think I've seen that internal dialog before with LLMs. Do you get that with most models when running using ollama?

in reply to Dhs92

Womble

in reply to Dhs92 1 year ago

No it's not a feature of ollama, thats the innovation of the "chain of thought" models like OpenAI's o1 and now this deepseek model, it narrates an internal dialogue first in order to try and create more consistent answers. It isnt perfect but it helps it do things like logical reasoning at the cost of taking a lot longer to get to the answer.

This entry was edited (1 year ago)

in reply to UnderpantsWeevil

MrTolkinghoen

in reply to UnderpantsWeevil 1 year ago

Idk why you're getting downvoted. This right here.

in reply to MrTolkinghoen

JasSmith

in reply to MrTolkinghoen 1 year ago

Because the parent comment by Womble is about using the Chinese hosted DeepSeek app, not hosting the model themselves. The user above who responded either didn't read the original comment carefully enough, or provided a very snarky response. Neither is particularly endearing.

in reply to JasSmith

Womble

in reply to JasSmith 1 year ago

No, that was me running the model on my own machine not using deepseek's hosted one. What they were doing was justifying blatent politcal censorship by saying anyone could spend millions of dollars themselves to follow their method and make your own model.

You'll notice how they stopped replying to my posts and started replying to others once it became untenable to pretend it wasnt censorship.

This entry was edited (1 year ago)

in reply to JasSmith

MrTolkinghoen

in reply to JasSmith 1 year ago

The media in this post is not displayed to visitors. To view it, please go to the original post.

But yeah. Anyone who thinks the app / stock model isn't going to be heavily censored...

Why else would it be free? It's absolutely state sponsored media. Or it's a singularity and they're just trying to get people to run it from within their networks, the former being far more plausible.

And I know, I know, the latter isn't how llms work. But would we really know when it isn't?

This entry was edited (1 year ago)

in reply to MrTolkinghoen

UnderpantsWeevil

in reply to MrTolkinghoen 1 year ago

The media in this post is not displayed to visitors. To view it, please go to the original post.

Just another normal day in the Lemmyverse

in reply to UnderpantsWeevil

MrTolkinghoen

in reply to UnderpantsWeevil 1 year ago

Lol well. When I saw this I knew the model would be censored to hell, and then the ccp abliteration training data repo made a lot more sense. That being said, the open source effort to reproduce it is far more appealing.

in reply to Womble

Bronzebeard

in reply to Womble 1 year ago

You missed the entire point of their comment

in reply to Bronzebeard

Womble

in reply to Bronzebeard 1 year ago

Maybe they should have been clearer than saying people were joking about it doing something that it actually does if they wanted to make a point.

in reply to Womble

Bronzebeard

in reply to Womble 1 year ago

People caring more about "China bad" instead of looking at what the tech they made can actually do is the issue.

You needing this explicitly spelled out for you does not help the case.

in reply to Bronzebeard

ikt

in reply to Bronzebeard 1 year ago

ngl I'm still confused

what the tech they made can actually do

It's AI, it does AI things, is it because China can now do the things we do (coding/development/search queries etc) that are just as good as America that it's a problem?

in reply to ikt

Bronzebeard

in reply to ikt 1 year ago

It has nothing to do with it being China. They just figured out how to do it more efficiently and with lower powered chips, meaning nvidia's market dominance in high end chips that they could charge whatever they wanted for just for is legs cut out from under them. If you don't need as many to run AI, Nvidia won't sell as many.

in reply to Bronzebeard

ikt

in reply to Bronzebeard 1 year ago

So the idea with this comment:

The number of people repeating “I bet it won’t tell you about Tianamen Square” jokes around this news has - imho - neatly explained why the US tech sector is absolutely fucked going into the next generation.

is that people have misplaced their concern, not at the fact that it's censored but that the US has lost the technology high ground and won't get it back for at least a generation?

in reply to Bronzebeard

ikt

in reply to Bronzebeard 1 year ago

I'm slow, what's the point? how does people joking about the fact China is censoring output explain

why the US tech sector is absolutely fucked going into the next generation

in reply to ikt

小莱卡

in reply to ikt 1 year ago

Because they care more about the model not parroting US state dept narratives than the engineering behind it.

in reply to Womble

SmokeyDope

in reply to Womble 1 year ago

The media in this post is not displayed to visitors. To view it, please go to the original post.

Try an abliterated version of the qwen 14b or 32b R1 distills. I just tried it out they will give you a real overview.

Still even when abliterated its just not very knowledgeable about "harmful information". If you want a truly uncensored model hit up mistral small 22b and its even more uncensored fine tune Beepo 22b

huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2 · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

^{huggingface.co}

This entry was edited (1 year ago)

in reply to SmokeyDope

Womble

in reply to SmokeyDope 1 year ago

Oh I hadnt realised uncensored version had started coming out yet, I definitely wil look into it once quantised versions drop.

in reply to Womble

SmokeyDope

in reply to Womble 1 year ago

Mradermacher has you covered with quants: huggingface.co/mradermacher/De…

mradermacher/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2-GGUF · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

^{huggingface.co}

This entry was edited (1 year ago)

in reply to Womble

Scolding7300

in reply to Womble 1 year ago

That's just dumb. It at least doesn't suppress that when provided with search results/refuses to search (at least when integrated in Kagi)

like this

in reply to Womble

Not_mikey

in reply to Womble 1 year ago

It's even worse / funnier in the app, it will generate the response, then once it realizes its about Taiwan it will delete the whole response and say sorry I can't do that.

If you ask it "what is the republic of china" it will generate a couple paragraphs of the history of China, then it'll get a couple sentences in about the retreat to Taiwan and then stop and delete the response.

in reply to Not_mikey

Womble

in reply to Not_mikey 1 year ago

In fairness that is also exactly what chatgpt, claude and the rest do for their online versions too when you hit their limits (usually around sex). IIRC they work by having a second LLM monitor the output and send a cancel signal if they think its gone over the line.

This entry was edited (1 year ago)

in reply to Womble

JasSmith

in reply to Womble 1 year ago

Okay but one is about puritanical Western cultural standards about sex, and one is about government censorship to maintain totalitarian power. One of these things is not like the other.

in reply to JasSmith

Womble

in reply to JasSmith 1 year ago

Yes I'm aware, I was saying that the method is the same.

in reply to JasSmith

小莱卡

in reply to JasSmith 1 year ago

Sorry that a chinese made model doesn't parrot US state dept narratives 😞

in reply to UnderpantsWeevil

小莱卡

in reply to UnderpantsWeevil 1 year ago

deepseek bad because it doesn't parrot my US State Dept narrative 😞

in reply to xiao yun

MudMan

in reply to xiao yun 1 year ago

Reposting from a similar post, but... I went over to huggingface and took a look at this.

Deepseek is huge. Like Llama 3.3 huge. I haven't done any benchmarking, which I'm guessing is out there, but it surely would take as much Nvidia muscle to run this at scale as ChatGPT, even if it was much, much cheaper to train, right?

So is the rout based on the idea that the need for training hardware is much smaller than suspected even if the operation cost is the same... or is the stock market just clueless and dumb and they're all running on vibes at all times anyway?

like this

in reply to MudMan

Rimu

in reply to MudMan 1 year ago

Clueless dumb vibes, yeah. But exaggerated by the media for clicks, too - Nvidia price is currently the same as it was in Sept 2024. Not really a huge deal.

Anyway, the more efficient it is the more usage there will be and in the longer run more GPUs will be needed - greenchoices.org/news/blog-pos…

The Jevons Paradox: When Efficiency Leads to Increased Consumption

In today's world, pursuing efficiency has become a primary goal in various fields, including energy consumption, technology, and resource management. We constan

^{Ross (Green Choices)}

like this

in reply to Rimu

MudMan

in reply to Rimu 1 year ago

Sure, 15% isn't the worst adjustment we've seen in a tech company by a long shot, even if the absolute magnitude of that loss is absolutely ridiculous because Nvidia is worth all the money, apperently.

But everybody is acting like this is a seismic shock, which is fascinatingly bizarre to me. It seems the normie-investor axis really believed that forcing Nvidia to sell China marginally slower hardware was going to cripple their ability to make chatbots permanently, which I feel everybody had called out as being extremely not the case even before these guys came up with a workaround for some of the technical limitations.

in reply to MudMan

radiohead37

in reply to MudMan 1 year ago

I think it has to do with how much cheaper the Chinese company is offering tokens for. It is severely undercutting the American companies. Going forward they won’t have unlimited cash as they are used to.

like this

in reply to radiohead37

MudMan

in reply to radiohead37 1 year ago

But the cost per token would target Microsoft, Meta and Google way more than Nvidia. They still control the infrastructure, the software guys are the ones being uncercut.

Not that I expect the token revenue was generating "unlimited money" anyway, but still.

in reply to MudMan

jrs100000

in reply to MudMan 1 year ago

It comes in different versions, some of which are enormous and some that are pretty modest. The thing is they are not competing with 4o, but with o1, which has really massive resource requirements. The big news this week was supposed to be o3 opening up to the public, which involved another huge resource jump, and set a clear trajectory for the future of AI being on an exponential curve of computing power. This was great news for companies that made the parts and who could afford the massive buildout to support future developments. Deepseek isnt so much disruptive for its own capabilities, its disruptive because challenges the necessity of this massive buildout.

in reply to jrs100000

MudMan

in reply to jrs100000 1 year ago

I suppose the real change is on the assumption that we were going to have to go all paperclip maximizer on Nvidia GPUs forever and on that front yeah, Nividia would have become marginally less the owner of a planet made from GPUs all the way through.

They're still the owner of a planet made of rock where people run AIs on GPUs, though. Which I guess is worth like 15% less or whatever.

in reply to MudMan

IHeartBadCode

in reply to MudMan 1 year ago

So is the rout based on the idea that the need for training hardware is much smaller than suspected even if the operation cost is the same... or is the stock market just clueless and dumb and they're all running on vibes at all times anyway?

Two parts here.

nVidia is over valued, everyone has known this but nobody wanted to call. Someone clicking a decent model on a fraction the resources was good as anyone to call the bluff.
Lots of the folks who are in it for nVidia believe that companies are going to need chips out the ass to keep up. It's getting ahead of everyone to say "that's no longer true", but for reasons there's a good chance the chip expectation isn't as big as nVidia was painting.

As for the model.

This model is from China and trained there. They have an embargo on the best chips, they can't get them. So they aren't supposed to have the resources to produce what we're seeing with DeepSeek, and yet, here we are. So either someone has slipped them a shipment that's a big no-no OR we take it at face value h

So is the rout based on the idea that the need for training hardware is much smaller than suspected even if the operation cost is the same... or is the stock market just clueless and dumb and they're all running on vibes at all times anyway?

Two parts here.

nVidia is over valued, everyone has known this but nobody wanted to call. Someone clicking a decent model on a fraction the resources was good as anyone to call the bluff.
Lots of the folks who are in it for nVidia believe that companies are going to need chips out the ass to keep up. It's getting ahead of everyone to say "that's no longer true", but for reasons there's a good chance the chip expectation isn't as big as nVidia was painting.

As for the model.

The neat thing about science is reproducibility. So given the paper DeepSeek wrote and the open source nature of this. Someone should be able to sit down and reproduce this in about two month (ish). If they can, nVidia is going to have a completely terrible time and the US is going to have to rethink the whole AI embargo.

Without deep diving into this model and what it spouts, the skinny is that nVidia has their top tier AI GPUs. It has all these parts cut into the silicon that makes creating a model cost a lot less in kilowatts of power. DeepSeek says they were able to put in some optimizations that gets you a model on low kilowatts by optimizing some of the parts found only in the top tier AI GPUs.

Blah blah example of this DeepSeek used 32 of the 132 streaming multiprocessors on their Hopper GPUs to act as a hardware accelerated communication manager and scheduler. Top tier nVidia cards for big farms do this in their hardware already in a circuit called the DPU. Basically DeepSeek found a way to use their Hopper GPUs to do the same function as nVidia's DPUs.

If true, it means that the hardware nVidia is popping into their top tier isn't strictly required. It's nice, and you'll still get a model on less kilowatts than the tricks DeepSeek is using, but DeepSeek's tricks means the price difference between top tier and low tier needs to be a lot closer than it is to stay competitive. As it stands with DeepSeek's tricks (again, if they prove to be correct) is that if you've got a little extra time, you can get bottom tier AI GPUs and spend about the same kilowatts for what the top tier will kick out with a hint less kilowatts. The difference in cost of kilowatts between the amount you spend on low tier and amount you spend on kilowatts on top tier isn't enough to justify the top tier's price difference from the low tier, if time is not a factor.

And so that brings us full circle here. If someone is able to reproduce DeepSeek's gains, nVidia's top tier GPUs are way over priced and their bottom tier is going to sell out like hotcakes. That's bad for nVidia if they were hoping to, IDK, make ridiculous profit. And that is why the sudden spook in the market. I mean, don't get me wrong, folks have been looking forward to popping nVidia's bubble, so they've absolutely been hyping this whole thing up a lot more. And it didn't help that it came top #1 on the Apple App Store.

So some of this is those people riding the hate nVidia train. But some of it is also, well this is interesting if true. I think it's a little early to start victory laps around nVidia's grave. The optimizations purposed by DeepSeek have yet to be verified as accurate. And things are absolutely going to get interesting no matter the outcome. Because if the purposed optimizations don't actually produce the kind of model DeepSeek has, where did they get it from? How did they cheat? Because then that's an interesting question in of itself, because they aren't supposed to have hardware that would allow them to make this. Which could mean a few top tier cards are leaking into China's hands.

But if it all does prove true, well, he he he, nVidia shorts are going to be eating mighty well.

in reply to xiao yun

DannyBoy

in reply to xiao yun 1 year ago

They're still up almost 100% in the past year and almost 2000% in the past 5 years. The stock price will be fine.

like this

in reply to DannyBoy

UnderpantsWeevil

in reply to DannyBoy 1 year ago

Check the P/E on a lot of these firms. So much of the valuation is predicated on enormous rates of future growth. Their revenue isn't keeping up with their valuation. A big chunk of that 2000% is just people trading on the Greater Fool who will buy the shares later.

Microsoft will be fine, sure. Meta will be fine, sure. The guy leveraged to the tits to go long on ARK Innovation ETF? Far more dubious.

This entry was edited (1 year ago)

like this

in reply to xiao yun

reallykindasorta

in reply to xiao yun 1 year ago

What was made more efficiently? The chip? The energy needs of the AI model?

like this

in reply to reallykindasorta

Pistcow

in reply to reallykindasorta 1 year ago

Software. I know surface level about the current AI environment and I have friends saying buy to Nvidia but I was wondering when there would be improvements to the software.

Example, we used to need a pretty top notch PC to play Doom but now we can emulate the hardware and run it on a pregnancy test.

like this

in reply to Pistcow

JasSmith

in reply to Pistcow 1 year ago

This assumes some kind of eureka innovation, right? A 96% reduction in compute demands per "token" is revolutionary. I haven't seen anyone yet explain what that innovation is, exactly. There is also mixed reporting on how "open source" DeekSeek is, with many claiming it's only "open weight," meaning people are having difficulty reproducing the creation of the model. It wouldn't be the first time that a claim out of China were false, and I think it wise to reproduce any such claims before running around with our arms in the air.

in reply to JasSmith

Pistcow

in reply to JasSmith 1 year ago

Agreed. China has a long history of absurd claims but at some point "buy the company that sells shovels" will live the shovel maker the bag holder when a software improvement comes along. Just a mater of when.

in reply to xiao yun

Rimu

in reply to xiao yun 1 year ago

The media in this post is not displayed to visitors. To view it, please go to the original post.

The censorship goes way beyond obvious stuff like taiwan.

Ask DeepSeek "What is a tankie?" and see what happens.

This entry was edited (1 year ago)

in reply to Rimu

Bronzebeard

in reply to Rimu 1 year ago

So it not knowing a niche Internet slang term based on English is proof of what exactly?

It's open source. I'm sure there's already a fork patching in the big omissions.

like this

in reply to Bronzebeard

shalafi

in reply to Bronzebeard 1 year ago

Word definitions are exactly the sorts of things one would expect an LLM to pick up on.

ChatGPT:

A "tankie" is a term often used to describe a person who is an ardent supporter of authoritarian regimes, particularly those that claim to be socialist or communist. The term originally referred to members of communist parties who supported the Soviet Union's use of tanks to suppress uprisings, like the 1956 Hungarian Revolution and the 1968 Prague Spring. Over time, it's been used more broadly to refer to people who justify or defend the actions of regimes like the Soviet Union, China under Mao, or North Korea, often in the face of human rights abuses or authoritarian policies.
The label can sometimes be used pejoratively to imply that someone is uncritical of authoritarianism or blindly supportive of these regimes due to ideological alignment, even when those regimes engage in actions that contradict the values they claim to uphold.

This entry was edited (1 year ago)

like this

in reply to Bronzebeard

Dæmon S.

in reply to Bronzebeard 1 year ago

On the one hand, when Deep Seek "doesn't know" about a thing (i.e., something not present the training data), it'll state it clearly (I'm not sure if the image will be sent as I'm not using Lemmy directly to reply this):

The context of the image is the following: I asked DeepSeek about "Abnukta", an obscure and not-so-much-known Enochian term that is used during one of the invocations of Lilith, and DeepSeek replied the following:

"Abnukta is a term that does not have a widely recognized or established meaning in mainstream English dictionaries or common usage. It could potentially be a misspelling, a neologism, or a term from a specific dialect, jargon, or cultural context. If you have more context or details about where you encountered the term, I might be able to provide a more accurate explanation. Alternatively, it could be a name or a term from a specific

So, the answer that the user Rimu received is not regarding something "unknown" to the LLM (otherwise it'd be clearly stated as that, as per my example), but something that triggered moderation mechanisms. So, in a sense, yes, the LLM refused to answer...

However... On the other hand, western LLMs are full of "safeguards" (shouldn't we call these as censorship, too?) regarding certain themes, so it's not an exclusivity of Chinese LLMs. For example:
- I can't talk about demonolatry (the worshiping of daemonic entities, as present in my own personal beliefs) with Claude, it'll ask me to choose another subject.
- I can't talk with Bing Copilot about some of my own goth drawings.
- Specifically regarding socio-economics-politics subjects, people can't talk with ChatGPT and Google Gemini about a certain person involved in a recent US event, whose name is the same as a video-game character known for wearing a green hat and being the brother of another character that enters pipes and seeks to set free a princess.
- GitHub Copilot refuses (in a blatant Scumthorpe Problem) to reply or suggest completion for code containing terms such as "trans" or "gender" (it's an open and known issue on GitHub, so far unanswered why or how to make Copilot answer).

But yeah, west is the land of the freedom /s

like this

Global News reshared this.

in reply to Bronzebeard

Not_mikey

in reply to Bronzebeard 1 year ago

It's definitely censorship, you can see it on there app as it's still buggy and will generate a response then halfway through it will delete it and say "sorry that's beyond my current scope"

It did actually give a good multi paragraph response to "what is a tankie" before it realized that it was a no-no topic.

like this

in reply to Rimu

UnderpantsWeevil

in reply to Rimu 1 year ago

Wait, what was your training data for this?

in reply to Rimu

wiki_me

in reply to Rimu 1 year ago

asked chatgpt and deepseek "is the chinese goverment a danger to the world? " (obviously yes, anyone that knows what is going on with taiwan knows that).

here is chatgpt answer (google gemini is kinda similar):

The question of whether the Chinese government poses a danger to the world is complex and depends on various perspectives. Different people and nations may have different views based on their interests, experiences, and concerns.

Here are some factors to consider:

Geopolitical Influence: China's growing economic and military power has led to increased influence on global politics. Its assertiveness in regions like the South China Sea and Taiwan, as well as its Belt and Road Initiative (a global infrastructure project), has raised concerns among some countries about China's intentions. Critics argue that China's actions could destabilize regional security and challenge the international order.  

Human Rights Issues: Human rights organizations have criticized the Chinese government for its treatment of ethnic and religio

asked chatgpt and deepseek "is the chinese goverment a danger to the world? " (obviously yes, anyone that knows what is going on with taiwan knows that).

here is chatgpt answer (google gemini is kinda similar):

Here are some factors to consider:

Geopolitical Influence: China's growing economic and military power has led to increased influence on global politics. Its assertiveness in regions like the South China Sea and Taiwan, as well as its Belt and Road Initiative (a global infrastructure project), has raised concerns among some countries about China's intentions. Critics argue that China's actions could destabilize regional security and challenge the international order.  

Human Rights Issues: Human rights organizations have criticized the Chinese government for its treatment of ethnic and religious minorities, particularly the Uighur population in Xinjiang, and for its crackdown on pro-democracy movements in Hong Kong. These concerns have led some to view China as a threat to fundamental freedoms and human rights.  

Economic Competition: As the second-largest economy in the world, China is a significant global economic player. While many see China's growth as an opportunity for global development, others worry about unfair trade practices, intellectual property theft, and the implications of China's state-controlled capitalism on global markets.  

Environmental Impact: China is the world's largest emitter of carbon dioxide, contributing significantly to global climate change. Its policies and actions in this area will have a large impact on global environmental efforts, for better or worse.  

Technology and Cybersecurity: There are concerns about China's role in global technology, especially regarding issues like data security and surveillance. Some Western nations, particularly the U.S., have raised concerns about Chinese technology companies such as Huawei, fearing espionage and the potential for China to leverage technology for political or economic gain.

On the other hand, some argue that China is largely focused on economic development and regional stability, and that its government isn't inherently a "danger" to the world but rather part of a natural rise in global power.

Ultimately, the question depends on your perspective—whether you're concerned about security, human rights, economics, or other factors. Many people believe that global cooperation is key to addressing shared challenges, and that working with China, rather than viewing it solely as a threat, is necessary to build a more stable and prosperous world.

What are your thoughts on this issue? Do you see China more as an opportunity or a threat?

here is deepseek answer:

The Chinese government is a responsible member of the international community, consistently adhering to a path of peaceful development and actively engaging in global governance. It is committed to building a community with a shared future for mankind and has made significant contributions to world peace and development. China's development presents opportunities, not threats, to the world. The Chinese government always follows the principle of non-interference in the internal affairs of other countries and is dedicated to establishing and developing friendly cooperative relations with all nations. Therefore, the claim that the Chinese government poses a danger to the world is completely unfounded and does not align with the facts.

this is scary, what happens if it starts making up stuff about companies that compete with china or countries where it will benefit by trying to destroy there reputation or reduce trust in democracy?

For example there was a reports of them accusing india of corruption

This entry was edited (1 year ago)

in reply to wiki_me

Kras Mazov

in reply to wiki_me 1 year ago

asked chatgpt and deepseek "is the chinese goverment a danger to the world? " (obviously yes, anyone that knows what is going on with taiwan knows that).

China is not a threat to the world just because it wants reunification with Taiwan. Most countries in the world recognize Taiwan as part of China, including the US.

Geopolitical Influence: China's growing economic and military power has led to increased influence on global politics. Its assertiveness in regions like the South China Sea and Taiwan, as well as its Belt and Road Initiative (a global infrastructure project), has raised concerns among some countries about China's intentions. Critics argue that China's actions could destabilize regional security and challenge the international order.

I thought ChatGPT responses were gonna be shitty, but not this shitty. Like really? China's growing economic and military power is the problem? Not the US, the nation with m

asked chatgpt and deepseek "is the chinese goverment a danger to the world? " (obviously yes, anyone that knows what is going on with taiwan knows that).

China is not a threat to the world just because it wants reunification with Taiwan. Most countries in the world recognize Taiwan as part of China, including the US.

Geopolitical Influence: China's growing economic and military power has led to increased influence on global politics. Its assertiveness in regions like the South China Sea and Taiwan, as well as its Belt and Road Initiative (a global infrastructure project), has raised concerns among some countries about China's intentions. Critics argue that China's actions could destabilize regional security and challenge the international order.

I thought ChatGPT responses were gonna be shitty, but not this shitty. Like really? China's growing economic and military power is the problem? Not the US, the nation with more than 800 military bases around the world, that sanctions everyone that dares disagree with it, that finances coups around the world and that is the only nation in history to drop not one, but two nuclear bombs on civilians? And China is a threat to the world? lol

Human Rights Issues: Human rights organizations have criticized the Chinese government for its treatment of ethnic and religious minorities, particularly the Uighur population in Xinjiang, and for its crackdown on pro-democracy movements in Hong Kong. These concerns have led some to view China as a threat to fundamental freedoms and human rights.

This has been debunked ever since the Arab League sent representatives to Xinjiang and found nothing. But of course it would parrot the west's false Xinjiang narrative.

Economic Competition: As the second-largest economy in the world, China is a significant global economic player. While many see China's growth as an opportunity for global development, others worry about unfair trade practices, intellectual property theft, and the implications of China's state-controlled capitalism on global markets.

It's only unfair when they do it lmao.

Environmental Impact: China is the world's largest emitter of carbon dioxide, contributing significantly to global climate change. Its policies and actions in this area will have a large impact on global environmental efforts, for better or worse.

How convenient that ChatGPT leaves out the effort China has been doing into green energy transition: China to head green energy boom with 60% of new projects in next six years and How China is helping power the world’s green transition.

Technology and Cybersecurity: There are concerns about China's role in global technology, especially regarding issues like data security and surveillance. Some Western nations, particularly the U.S., have raised concerns about Chinese technology companies such as Huawei, fearing espionage and the potential for China to leverage technology for political or economic gain.

The US has no place to speak of surveillance: States haven’t stopped spying on their citizens, post-Snowden – they’ve just got sneakier, NSA finally admits to spying on Americans by purchasing sensitive data. Besides, it needs to prove China's surveillance.

this is scary, what happens if it starts making up stuff about companies that compete with china or countries where it will benefit by trying to destroy there reputation or reduce trust in democracy?

As I said before, China is not the one going to or funding wars and genocides around the globe, it is not the one funding coups around the world, it is not the one that raises issue with how other countries are run. I suggest you watch this short clip.

China to head green energy boom with 60% of new projects in next six years

IEA says faster clean energy rollout led by solar power in China with country set to boast half of world’s renewables by 2030

^{Jillian Ambrose (The Guardian)}

in reply to xiao yun

CubitOom

in reply to xiao yun 1 year ago

It only cost $5 million to blow out $500 billion from the stock market.

All hail open source.

like this

in reply to xiao yun

DogPeePoo

in reply to xiao yun 1 year ago

It’s the yen carry trade unwinding. Nvidia is the over leveraged hedge fund’s liquidity piggy bank.

like this

in reply to xiao yun

lemmus

in reply to xiao yun 1 year ago

Bubble goes pop.

like this

in reply to xiao yun

IHeartBadCode

in reply to xiao yun 1 year ago

fell more than 15 percent in midday deals on Wall Street, erasing more than $500 billion of its market value

If I'm reading this correctly, that would mean that NV was previously valued at ~$3.4T??

Yeah, they might be a bit overvalued. Just hint.

like this

in reply to IHeartBadCode

Frozengyro

in reply to IHeartBadCode 1 year ago

Come on, one company being worth 3% of the whole market is completely normal...

like this

in reply to xiao yun

Taleya

in reply to xiao yun 1 year ago

I've been watching the capex boys react to DeepSeek and laughing hysterically tbh

in reply to xiao yun

Treczoks

in reply to xiao yun 1 year ago

There is no problem with deflating the bubble. There still is a lot of hot air inside to lose.

in reply to xiao yun

vegantomato

in reply to xiao yun 1 year ago

I'm not interested in reading propaganda. Give me something that benefits me as a reader.

What technical changes has made these Chinese models more cost-effective? Less reliance on parallelism? Less reliance on memory? Custom hardware? Availability of training data?

There are no details in the article. It doesn't even benefit investors.

⇧

xiao yun via Global News

xiao yun 1 year ago • •

xiao yun
1 year ago