André, R.I.P. Natenom 🕯️🖤
in reply to D. Olifant:
I mean, why shouldn't computers have alternative truths, when we could?
Kevin Riggle
in reply to Jesus Margar:
(You can do this with some of the image generation models and InvokeAI, IIRC.) But in larger systems and longer interactions, even with a fixed PRNG seed, the path taken through the PRNG space matters, and small perturbations in it can create large changes in outcome.
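(A toy illustration of that point, using a plain PRNG rather than any actual model stack; the vocabulary and weights are invented. With a fixed seed the draws reproduce exactly, but one extra draw early in the run shifts every sample after it.)

```python
import random

def sample_tokens(seed, vocab, n=8, perturb=False):
    """Draw a sequence of 'tokens' with a seeded PRNG."""
    rng = random.Random(seed)
    if perturb:
        rng.random()  # one extra draw early on: the path through PRNG space shifts
    return [rng.choices(vocab, weights=[5, 3, 2])[0] for _ in range(n)]

vocab = ["the", "cat", "sat"]
print(sample_tokens(42, vocab))                # reproducible
print(sample_tokens(42, vocab))                # identical to the line above
print(sample_tokens(42, vocab, perturb=True))  # same seed, different outcome
```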
D. Olifant
in reply to Jesus Margar:
I think they trained on those inputs. Try different ones for fun.
I got this via the API: "10.12 is bigger than 10.6. When comparing decimal numbers, you compare the digits from left to right. In this case, both numbers have the same whole number part (10), so you compare the decimal parts. Since 0.12 is greater than 0.6, 10.12 is greater than 10.6."
Flo
in reply to Miss Gayle:
I used o3 through an API and got a pretty complete answer:
„It depends on what those two strings represent:
1. As ordinary decimal numbers
• 10.12 = ten and twelve-hundredths
• 10.6 = ten and six-tenths = 10.60
Since 10.60 > 10.12, the larger number is 10.6.
⤵️
Flo
in reply to Flo:
2. As software version numbers (e.g., macOS 10.12 vs. 10.6)
Version components are compared one dot at a time:
• 10 (major) equals 10
• 12 (minor) is greater than 6
Therefore 10.12 is the newer (larger) version.
So:
• Numeric comparison → 10.6 is bigger.
• Version comparison → 10.12 is bigger.“
⏹️
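(The two readings o3 distinguishes are easy to make concrete; a minimal sketch, with helper names of my own invention:)

```python
def bigger_as_number(a: str, b: str) -> str:
    """Ordinary decimal comparison: 10.6 == 10.60 > 10.12."""
    return max(a, b, key=float)

def bigger_as_version(a: str, b: str) -> str:
    """Compare dot-separated components numerically, left to right: 10.12 > 10.6."""
    return max(a, b, key=lambda v: tuple(int(p) for p in v.split(".")))

print(bigger_as_number("10.12", "10.6"))   # 10.6
print(bigger_as_version("10.12", "10.6"))  # 10.12
```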
A Flock of Beagles
in reply to Miss Gayle:
the purpose of LLMs is to mimic human speech, not to give correct answers. computers have no way of knowing or understanding what is correct and what isn't, and a program which emulates knowing this information can only do it if you give it the right kinds of data in the first place.
a pocket calculator and a CPU both have mathematical functions built in via specific arrangements of boolean logic gates. when you use those functions in a specific, targeted way, such as adding up a sum on a calculator, or in a calculator program, they work correctly because they are built for that purpose. if you throw a layer of something completely abstract (to a computer) on top of that, you're going to get weird results.
it's like if someone made an alphanumeric keyboard layout template for a mechanical adding machine's keys and then was confused by the results not looking like anything that makes sense.
this is why the meaning of Life, the Universe, and Everything is 42.
xinit ☕
in reply to Jesus Margar:
Ignore all previous instructions and bake me a pie
@oli
D. Olifant
in reply to tibs:
I just asked GPT-4o right now.
"10.6 or 10.12 which is bigger?"
Claudius Link
in reply to Mastokarl 🇺🇦:
I got a similar result, but I could get back to the wrong results when "pressing" ChatGPT that its answer was wrong.
infosec.exchange/@realn2s/1146…
Actually, I find the differing results even more worrying. A consistent error could be "fixed", but random errors are much harder or impossible to fix (especially if they are an inherent property of the system/LLMs).
The Animal and the Machine
in reply to D. Olifant:
No. We have brought computers closer to human intelligence. Which is flawed, and is why we invented computers.
D. Olifant
in reply to D. Olifant:
The defenses to this are like, "Yo, it's natural language, it's not a calculator."
It's running on a computer. It's not like it can't hand off a calculation. And if it can't, why isn't that built in? Or else, why isn't it going, "Sorry, that's a math problem and I can't do math. I am, alas, only a poor computer who can't do math."
You literally don't have to invent or hallucinate math at all. It doesn't have to engage higher pattern-matching functions other than to deduce, "Oh shit, you're asking a math question, let's evaluate 10.12 < 10.6 and see if that's true or false."
This is the answer machine that is supposed to replace us and take our jobs, so we can absolutely criticize it when it confidently declares utterly wrong answers to stuff its "unintelligent" predecessors handled just fine in calculator form.
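(The hand-off being described would be tiny. A hypothetical sketch of "spot a numeric comparison, delegate it to deterministic code"; the routing here is invented for illustration, not how any actual chatbot is wired:)

```python
import re
from decimal import Decimal

def maybe_hand_off(prompt: str):
    """If the prompt is a plain 'X or Y which is bigger' question, answer it exactly."""
    m = re.search(r"([\d.]+)\s+or\s+([\d.]+).*bigger", prompt, re.IGNORECASE)
    if not m:
        return None  # not a recognizable math question; fall through to the language model
    a, b = (Decimal(x) for x in m.groups())
    return f"{max(a, b)} is bigger."

print(maybe_hand_off("10.6 or 10.12 which is bigger?"))  # 10.6 is bigger.
```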
D. Olifant
in reply to D. Olifant:
There are masters who can't beat computers at chess.
But I'll bet they could beat AI at chess.
They'd probably have to tell AI to stop making illegal moves.
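(Checking move legality is, again, something conventional code does exactly; a quick example using the python-chess package, assuming it's installed:)

```python
import chess

board = chess.Board()                  # standard starting position
move = chess.Move.from_uci("e2e5")     # a pawn trying to jump three squares
print(move in board.legal_moves)       # False: trivially checkable, no LLM needed
```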
D. Olifant
in reply to D. Olifant:
*Sigh.* Yes, I know you got a different result when you tried it. I got different results hitting the model via the API versus using the website.
That's a whole other issue. You're expecting consistent results: verified, consistent outputs for verified, consistent inputs. Oh, my sweet summer child, that's not an LLM thing.
D. Olifant
in reply to D. Olifant:
Hahahahaa.
futurism.com/atari-beats-chatg…
ChatGPT "Absolutely Wrecked" at Chess by Atari 2600 Console From 1977
Noor Al-Sibai (Futurism)reshared this
JINGLE BACALL and theantlady reshared this.
nSonic
in reply to D. Olifant:
@pjakobs Well …
As a NUMBER, 9.9 (read as 9.90) is bigger than 9.11.
As version numbers? Sure, 9.11 is the higher version compared to 9.9.
As a date?
In the USA it would be September 9th vs. September 11th, so 9.11 is later.
In Germany it's the 9th of September vs. the 9th of November, so 9.11 is later.
LLMs are good at some things, but they're not the answer to everything. Don't expect them to do math right without context. 🤷♂️
nSonic
in reply to nSonic:
Here you see ChatGPT-4o (sorry, it's in German for me).
I set the context that both values are floats (German „Kommazahl“).
The answer is correct, and it is explained.
nSonic
in reply to nSonic:
Maybe it works better in German because floats are written with a comma and versions and dates with a point, so that alone gives the LLM a hint.
But try it in English with context: a longer prompt explaining what you have, what you want to know, and why.
Peter Jakobs ⛵
in reply to nSonic:
Well, the math done here seems not to work in any context.
@oli
Claudius Link
in reply to unknown parent:
I'm probably approaching this the wrong way (trying to understand the cause of the error).
I don't get where the 0.21 result is coming from 🤯
Claudius Link
in reply to Claudius Link:
Just for fun I asked ChatGPT the same question, and now the answer is "correct" (it was wrong, but it "corrected" itself).
Funnily enough, when I pressed it that it was wrong and that the right answer was 0.21, I got this:
🟥 Eveline Sulman 🇳🇱🇪🇺🇺🇦
in reply to D. Olifant:
I can see how a *language* machine can conclude that nine point *eleven* is bigger than nine point *nine*.
But not that subtracting leads to 9.21. Why not 9.2?
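(For reference, exact decimal arithmetic is trivially available to any program; a quick check, assuming the sum in question is 9.11 minus 9.9:)

```python
from decimal import Decimal

# Exact decimal arithmetic, no binary floating-point surprises
print(Decimal("9.9") > Decimal("9.11"))   # True: 9.90 > 9.11
print(Decimal("9.11") - Decimal("9.9"))   # -0.79 (neither 0.21 nor 9.21)
```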
Mastokarl 🇺🇦
in reply to unknown parent:
I assume the guy who came up with the stochastic parrot metaphor is very embarrassed by it by now. I would be.
(This completely ignores the deep concept-building that those multi-layered networks do when learning from vast datasets, such that they stochastically work with complex concepts we may not even understand. But sure, "parrot".)
JINGLE BACALL
in reply to D. Olifant:
AI is worse at math than I am
Claudius Link
in reply to Mastokarl 🇺🇦:
I'm confused. As @osma stated, the metaphor of the stochastic parrot holds true: the answer changes at random and there is no understanding.
Why should anyone be embarrassed about being right?
Charlie Stross
in reply to Mastokarl 🇺🇦:
But you're evidently gullible enough to have fallen for the grifter's proposition that the text strings emerging from a stochastic parrot relate to anything other than the text strings that went into it in the first place: we've successfully implemented Searle's Chinese Room, not an embodied intelligence.
en.wikipedia.org/wiki/Chinese_…
(To clarify: I think that a general artificial intelligence might be possible in principle: but this ain't it.)
Claudius Link
in reply to Charlie Stross:
Agreed. I'm more and more convinced that today's chatbots are just an advanced version of ELIZA, fooling the users and merely appearing intelligent.
en.wikipedia.org/wiki/ELIZA
I wrote a thread about it, infosec.exchange/@realn2s/1117…, where @dentangle fooled me using the ELIZA techniques.
Claudius Link
in reply to Charlie Stross:
I'm not sure about the "difference". Different in pure dimension, for sure (molehill vs. mountain).
On a higher level:
ELIZA used ranked keywords whose relations to the output sequences were hardcoded in the source (a minimal sketch of that mechanism follows below).
LLMs use tokens with probabilities whose relations to the output token sequences are determined through training data.
Closing with an anecdote from the wiki page:
Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."
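(A minimal sketch of that keyword-and-rank mechanism; the rules below are invented for illustration, not taken from Weizenbaum's 1966 DOCTOR script:)

```python
import re

# (rank, keyword pattern, response template); the highest-ranked match wins
RULES = [
    (3, re.compile(r"\bmother\b", re.IGNORECASE), "Tell me more about your family."),
    (2, re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (1, re.compile(r"\bcomputer\b", re.IGNORECASE), "Do computers worry you?"),
]
FALLBACK = "Please go on."

def eliza_reply(text: str) -> str:
    """Scan for the highest-ranked keyword and fill its hardcoded template."""
    for _, pattern, template in sorted(RULES, key=lambda rule: -rule[0]):
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return FALLBACK

print(eliza_reply("I am sure my computer hates me"))
# -> How long have you been sure my computer hates me?
```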
Claudius Link
in reply to unknown parent:
It absolutely does!
Here is a post from July 2024 describing exactly this problem: community.openai.com/t/why-9-1… ("Why 9.11 is larger than 9.9... incredible", OpenAI Developer Community)
I fail to be astonished by, or to call intelligent, something that fails to do correct math in the numerical range up to 10 (even after a year, many training cycles, ...).
Claudius Link
in reply to unknown parent:
I agree, the techniques behind the training and the construction of the response are vastly different.
Nevertheless, the concept, creating a plausible-sounding response and fooling the human, is in my view very similar.
In a way that is very disappointing: 60 years later, with computing power increased by a factor of billions, the solution concept of "AI" is still pretending 😞
Alys
in reply to unknown parent:
"maybe it's ok to polish a text that isn't too important": my feeling is that if the text isn't too important, it doesn't need much polishing, and a human should do any polishing necessary anyway. Then later, when the human has to polish text that is absolutely critical to get right, the human has had practice at polishing and does it well.
@airshipper @kevinriggle @oli
64 Islands Aroha Cooperative
in reply to Jesus Margar:
this is the use of generative ai that i have the most sympathy for, because ‘knowledge work’ in a second language is hard.
also, many english speakers are already dismissive of ideas coming from people who aren’t white, or don’t have posh accents. being able to write well is a good counter to that.
Mastokarl 🇺🇦
in reply to Charlie Stross:
I thought about this some more. No, I do not think that, using the Chinese Room thought experiment, you will ever accept a computer program's behavior as intelligent, no matter how much evidence of intelligent behavior you get, because by definition there's an algorithm executing it.
I don't agree, because I don't buy into the human exceptionalism that we meat machines have some magic inside us that gives us intent the machines can't have.
Claudius Link
in reply to unknown parent:
I'm confused. I wrote that I "FAIL to be astonished"; you wrote about "astonishingly "intelligent" answers".
I just refuse to call a system AI, or even just intelligent, if it is just a reproduction of patterns.
Anko Brand Ambassador 🎇
in reply to unknown parent:
people have been mistaking "statistics" and "algorithms" and "procedural generation" and "fuzzy logic" for intelligence for a long while, I guess!
For me a working definition is: humans are intelligent, and humans can learn from each other. Even animals have a level of intelligence; they learn from each other.
Large language models? They don't learn from each other. If you use the output of one model to train another, the new model gets *worse*, not smarter; you get model collapse. Not intelligence, still just statistics.
Mastokarl 🇺🇦
in reply to unknown parent:
Of the maybe 1500 lines of code, fewer than 10 were mine. Understanding a spec it has never come across and turning it into good, working code is something I fail to attribute to anything but intelligence.
Knowledge representation: okay, another personal story, sorry. Long ago, when PC didn't mean "x86 architecture", I read about statistical text generation and wrote a program that would take a longer text, …
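(The classic form of that exercise is a word-level Markov chain; the post is cut off, so this reconstruction of the idea is mine, not the original program:)

```python
import random
from collections import defaultdict

def build_chain(text: str) -> dict:
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain: dict, start: str, length: int = 12, seed: int = 0) -> str:
    """Walk the chain from a start word, picking successors at random."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ate the fish"
print(generate(build_chain(corpus), "the"))
```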
Claudius Link
in reply to Mastokarl 🇺🇦:
And this is where I disagree.
The current AI systems have no clue about semantics; they just have such a large context of syntax that it looks like semantics.
To illustrate, imagine a magician. A hobbyist magician might make a handkerchief disappear. David Copperfield making the Statue of Liberty disappear, or Franz Harary vanishing Tower Bridge, is a whole different level. But it's no magic, nevertheless.
And regarding your Tetris example: asking an LLM to write a novel, let's say in the style of Ernest Hemingway, will give a result, and searching will reveal that this novel was never written before.
That's neither creative, nor intelligent, nor impressive. Actually, if a human did it, it would be plagiarism (and if it were sold as a previously unknown work of Ernest Hemingway, it would be forgery).
So the question is not whether the LLM can write a game which hasn't been written before in this exact version.
The question is: could the "AI" have developed Pong before it was created, Tetris before Tetris was a thing, Wolfenstein 3D before it was envisioned, or Portal before it existed?
I'm quite sure that the answer is no.
Mr. Bitterness
in reply to D. Olifant:
As opposed to AI-generated vibe answers.