polite leftists make more leftists

☞ 🇨🇦 (it’s a bit of a fixer-upper eh) ☜

more leftists make revolution

  • 4 Posts
  • 425 Comments
Joined 1 year ago
Cake day: March 2nd, 2024



  • I mean, even people who are proficient with IPA struggle to read whole sentences written entirely in IPA. Similarly, people who speak and read Chinese struggle to read entire sentences written in Pinyin. I’m not saying people can’t do it, just that it’s much less natural for us (even though it doesn’t really seem like it ought to be).

    I agree that LLMs are not as bright as they look, but my point here is that this particular thing – their strange inconsistency in understanding which letters make up the tokens they produce – specifically shouldn’t be taken as evidence for or against LLMs being capable in any other context.


  • When we see LLMs struggling to say which letters are in each of the tokens they emit, or failing to understand a word when there are spaces between each letter, we should compare it to a human struggling to understand a word written in IPA (/sʌtʃ əz ðɪs/) even though we understand the same word perfectly well when it’s spoken aloud.
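    To illustrate – a rough sketch, assuming Python and the `tiktoken` package; the exact split depends on which tokenizer you pick:

    ```python
    # Tokenize a word and look at the pieces the model actually "sees".
    # Token boundaries ignore letters, which is why letter-level questions
    # ("how many r's are in strawberry?") are awkward for a token-based model.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    word = "strawberry"
    pieces = [enc.decode([t]) for t in enc.encode(word)]
    print(pieces)  # something like ['str', 'aw', 'berry'] -- no token is a single letter
    ```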




  • Well – and I don’t mean this to be antagonistic – I agree with everything you’ve said except for the last sentence, where you say “and therefore you’re wrong.” Look, I’m not saying LLMs function well, or that they’re good for society, or anything like that. I’m saying that tokenization errors are really their own thing, unrelated to the other errors LLMs make. If you want to dunk on LLMs then yeah, be my guest. I’m just saying that this one type of poor behaviour is unrelated to the other kinds of poor behaviour.



  • In what context? LLMs are extremely good at bridging from natural language to API calls. I dare say it’s one of the few use cases that have decisively landed on “yes, this is something LLMs are actually good at.” Maybe not five nines of reliability, but language itself doesn’t have five nines of reliability.
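    A minimal sketch of what I mean by “bridging” – the names here are hypothetical, and `call_llm` just stands in for whatever model endpoint you use; the point is only the shape of the bridge (free-form text in, validated structured call out):

    ```python
    # Natural language in, a validated structured API call out.
    import json

    WEATHER_TOOL = {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {"city": "string", "unit": "celsius|fahrenheit"},
    }

    def get_weather(city: str, unit: str = "celsius") -> str:
        # Placeholder; a real version would hit an actual weather API.
        return f"22 degrees {unit} in {city}"

    def call_llm(prompt: str) -> str:
        # Hypothetical stand-in: a capable model, shown the tool schema,
        # reliably returns JSON shaped like this.
        return '{"tool": "get_weather", "arguments": {"city": "Ottawa", "unit": "celsius"}}'

    def bridge(user_request: str) -> str:
        prompt = (
            "Available tool:\n" + json.dumps(WEATHER_TOOL)
            + '\nRespond ONLY with JSON: {"tool": ..., "arguments": {...}}\n'
            + "User: " + user_request
        )
        call = json.loads(call_llm(prompt))   # parse the model's structured output
        if call["tool"] == "get_weather":     # validate before dispatching
            return get_weather(**call["arguments"])
        raise ValueError(f"unknown tool: {call['tool']}")

    print(bridge("What's it like outside in Ottawa right now?"))
    ```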


  • The claim is not that all LLMs are agents, but rather that agents (which incorporate an LLM as one of their key components) are more powerful than an LLM on its own.

    We don’t know how far away we are from recursive self-improvement. We might already be there, to be honest; how much of the job of an LLM researcher can already be automated? It’s unclear whether there’s some ceiling to what a recursively-improved GPT4.x-w/e can do, though; maybe there’s a key hypothesis it will never formulate on the quest for self-improvement.




  • I suppose if you’re going to be postmodernist about it, but that’s beyond my ability to understand. The only complete solution I know to Theseus’ Ship is “the universe is agnostic as to which ship is the original. Identity of a composite thing is not part of the laws of physics.” Not sure why you put scare quotes around it.



  • Hallucinations aren’t relevant to my point here. I’m not arguing that AIs are a good source of information, and I agree that hallucinations are dangerous (either that, or misusing LLMs is dangerous). I also admit that for language learning, artifacts caused by tokenization could be very detrimental to the user.

    The point I am making is that LLMs struggling with these kinds of tokenization artifacts is poor evidence for drawing any conclusions about their behaviour on other tasks.


  • Because LLMs operate at the token level, I think a fairer comparison with humans would be to ask why humans can’t produce the IPA spellings of words they can say, /nɔr kæn ðeɪ ˈizəli rid θɪŋz ˈrɪtən ˈpjʊrli ɪn aɪ pi ˈeɪ/, despite the fact that it should be simple – they understand the sounds, after all. I’d be impressed if somebody could do this too! But that most people can’t shouldn’t really move you to think humans must be fundamentally stupid because of this one curious artifact. Maybe they are fundamentally stupid for other reasons, but this one thing is quite unrelated.
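    A rough illustration of the analogy (again assuming Python and the `tiktoken` package): to a token-based model, a letter-by-letter spelling is about as foreign a rendering of a word as an IPA transcription is to most human readers.

    ```python
    # Compare the token sequence for a normal word with the sequence for
    # its spaced-out, letter-by-letter spelling.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    normal = "strawberry"
    spelled = " ".join(normal)    # "s t r a w b e r r y"

    print(enc.encode(normal))     # a short, very common token sequence
    print(enc.encode(spelled))    # roughly one token per letter -- long and rarely seen
    ```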



  • Congrats, you’ve discovered reductionism. The human brain also doesn’t know things, as it’s composed of electrical synapses made of molecules that obey the laws of physics and direct one’s mouth to make words in response to signals that come from the ears.

    I’m not saying LLMs don’t know things, but your argument as to why they don’t has no merit.







