...
If I can say "load the dishwasher" and the LLM-powered humanoid robot successfully starts loading the dishwasher, I think the point is moot, especially if it's never loaded a dishwasher before.
The reification of language into behavior says that it understood the language.
If it makes a mistake, and I say "did you make a mistake" and it says "yes" and I say "what mistake did you make" and it says "I forgot to add the soap", I think damn well it might display some "understanding", too.
Voice-operated machinery existed long before LLMs came on the scene. The "reification of behavior" is absolutely not an indication that it understands language, only that it is programmed to respond to a limited set of verbal cues. Since you have not studied what it means to "understand" a linguistic expression, you are easily impressed by simulated responses to verbal cues.
See:
The Octopus Test for Large Language Model AIs
I see you replied to one part while ignoring the whole *it fucking solves word problems* part.
I reacted to the part of your reply to bilby that I think you got wrong--your "reification of language into behavior" claim. Mimicry of intelligent responses to human expressions is not the same thing as language understanding. The Octopus Test link explains why. Perhaps it would help if you reflected on the nature of reification fallacies. An anthropomorphic fallacy (aka pathetic fallacy) is a type of reification fallacy.
Interestingly, this is addressed in the post that began this exchange:
Much like a scientist publicly stating that they believe in a particular psychic, their self-image becomes intertwined with their belief in that psychic. Any dismissal of the phenomenon will feel to them like a personal attack.
As I have said, it's a no-true-Scotsman to call *actually doing the task* "mere mimicry".
At some point you have to accept that regardless of what actually happens inside the box, observing the box yields the expected results: it solves the word problems.
You both commit *anthropocentric* fallacies: specifically, the fallacy of tying the thing itself to one particular reification of it, namely the reification that humans happen to engender.
I have not made any appeal to human-like-ness, but rather argued that humans have a something-like-ness that AI also happens to have.
These are not the same thing.
My assertion is that humans are 100% "linguistic". The human activity, and in fact the activity of all agents, is to develop vocabularies about how things are the same and different.
Most life, humans included, additionally has structures built up inside it which share parity with the vocabularies we build, and when these resonate against some rhythm of access to our internal token structures, we end up with some evolved intuition on the subject that interconnects with much of our other innate vocabularies.
For everything from the statements people make as they walk in a cluster, to our facial expressions, to the specific neurons excited by looking at a thing, we have vocabularies which connect shapes and patterns of sensation to some much higher-dimensional vector representation of that data, where the vector is composed of some pattern component and strength across a wide cross-section of neurons.
This itself forms a sort of descriptive vocabulary within the neural system at various parts.
But at that point, there's nothing differentiating this kind of information and interaction from what happens in a transformer model with attention, though I expect much of the process is very wasteful because of the ridiculous ways pseudo-recursion and pseudo-self-review would have to function in such a system.
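To ground what "this kind of information and interaction" looks like on the transformer side, here is a minimal sketch of scaled dot-product self-attention in plain NumPy. It is illustrative only: the dimensions, names, and random data are mine, not any particular model's, but the mechanism is the standard one, where each token's output vector is a similarity-weighted blend of every token's value vector.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors.

    X: (seq_len, d_model), one vector per token.
    Wq, Wk, Wv: (d_model, d_head), learned projections.
    Returns (seq_len, d_head): each row integrates information from all tokens,
    weighted by query/key similarity.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise relevance between tokens
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # blend value vectors accordingly

# Toy example: 4 "tokens" embedded in 8 dimensions, projected to a 4-dim head.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 4)
```

The point of the sketch is just that the "vocabulary" lives in the vectors and projections; the attention step is pure information integration across them.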
We both act as information integrators, and the information we integrate is ultimately broken down into vectors and features, even though we also evidently get a live, corrected feed of at least some subset of the sensory surface itself as well... Not that transformer systems would lack that either.
I see no mechanical reason to think that they lack some capability of "understanding", seeing that the way humans accomplish "understanding" is very similar... But rather than having evolution beat those structures into us over eons of trial and error, the vocabulary structures were beaten into the model by generations of gradient descent applied liberally to token streams until vector representations precipitated, and it is given things like tokenizers and CLIP models for extracting and packing vector representations to and from tokens or features.
Understanding happens in the transformations on the underlying vector representations of the tokens, or concepts, or whatever you want to call them. That's where it's going to happen, if anywhere: by having a model which forces responses to conform to some logic on the input space rather than to a lookup table. That's what innate understanding is, the same way a dog knows how to catch a ball despite the fact that knowing where to go involves some manner of calculus. Explicit understanding is just when you can describe in mathematical terms why...
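As a toy illustration of the lookup-table contrast (a hedged sketch; the rule, data, and names are invented for the example, not drawn from any real model): a table can only answer for inputs it has literally memorized, while even a crude fitted model imposes a rule on the whole input space and so answers sensibly for points it never saw.

```python
import numpy as np

# The hidden rule is y = 3x + 1; neither approach is told this.
train_x = np.array([0.0, 1.0, 2.0, 3.0])
train_y = 3 * train_x + 1

# Approach 1: a lookup table. It only knows exactly what it memorized.
table = dict(zip(train_x.tolist(), train_y.tolist()))
print(table.get(2.0))   # 7.0
print(table.get(2.5))   # None -- no entry, no answer

# Approach 2: a least-squares fit over the same data. It encodes a rule
# on the input space, so it generalizes to inputs it never saw.
slope, intercept = np.polyfit(train_x, train_y, deg=1)
print(slope * 2.5 + intercept)  # ~8.5, i.e. 3*2.5 + 1
```

That difference, conforming to a logic on the input space rather than recalling entries, is what I mean by innate understanding living in the transformations.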
But we don't require humans to have explicit understanding in the first place in order to grant them "understanding". Explicit understanding is rare even among humans.
We can even see them reason it out, count out the right number of Rs in "strawberry", and then, just like a human, fall back on some wrong understanding and completely disregard the evidence of their own actions and the output of their reasoning in favor of the thing they "feel" is most correct. (I love seeing ChatGPT transcripts where it's asked to count the Rs in "strawberry", gets the right answer through a careful process with shown work, and then throws the 3 away in favor of its "intuition" that there are 2.)
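(For the record, the careful process really does land on 3; the snippet below is just the character-level count. The subword split shown in the comment is hypothetical, since the exact pieces depend on the tokenizer, but it illustrates one commonly cited reason the model's "intuition" never sees individual letters at all.)

```python
word = "strawberry"

# The "careful process" version: count at the character level.
print(word.count("r"))                               # 3
print([i for i, c in enumerate(word) if c == "r"])   # [2, 7, 8]

# An LLM, by contrast, typically receives subword tokens rather than letters,
# e.g. something like ["str", "aw", "berry"] (hypothetical split; the exact
# pieces depend on the tokenizer), so individual letters are never directly
# visible to whatever "intuition" it falls back on.
```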