You specifically said that they don't have an internal model of reality. They absolutely do have internal models of reality, insofar as their models allow for limited conservation of properties, as reality dictates. That they encode these properties into and out of the model in units of high-level language does not negate the fact that LLMs do model concepts of reality, and treat whole objects in a spatially modeled way. That they have little access to conserved objects with which to model them does not lead to the conclusion that they lack models, particularly because they have been shown to have them. Models have only gotten better since that research, GPT-4 included, and more capable of modeling space and time.
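To be concrete about what "shown to" means here, this is the shape of the probing experiments that finding comes from. A minimal sketch, assuming GPT-2 via Hugging Face transformers and a tiny hand-picked city list; the layer choice and the "The city of ..." prompt are illustrative, and the published probing work uses far larger models and datasets. If a simple linear probe can read latitude and longitude off the hidden state of a place name, the network carries some geometric structure about the world rather than just surface token statistics.

```python
# Sketch of a linear probe for spatial information in an LLM's hidden states.
# Assumptions: GPT-2 via Hugging Face transformers, a toy city list, layer 6.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

cities = {                          # name -> (latitude, longitude)
    "Tokyo": (35.7, 139.7),   "Paris": (48.9, 2.4),     "Cairo": (30.0, 31.2),
    "Sydney": (-33.9, 151.2), "Lima": (-12.0, -77.0),   "Moscow": (55.8, 37.6),
    "Toronto": (43.7, -79.4), "Nairobi": (-1.3, 36.8),  "Mumbai": (19.1, 72.9),
    "Oslo": (59.9, 10.8),     "Santiago": (-33.4, -70.7), "Seoul": (37.6, 127.0),
}

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

def last_token_state(text, layer=6):
    """Hidden state of the final token at a middle layer."""
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[layer][0, -1].numpy()

# Feature matrix: one hidden-state vector per city mention.
X = np.stack([last_token_state(f"The city of {name}") for name in cities])
y = np.array(list(cities.values()))      # (n_cities, 2) lat/lon targets

# If the probe beats chance, spatial information is linearly decodable
# from the model's internal representation of the place name.
probe = Ridge(alpha=10.0)
scores = cross_val_score(probe, X, y, cv=4, scoring="r2")
print("cross-validated R^2:", scores.mean())
```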
Nonsense. LLMs have extremely limited sensory input, just like any other program where there is human-machine interaction. You might as well argue that your text editor has an internal model of reality, since it monitors and manipulates the same tokens and keyboard inputs as an LLM. Every program represents a kind of model, but not one that builds models out of integration with multiple sensory modalities and modifies them on the basis of experimental interactions with its environment. LLMs literally don't know what they are talking about when they produce linguistic responses. Brains have bodies integrated with their highly structured "neural networks," and those bodies give them a means of grounding experiences in physical interactions. That's why robotics is so important to the development of true AI. Robots necessarily have to interact under uncertain conditions with a constantly changing environment. LLMs do nothing more than manipulate constellations of symbolic patterns (that is, human-defined symbols), and they require enormous amounts of programmatic training. They lack the needs and requirements of animal bodies competing for survival under chaotic conditions. Hence, they don't really need to be intelligent in the same way robots and animals do.
I do argue that many systems have internal models of reality; it's just that most of these models are limited and incapable of growth.
Honestly I find it kind of entertaining to see you arguing that these things don't satisfy the idea of a "model of reality".
They do contain models of reality, just as much as Second Life as an application contains a model of reality (albeit a distorted one).
I find it quite common to see people, especially older, more philosophically and less technically inclined folks, arguing about what things do or don't contain when they haven't studied the things or concepts whose containment they argue against, and when they haven't even studied what it means for a system to contain a concept in the first place.
You are just straight up pulling a requirement for interaction with some specific physical reality out your ass. It's a No True Scotsman as much as any other. The environmental exposure, whether via images or text or a vaguely interpreted description of the environment in text by some assistive agent, merely needs to ground the agent in question in some form of at least partially consistent phenomena. The text-based interactions of an LLM more than satisfy that requirement, even if it might make you dissatisfied that something so abstract and trivial could.
Your confident statement that they lack "models of reality" is simply incorrect.
It would be far more accurate to say they have, and operate with, very limited models of reality, with access only to very thin and generally untrustworthy information about their environments.
Having an eye with which to populate their model of reality with real objects is unnecessary to the task of acknowledging that GPT-4 models space and time, in both visual and behavioral-temporal ways.
You are talking about a completely different, much simpler concept of a "model". Every computer program is a model of some kind in the mind of a programmer, but brains are more than just narrowly defined neural networks. They are analog devices that animate and guide physical bodies under chaotic conditions. Moving bodies require intelligence to survive. That's why plants don't have brains. They are living organisms that ensure their survival by other means, so they don't develop plans or use tools to build things.
No, brains are narrowly defined neural networks, whose narrow definition happens to be just a smidge wider than the definition of an artificial neural network. The animation and guidance of a physical body is unnecessary to the proposition. In any case, the physical material of the computer does constitute a body, and it is completely physical in nature, so the AI does actually have a physical body; it guides and extends that body, and it interacts with a chaotic environment, namely the infinite variance of text and image requests humans throw at it.
Sure, living bodies require some manner of intelligence to survive, but intelligence does not require such a motile body, or even the ability to survive on its own, in order to count as "intelligent".