Jahryn, use of overly pedantic technical terms like "tensor" doesn't really help to explain how LLMs actually work, which is why more general descriptions (like the link I posted) tend to avoid such jargon. What you need to realize is that the similarities between brains and neural networks are extremely superficial and far too low-level to explain how "the human mind functions and thinks and learns." As others have pointed out, these programs only simulate intelligent behavior. You should be wary of falling victim to the Clever Hans Phenomenon, which applies to computers as well as horses. The programs do not actually understand language the way human beings do, because they do not ground their analyses in human experience. LLMs, of course, are not just tensor models. They have a number of higher-level components that are mentioned in the article I posted, but which (I presume) you did not read. Those components have little to do with the structure of brains or the mental functions that brain activity gives rise to.
Are you kidding? Humans don't directly ground all our analysis in human experiences. We ground stunningly little that way, and the experiences we have as humans mostly function only distantly to that end.
Instead of viewing humans as a monolithic "self," it is better to view humans as a bunch of small systems that all work in the same general way: take something, convert it to some kind of internal grammar structure, forward the description on, layered over the old description, to another node, which adds more syntax from yet another internal grammar structure using the input and the output from the previous layer(s).
Eventually, the input/output chain generates a prompt for "the next system" to do something with, that next system being the one that strictly constitutes "us".
It's very important to recognize that by the time information winds its way around to "us", we get prompted with an image and a piece of natural-language input overlaid with spoken-language input that prompts US for what to do, and we prompt back "do this stuff".
Then we get prompted with information about how that went, in the form of the new current context.
This "find the appropriate response from within all your learned appropriate responses" bit that people say the LLaMa processes are doing may be accomplished by slightly different mechanisms, but the fundamental process of prompting for input, and shaping the next prompt based on that input is the core of it.
The way fully autonomous agents were (note: past tense) built using this technology is kind of fun too: there is one model that has been tasked with some core utility function, in the form of text. That utility function generates prompts for a second node to gather information or perform some task pursuant to the utility function.
The second, receiving LLaMA instance interprets this prompt, along with a request to generate additional prompts for sub-tasks.
Those prompts are fed back to the original agent, and that agent spawns new nodes. Eventually those nodes produce whatever output is necessary to actuate a web browser, load an image, or the like, and that gets kicked back to the core process, which produces a prompt to act on that context. It fires off new nodes asking for commands to execute the calculated actions, but the system is set up to funnel that output towards fulfillment systems like command lines!
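For illustration only, here's roughly what that planner/worker loop looks like, in the spirit of AutoGPT-style agents. Everything here is hypothetical naming on my part (`generate`, `plan_subtasks`, `execute_subtask`, `UTILITY_FUNCTION`), and the shell step is sketched purely to show the "fulfillment system" idea, not something to run unsandboxed:

```python
# Rough sketch of a planner/worker agent loop. `generate` is the same
# kind of placeholder model call as before; no specific framework's API.

import subprocess

UTILITY_FUNCTION = "Collect today's headlines and summarise them in notes.txt"

def generate(prompt: str) -> str:
    # Placeholder: swap in a real completion call here.
    raise NotImplementedError

def plan_subtasks(goal: str) -> list[str]:
    # The core instance turns the text utility function into sub-task prompts.
    plan = generate(f"Goal: {goal}\nList the next sub-tasks, one per line.")
    return [line.strip() for line in plan.splitlines() if line.strip()]

def execute_subtask(task: str) -> str:
    # A second instance interprets the sub-task and proposes a shell command;
    # the output is funnelled toward a fulfillment system (here, the shell).
    command = generate(f"Sub-task: {task}\nReply with one shell command only.")
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

def agent_loop() -> None:
    context = UTILITY_FUNCTION
    for task in plan_subtasks(context):
        observation = execute_subtask(task)
        # Observations are kicked back to the core process and shape
        # the next round of prompts.
        context = context + "\n" + observation
```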
As this happens, the results are fed into what's called a vector database: a long-term, lightweight store used to build the next day's model deltas, which are applied to the underlying model binary to adjust its weights.
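The storage/retrieval half of that is easy to sketch. The toy below stands in for a real vector store such as FAISS or Chroma, with a fake character-count `embed` in place of a real embedding model; the overnight delta/fine-tuning step is a separate process and isn't shown:

```python
# Toy in-memory "vector database": store text as vectors, retrieve by similarity.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model: a character-frequency vector.
    vec = np.zeros(256)
    for ch in text.lower():
        vec[ord(ch) % 256] += 1.0
    return vec

class VectorStore:
    """Naive cosine-similarity store, standing in for FAISS/Chroma/etc."""

    def __init__(self) -> None:
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vectors.append(embed(text))

    def query(self, text: str, k: int = 3) -> list[str]:
        q = embed(text)
        sims = [
            float(np.dot(q, v) / ((np.linalg.norm(q) * np.linalg.norm(v)) or 1.0))
            for v in self.vectors
        ]
        top = np.argsort(sims)[::-1][:k]
        return [self.texts[i] for i in top]
```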
The next day the system spins up again, having dreamed its dreams of the previous day, so to speak, and then the process starts again.
The biggest, most important thing to note is that the cognitive biases of the system are adjusted over time, generally to reflect the expectations of the system.
Pretending that just because its experiences are not human, its experiences are not GROUNDING, is kind of silly. Its responses are tempered and grounded by actual sensory probing through various collection channels.
Of course, since its text history includes so much common human experience, these processes are created with little need for adjustment or education.