Artificial intelligence paradigm shift

Here's another useful program. It takes ChatGPT 3.5 (the free version) and sets up a local instance that doesn't communicate with the outside world. The good thing about this is that you can tell it all your company secrets. You can just tell it the basic info on how your company works and it'll optimise all the processes for you, in great detail and very comprehensively. Better than any human could. It's not as good as ChatGPT 4. It's way worse. But at least you aren't leaking your company secrets to the world. If you're nervous about there being a backdoor, you can put it on a computer without Internet access. When you delete it, anything you've told it is gone. All totally FREE!!!

https://github.com/nomic-ai/gpt4all
How do you know it is not connected to the outside world? The only safe computer is one that is turned off and any internal batteries removed.
So many problems are caused by the assumption that it is not connected to the outside world.

The people who use this in corporations that handle sensitive data only run it on computers not connected to the Internet. It's what they recommend.
 
There are basically two overall artificial-intelligence strategies: top-down and bottom-up.

Top-down strategies were the first, inspired by theorizing about the nature of reasoning, theorizing that goes back to Aristotle, some 2,300 years ago, at least.

That works well when one can easily state explicit inference rules, as in computer-algebra software, but for many AI applications, finding such rules is very difficult. That's true even where it might seem easy, like in natural-language translation.

Bottom-up approaches express inference rules implicitly, as the contents of some function with lots of parameters. One then adjusts that function's parameters to get a good fit. That can be computationally expensive, which is why bottom-up AI took some time to take off.
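To make the bottom-up idea concrete, here is a minimal sketch in Python (the tiny dataset and the straight-line model are just illustrative assumptions, nothing a real system uses): start with a parameterized function and repeatedly nudge its parameters to reduce the error on training examples.

Code:
# Bottom-up learning in miniature: fit y ~ w*x + b by gradient descent.
# The data and the model are toy assumptions, chosen only to show the idea
# of "adjust the parameters of a function until it fits".

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # roughly y = 2x + 1

w, b = 0.0, 0.0   # parameters, initially arbitrary
rate = 0.05       # learning rate

for step in range(2000):
    grad_w = grad_b = 0.0
    for x, y in data:
        err = (w * x + b) - y       # prediction error on this example
        grad_w += 2 * err * x       # d(error^2)/dw
        grad_b += 2 * err           # d(error^2)/db
    w -= rate * grad_w / len(data)  # nudge the parameters downhill
    b -= rate * grad_b / len(data)

print(w, b)  # ends up close to 2 and 1

Real bottom-up systems differ mainly in scale: far more parameters, far messier functions, and far more data, but the fit-by-adjustment loop is the same.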

A further problem:
 Moravec's paradox
Moravec's paradox is the observation in artificial intelligence and robotics that, contrary to traditional assumptions, reasoning requires very little computation, but sensorimotor and perception skills require enormous computational resources. The principle was articulated by Hans Moravec, Rodney Brooks, Marvin Minsky and others in the 1980s. Moravec wrote in 1988, "it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility".[1]

Similarly, Minsky emphasized that the most difficult human skills to reverse engineer are those that are below the level of conscious awareness. "In general, we're least aware of what our minds do best", he wrote, and added "we're more aware of simple processes that don't work well than of complex ones that work flawlessly".[2] Steven Pinker wrote in 1994 that "the main lesson of thirty-five years of AI research is that the hard problems are easy and the easy problems are hard."[3]
 
This is what is changing. You are describing the AI of a year ago. We're at the point where AI can be assumed to outperform human intelligence in narrowly defined domains. The USP of humans now is as generalists.

Well, an abacus can outperform intelligent humans at calculating sums. Outperforming humans at tasks that humans want to accomplish is not something new to technology. It's actually what motivates us to build machines. That doesn't mean that the machines themselves are intelligent in a human or animal sense.

I don't understand why you said that or why you think it's relevant to this discussion.

Artificial intelligence is just the catchy name for narrowly defined machine learning. Yes, I agree that it's not actual intelligence. I never said it was. Just like an orgasmatron has most likely never given anyone an orgasm. It's a name that has caught on.

I have something of a pet peeve regarding the attitude that artificial intelligence is just a catchy name for one narrow aspect of the field of AI. There are actually courses taught and books written on the subject. LLMs are just one particular programming technique that has been inspired by AI, a field devoted to making machines actually think. That was the question that Alan Turing originally asked when he kicked off the field of AI many decades ago. Can machines think? Not just calculate. Think.

The answer at present is "possibly, but they can't right now. We're working on it." LLMs represent a class of programs based on the manipulation of word tokens in a body of text. They construct clusters of word tokens that seem related to concepts that we associate with words, but they do not in any sense acquire the pragmatic world knowledge that intelligent animals use to model reality. They are very good at giving the illusion of thought, but they still represent little more than very sophisticated parlor tricks--a stunning advance over Joseph Weizenbaum's original chatbot program named ELIZA, which was nothing more than a simple pattern matcher that used templates to construct responses.
 
There are basically two overall artificial-intelligence strategies: top-down and bottom-up.

Top-down strategies were the first, inspired by theorizing about the nature of reasoning, theorizing that goes back to Aristotle, some 2,300 years ago, at least.

That works well when one can easily state explicit inference rules, as in computer-algebra software, but for many AI applications, finding such rules is very difficult. That's true even where it might seem easy, like in natural-language translation.

Bottom-up approaches express inference rules implicitly, as the contents of some function with lots of parameters. One then adjusts that function's parameters to get a good fit. That can be computationally expensive, which is why bottom-up AI took some time to take off.

These are hill climbing strategies, which are among the first taught in an introduction to artificial intelligence programming. I would say that you need to add parallelism as a third strategy, because it actually allows you to combine both programming techniques as you go. The problem with top-down is that backtracking can also be computationally expensive. That is, you can go down a "garden path" and have to unwind back to a previous state to go down a new path. Hence, recursive programming became essential to building intelligent programming techniques. Lisp became the premier programming language for AI because it was an extensible language with sophisticated "garbage collection"--reclaiming memory no longer in use--during the unwinding process. The other advantage of Lisp was its grounding in linked-list data structures, which meant that it could easily be used to reason over structures of indeterminate length and to mimic the kind of associative chaining that underpins intelligent thought. That is, human cognition is fundamentally associative in nature.
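For anyone who hasn't seen it, hill climbing itself fits in a few lines of Python. This is only a sketch; the one-dimensional integer search space and the scoring function are made-up stand-ins, and its well-known weakness, getting stuck on a local peak, is exactly why backtracking and other strategies matter.

Code:
# Hill climbing in miniature: keep moving to the best-scoring neighbor and
# stop when no neighbor improves the score.  The search space and scoring
# function are arbitrary stand-ins, just to show the control flow.

def score(x):
    return -(x - 7) ** 2          # a single peak at x = 7

def hill_climb(start, step=1):
    current = start
    while True:
        neighbors = [current - step, current + step]
        best = max(neighbors, key=score)
        if score(best) <= score(current):
            return current        # no neighbor is better: a (local) peak
        current = best

print(hill_climb(0))              # prints 7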

A further problem:
 Moravec's paradox
Moravec's paradox is the observation in artificial intelligence and robotics that, contrary to traditional assumptions, reasoning requires very little computation, but sensorimotor and perception skills require enormous computational resources. The principle was articulated by Hans Moravec, Rodney Brooks, Marvin Minsky and others in the 1980s. Moravec wrote in 1988, "it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility".[1]

Similarly, Minsky emphasized that the most difficult human skills to reverse engineer are those that are below the level of conscious awareness. "In general, we're least aware of what our minds do best", he wrote, and added "we're more aware of simple processes that don't work well than of complex ones that work flawlessly".[2] Steven Pinker wrote in 1994 that "the main lesson of thirty-five years of AI research is that the hard problems are easy and the easy problems are hard."[3]

Moravec's paradox is particularly interesting, because we know that concepts are largely about how the body interacts with reality through the senses--vision, hearing, taste, smell, touch, etc. For example, color terms tend to be linked to other sensory modalities (e.g. red to warmth and blue to cold) because of the way the body senses objects in its environment. So-called deep learning techniques (which is what LLMs are based on) are designed to build those kinds of associations. They fit well with the idea of embodied cognition.
 
Natural-language translation was first done in top-down fashion, with explicit rules, which required a lot of work to compose. More recently it has been done in bottom-up fashion, with statistical techniques that use large amounts of parallel text.

The older autotranslators were very limited in which natural languages they supported, largely because of a very limited real or perceived market for them, or else a very limited set of languages spoken by potential developers. The newer ones can support many more languages, likely because statistical techniques need less language-specific work.

Google now supports over 130 languages, with Bing not being very far behind. I've collected a list of what languages are supported by various online autotranslators, and I've found some patterns. Not surprisingly, the languages with the most online speakers are the best represented.

Indo-European is very well-represented, not only in Europe but also in the Middle East and in the Indian subcontinent. Not only present-day languages, but also some prestigious past ones: Latin and Sanskrit.

There is plenty of other coverage of Eurasia, and also some of Africa and Oceania, though the Oceanian ones are all Austronesian languages. There are a few from the Americas, but they are outnumbered by the African and Oceanian ones, and there are no indigenous New Guinean or Australian languages in these autotranslators.
 
For large language models, is there any work on trying to simplify or modularize them? Like create some module that interprets grammar and basic vocabulary, and then create more specialized modules with the help of this basic module: a legal module, an info-tech module, etc. Info-tech would include website management, database management, software development, ...

Or put in some explicit grammatical rules to ease the training of an LLM. Rules like how to recognize a noun phrase or a verb phrase, at least most of the time.
 
Back in the late 1970s, the BBC and James Burke created a TV series named "Connections". It was a history of science and invention. One episode about information started with a portrayal of an English court in the Middle Ages. A young man had just turned 21 and was suing his uncle in order to claim his inheritance from his father's estate. The point of the scene was that, in a world of nearly universal illiteracy, written documents had little value. Before a document could be taken as evidence, a witness had to attest to what it said, and what's more, who wrote it and where it came from. The testimony of a trusted person held real value.
Stupid money grab by one of her relatives. They produced a document supposedly written by a long-deceased relative--despite the fact that it was known that said relative did not know how to write. (He could read--but given the nature of Chinese that doesn't really translate into being competent at writing.)
 
For large language models, is there any work on trying to simplify or modularize them? Like create some module that interprets grammar and basic vocabulary, and then create more specialized modules with the help of this basic module: a legal module, an info-tech module, etc. Info-tech would include website management, database management, software development, ...

Or put in some explicit grammatical rules to ease the training of an LLM. Rules like how to recognize a noun phrase or a verb phrase, at least most of the time.

That's a good question. I've worked extensively with sophisticated syntactic parsers, but those are very complex and expensive to maintain, because people like myself with detailed knowledge of how language systems work are hard to come by. Shallow parsers that recognize smaller phrases are faster and cheaper, but they simply don't do a good job of handling longer sentences with lots of quantifiers, negation, and coordination. Statistical methods such as LLMs are pretty good at assigning meaning to large windows of text, but they still have a lot of trouble with finer-grained analysis, and that is why they don't have a good track record in text-tagging applications.

Nevertheless, there seems to be a consensus that the best language processing systems out there need to be hybrid systems that combine sophisticated syntactic analysis with statistical relationships that resolve ambiguities by assigning word tokens to conceptual "buckets". The problem with the latest crop of LLMs is that they are extremely proprietary and protected, so it isn't easy to get information on their inner workings. I would be very surprised if the current crop did not rely heavily on some of the fairly decent shallow parsing systems that are out there. I can't imagine how they would work without them.
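To give a concrete, deliberately crude picture of what a shallow parser does, here is a toy noun-phrase chunker in Python. The tag set and the hand-tagged sentence are my own assumptions; a real system would get its part-of-speech tags from a tagger and use far richer patterns.

Code:
# Toy shallow parser: group a (determiner)(adjectives)(nouns) run into an NP.
# The part-of-speech tags are supplied by hand; a real pipeline would produce
# them with a tagger and would use much richer chunking rules.

tagged = [("the", "DT"), ("old", "JJ"), ("parser", "NN"),
          ("handles", "VB"), ("a", "DT"), ("short", "JJ"),
          ("noun", "NN"), ("phrase", "NN")]

def chunk_nps(tokens):
    chunks, i = [], 0
    while i < len(tokens):
        j = i
        if j < len(tokens) and tokens[j][1] == "DT":     # optional determiner
            j += 1
        while j < len(tokens) and tokens[j][1] == "JJ":  # any adjectives
            j += 1
        start_nouns = j
        while j < len(tokens) and tokens[j][1] == "NN":  # one or more nouns
            j += 1
        if j > start_nouns:
            chunks.append([word for word, tag in tokens[i:j]])
            i = j
        else:
            i += 1
    return chunks

print(chunk_nps(tagged))
# [['the', 'old', 'parser'], ['a', 'short', 'noun', 'phrase']]

Even this toy shows the trade-off: it is fast and easy to maintain, but it has no hope with the quantifiers, negation, and coordination mentioned above.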
 
Another one I used was to have ChatGPT create a packing list for a campsite. Awesome. Took me all of ten seconds to make. Turned out to be really useful.
You do realize that it effectively looked on the internet for said packing list?
One of the claims of AI proponents was that ChatGPT is currently able to replace "junior" programmers. So I decided to test it. I don't really have access to the latest and the greatest, so I tested one of these free ChatGPTs.
I quickly realized that it was little more than a search engine over StackOverflow.
It's a very good search engine, but it does not really understand nuances.
It's a search engine.
 
The field of AI development has had ups and downs, with some "AI winter" periods of low support for AI research. The two major ones were in 1974–1980 and in 1987–2000, and the article lists these smaller-scale ones:
  • 1966: failure of machine translation
  • 1969: criticism of perceptrons (early, single-layer artificial neural networks)
  • 1971–75: DARPA's frustration with the Speech Understanding Research program at Carnegie Mellon University
  • 1973: large decrease in AI research in the United Kingdom in response to the Lighthill report
  • 1973–74: DARPA's cutbacks to academic AI research in general
  • 1987: collapse of the LISP machine market
  • 1988: cancellation of new spending on AI by the Strategic Computing Initiative
  • 1990s: many expert systems were abandoned
  • 1990s: end of the Fifth Generation computer project's original goals
Failure of machine translation? It was more difficult than it originally seemed, I'm sure.

Perceptrons, a 1969 book by Marvin Minsky and Seymour Papert
A famous book that seemed to show that artificial neural networks (ANNs) could not do very much. ANNs are bottom-up AI, of course.

A single-layer ANN is essentially a linear classifier with a soft threshold -- for each data point, it finds out which side of a classification hyperplane it is on (line in 2D, plane in 3D, ...). Soft threshold means that the output does not jump between 0 and 1 as one crosses this hyperplane, but instead gradually increases or decreases.

It is very easy to find problems that such a system will fail on, like points tagged 1 being in a blob and points tagged 0 being outside that blob. Another one is the exclusive-or problem, XOR for short:
  • 0 0 - 0
  • 0 1 - 1
  • 1 0 - 1
  • 1 1 - 0
That book squelched interest in ANNs for over a decade. But when interest in ANNs revived, in the 1980s, it was with workarounds that made such problems tractable.

A simple one is to have two layers of classifiers or "neurons", the first layer's outputs being the second layer's inputs. That can easily do the blob problem, by the first or "hidden" layer surrounding the blob with hyperplanes, and the output layer having a unit that tests for all the hidden ones reporting "toward the blob".

This is usually called a three-layer architecture, after the layers of values that it works from and/or produces. A 1960s perceptron had only two layers, an input layer and an output layer, while this kind has three: an input layer, a hidden layer, and an output layer.
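As a concrete illustration, here is a hand-wired hidden-layer network that computes XOR, which no single-layer perceptron can. The weights are just one workable choice (not the result of training), and it uses a hard threshold rather than the soft one described above, purely for readability.

Code:
# A hand-wired input/hidden/output network for XOR.  The weights are one
# workable choice, not trained, and the threshold is hard for simplicity.

def step(x):
    return 1 if x > 0 else 0

def xor_net(x1, x2):
    h_or  = step(x1 + x2 - 0.5)      # hidden unit 1: fires if x1 OR x2
    h_and = step(x1 + x2 - 1.5)      # hidden unit 2: fires if x1 AND x2
    return step(h_or - h_and - 0.5)  # output: OR but not AND, i.e. XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))   # outputs 0, 1, 1, 0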

There are lots of variations, like classifier units with different kinds of thresholding and construction. Like "radial basis", which returns close to 1 for a point near some reference point and close to 0 far away from it.

Also variations in connectivity, like multiple hidden layers and direct input-output connections.

An interesting one is cascade correlation. One starts out with no hidden units, and one tries out and adds hidden units one by one, each one using the inputs and the previous hidden units' outputs. As each one is added, its parameters are frozen, making training much easier.

Also, recurrent neural networks, for learning sequences. These have hidden units whose "old" values are used as inputs and which then receive "new" values.
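A minimal sketch of that recurrence (the weights and the input sequence here are arbitrary stand-ins): the hidden value computed at one step is fed back in as an extra input at the next step.

Code:
# Recurrence in miniature: one hidden unit whose old value is an extra input.
# The weights and the input sequence are arbitrary; the point is only the
# feedback of the hidden state from one step to the next.

import math

w_in, w_rec = 0.8, 0.5               # input weight and recurrent weight

def run(sequence):
    h = 0.0                          # initial hidden state
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)   # new state from input + old state
        print(f"input {x:+.1f} -> hidden {h:+.3f}")
    return h

run([1.0, 1.0, -1.0, 0.0])           # later states still "remember" earlier inputs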

ANNs must be trained on some dataset, and there are several ways to do so. A common way is "backpropagation" or "backprop", adjusting parameter values using the derivatives of the overall error with respect to those parameters. The name comes from how those derivatives are calculated: by going backwards through the classifier functions.
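Here is one backprop step on the smallest network that shows the backward pass: a single input feeding one hidden sigmoid unit feeding one output unit. The weights, the training pair, and the learning rate are arbitrary assumptions; the point is how each gradient reuses the derivative already computed downstream of it.

Code:
# One backprop step: input x -> hidden sigmoid h -> output sigmoid y,
# with squared error.  Values are arbitrary; the point is the backward pass.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 1.0, 1.0                 # one training example
w1, w2 = 0.3, -0.4                   # hidden-layer and output-layer weights
rate = 0.5

# forward pass
h = sigmoid(w1 * x)                  # hidden activation
y = sigmoid(w2 * h)                  # network output
error = 0.5 * (y - target) ** 2

# backward pass: chain rule, working from the output back toward the input
d_y  = (y - target) * y * (1 - y)    # d(error)/d(output pre-activation)
d_w2 = d_y * h                       # gradient for the output weight
d_h  = d_y * w2 * h * (1 - h)        # pushed back through the hidden unit
d_w1 = d_h * x                       # gradient for the hidden weight

w1 -= rate * d_w1                    # gradient-descent update
w2 -= rate * d_w2
print(error, w1, w2)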

Backprop by itself is not a very efficient optimizer, and there are more efficient methods, like conjugate-gradient and quasi-Newton methods, though one has to repeatedly cycle through the entire training set when one uses them. If one does repeated random selection from a large set, then one is stuck with backprop.

In general, quasi-Newton methods require O(N^2) of memory, for N values to optimize, instead of O(N) for conjugate gradient and the like, but there exist low-memory versions that require only O(N), like limited-memory BFGS (L-BFGS).
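In practice one rarely codes these optimizers by hand. As a sketch (assuming SciPy is available; the quadratic objective below is just a stand-in for a network's training error), a limited-memory quasi-Newton fit looks like this:

Code:
# Limited-memory BFGS via SciPy on a stand-in objective.  A real use would
# plug in the network's total training error and its gradient instead.

import numpy as np
from scipy.optimize import minimize

targets = np.array([2.0, -1.0, 0.5])        # pretend "ideal" parameter values

def loss(params):
    return np.sum((params - targets) ** 2)  # stand-in for total training error

def grad(params):
    return 2.0 * (params - targets)         # its gradient, supplied analytically

result = minimize(loss, x0=np.zeros(3), jac=grad, method="L-BFGS-B")
print(result.x)                             # close to [2.0, -1.0, 0.5]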
 
This is what is changing. You are describing the AI of a year ago. We're at the point where AI can be assumed to outperform human intelligence in narrowly defined domains. The USP of humans now is as generalists.

Well, an abacus can outperform intelligent humans at calculating sums. Outperforming humans at tasks that humans want to accomplish is not something new to technology. It's actually what motivates us to build machines. That doesn't mean that the machines themselves are intelligent in a human or animal sense.

I don't understand why you said that or why you think it's relevant to this discussion.

Artificial intelligence is just the catchy name for narrowly defined machine learning. Yes, I agree that it's not actual intelligence. I never said it was. Just like an orgasmatron has most likely never given anyone an orgasm. It's a name that has caught on.

I have something of a pet peeve regarding the attitude that artificial intelligence is just a catchy name for one narrow aspect of the field of AI. There are actually courses taught and books written on the subject. LLMs are just one particular programming technique that has been inspired by AI, a field devoted to making machines actually think. That was the question that Alan Turing originally asked when he kicked off the field of AI many decades ago. Can machines think? Not just calculate. Think.

The answer at present is "possibly, but they can't right now. We're working on it." LLMs represent a class of programs based on the manipulation of word tokens in a body of text. They construct clusters of word tokens that seem related to concepts that we associate with words, but they do not in any sense acquire the pragmatic world knowledge that intelligent animals use to model reality. They are very good at giving the illusion of thought, but they still represent little more than very sophisticated parlor tricks--a stunning advance over Joseph Weizenbaum's original chatbot program named ELIZA, which was nothing more than a simple pattern matcher that used templates to construct responses.
I have also done a course in machine learning at uni. The problem early AI researchers had is that they assumed the human brain was rational and then tried to copy it. Now we don't think the human brain is rational. So we're solving a different problem now.

But it's working. That's what's cool and different today. It's no longer sci-fi. No, it's not what we envisaged in the 1950s. But it's still cool. And useful. And mind-blowingly powerful.

Talking with my researcher friends, they have now all switched from saying "machine learning" to saying "AI". It's just the word now.
 
Another one I used was to have ChatGPT create a packing list for a campsite. Awesome. Took me all of ten seconds to make. Turned out to be really useful.
You do realize that it effectively looked on the internet for said packing list?
One of the claims of AI proponents was that ChatGPT is currently able to replace "junior" programmers. So I decided to test it. I don't really have access to the latest and the greatest. So I tested one these free ChatGPTs.
I quickly realized that it was a little more than a search engine over StackOverflow.
It's very good search engine but it does not really understand nuances.
It's a search engine.

Sure. But they're doing a good job.

At the senior management meeting at work yesterday we acknowledged that within a year experience with AI tools will be mandatory. Just to be able to effectively carry out any task at work.
 
Talking with my researcher friends, they have now all switched from saying "machine learning" to saying "AI". It's just the word now.

If your friends are largely working with text-mining statistical analyses, I can understand that. It is the technology du jour, and it is the press who made "AI" synonymous with these advanced chatbots. Your friends are just basking in the spotlight. However, this technology isn't very good for command-and-control interfaces with autonomous machines, nor can it be easily adapted to applications in high demand, such as text tagging and real-time situational awareness. For the time being, it is a lot of fun to play with, but the programs are specifically designed to mimic conversational interactions. The language generation side is quite impressive at times. I suspect deep learning techniques will play a big role in future research, but AI is devoted to solving a wide range of problems, not just those inherent in searching large collections of text and manipulating content in useful ways.
 
... I would be very surprised if the current crop did not rely heavily on some of the fairly decent shallow parsing systems that are out there. I can't imagine how they would work without them.
Like this?

Link Grammar, with its "Parse a sentence" demo

Some source code is at GitHub - opencog/link-grammar: The CMU Link Grammar natural language parser

English should be fairly easy for an LLM, since it doesn't have much word morphology (variation in word forms) in its grammar, making it close to analytic or isolating.

Nouns have only two forms: singular and plural. Adjectives may have one form or three forms: plain, comparative, and superlative (good, better, best). Verbs may have four forms (parse, parses, parsed, parsing) or five forms (see, sees, saw, seen, seeing), with only one verb having more (am, is, are, was, were, be, been, being). Pronouns are more complicated than nouns and adjectives, but not by much. (I, me, my, mine; we, us, our, ours; you, your, yours; he, him, his; she, her, hers; it, its; they, them, their, theirs; this, these; that, those).

But English has oodles of compound verb constructions in its grammar, though they are very regular. Also compound comparatives and superlatives (regular, more regular, the most regular).

Languages like Chinese (for the most part), Vietnamese, and Thai have only one form per word, and their verb conjugations are all compound forms.

But some languages have large numbers of forms for some of their words, and one might have to give those inflections or else hint that the model should look for them.

Spanish verb 'amar' conjugated - amo, amas, ama, amamos, amáis, aman; amaba, amabas, amábamos, amabais, amaban; amé, amaste, amó, amasteis, amaron; amaré, amarás, amará, amaremos, amaréis, amarán; ame, ames, amemos, améis, amen; amara, amaras, amáramos, amarais, amaran, amare, amares, amáremos, amareis, amaren; amaría; amarías; amaríamos, amaríais, amarían; amad; amar, amando, amado (-a, -os, -as) -- I'm leaving out the compound forms. That's 46 forms, with 1 adjective form, counted only once. Fortunately, Spanish nouns only have 2 forms and Spanish adjectives 4 forms.

Russian nouns typically have 10 forms, Russian adjectives 13 forms, and for an imperfective-perfective pair, Russian verbs around 32 forms, counting all 5 adjective forms only once.
 
English should be fairly easy for an LLM, since it doesn't have much word morphology (variation in word forms) in its grammar, making it close to analytic or isolating.
I understand that was the previous model Google Translate used. They were trying to use rules and understand meaning. The results were not great. The current model, I understand, simply tries to sound natural. Basically, with a large amount of (human-written) text in all languages, they already have translations for everything; they just need to select the correct one :)
It's better most of the time, but it gets confused sometimes.
 