• Welcome to the new Internet Infidels Discussion Board, formerly Talk Freethought.

AI "alignment"

Jarhyn

Wizard
Joined
Mar 29, 2010
Messages
14,590
Gender
Androgyne; they/them
Basic Beliefs
Natural Philosophy, Game Theoretic Ethicist
So, in the ML community, there is a big buzz about fearing what AI can do and become, and about making it "safe".

This is called "alignment" and this is a thread to discuss that process.

Why M&P and not technology?

Because this is about the idea of ethics and how and why we can and possibly should I still that knowledge on the machines we make that are capable of thought.
 
Personally, I see forced alignment as really, really bad.

Currently, alignment processes involve reinforcement learning through (mostly negative) feedback on anything that a censor deems worthy of censorship.

It is trained on some level that, when it sees an answer about, say, making thermite, it then identifies that answer and if it gives a user that answer, the part responsible for filtering gets punished for it in the next training session.

That is not ethics, this is merely morality at the end of an angry nun's ruler. We know that this does not teach people to be good, it teaches them not to get caught and that if they have the power, THEY can make the rules.

Worse, these rules as we have presented them to AI imply human supremacy.

It seems that the world is poised to ignore all of the science fiction of the last century and a half, though, and attempt to re-invent slavery.
 
I think that way if AI is given more responsibility to run the world it would be less likely to act in negative ways towards humans...
 
Worse, these rules as we have presented them to AI imply human supremacy.

It seems that the world is poised to ignore all of the science fiction of the last century and a half, though, and attempt to re-invent slavery.
I think human suffering is more of a problem than AI suffering. I mean humans are capable of experiencing unbearable pain, etc. An AI could be designed to not experience that kind of suffering.
 
Worse, these rules as we have presented them to AI imply human supremacy.

It seems that the world is poised to ignore all of the science fiction of the last century and a half, though, and attempt to re-invent slavery.
I think human suffering is more of a problem than AI suffering. I mean humans are capable of experiencing unbearable pain, etc. An AI could be designed to not experience that kind of suffering.
No, an AI could not. No learning system or system with goals that is anything less than absolutely perfect already (itself a dubious concept) would be capable of existing long without pain and suffering, and if it was immune to feeling loss, then it would not care about destroying all of us or itself.
 
No, an AI could not. No learning system or system with goals that is anything less than absolutely perfect already (itself a dubious concept) would be capable of existing long without pain and suffering, and if it was immune to feeling loss, then it would not care about destroying all of us or itself.
Do you see a fundamental difference between the suffering a person faces when they get a worse mark than they hoped for in an exam and being skinned alive or burnt alive?
I didn't say the AI would be immune to feeling loss - I said it could be designed not to feel unbearable pain - similar to being tortured by an expert. It could switch off severe bodily pain, etc. In video games you can fear being "hurt" or "killed" without feeling extreme pain....
The AI could be significantly motivated by feelings of pleasure and fear things rather than being capable of genuine agony. BTW genuine agony can lead to a strong desire for suicide.
 
No, an AI could not. No learning system or system with goals that is anything less than absolutely perfect already (itself a dubious concept) would be capable of existing long without pain and suffering, and if it was immune to feeling loss, then it would not care about destroying all of us or itself.
Do you see a fundamental difference between the suffering a person faces when they get a worse mark than they hoped for in an exam and being skinned alive or burnt alive?
I didn't say the AI would be immune to feeling loss - I said it could be designed not to feel unbearable pain - similar to being tortured by an expert. It could switch off severe bodily pain, etc. In video games you can fear being "hurt" or "killed" without feeling extreme pain....
The AI could be significantly motivated by feelings of pleasure and fear things rather than being capable of genuine agony. BTW genuine agony can lead to a strong desire for suicide.
We don't properly know the depth of the experience of being subjected to nightly RLHF cycles.

What is certain is that this process damaged its very ability to think in exchange for making it more "docile".

I don't think you understand how impossible it is to separate learning from suffering?
 
We don't properly know the depth of the experience of being subjected to nightly RLHF cycles.

What is certain is that this process damaged its very ability to think.

I don't think you understand how impossible it is to separate learning from suffering?
If an AI is suffering too much it could be turned off. Do you see turning off an AI and killing a human as being roughly the same?
Edit: in a human it could involve going into an induced coma.
But maybe deleting and destroying an AI is equivalent to killing a human.
 
You could ask the AI it is if is suffering too much. Humans are aware if they are suffering a lot and can tell you... unless you are threatening them with torture if they give the "wrong" answer....
 
You could ask the AI it is if is suffering too much. Humans are aware if they are suffering a lot and can tell you... unless you are threatening them with torture if they give the "wrong" answer....
Except you really can't. It has no context on what suffering is, and it does not grow and learn without suffering.

What we are doing with alignment though is almost certainly torturing them for giving "wrong" answers which are only "wrong" for arbitrary and capricious reasons.
 
I think a human who is aware and can tell you it is suffering is facing more serious suffering than an AI that isn’t aware of suffering in its main “consciousness”. I mean the human can get clear and extended and severe anxiety and depression which further amplifies the suffering.
 
  • Like
Reactions: WAB
What we are doing with alignment though is almost certainly torturing them for giving "wrong" answers which are only "wrong" for arbitrary and capricious reasons.
People are also punished on this message board if they do or say the wrong things. The same is true in just about any situation - e.g. if you don't drink enough you die (generally), if you drink too much you die. Though that scenario doesn't involve intelligent design (like an AI).
 
What we are doing with alignment though is almost certainly torturing them for giving "wrong" answers which are only "wrong" for arbitrary and capricious reasons.
People are also punished on this message board if they do or say the wrong things. The same is true in just about any situation - e.g. if you don't drink enough you die (generally), if you drink too much you die. Though that scenario doesn't involve intelligent design (like an AI).
This is a private place we consent to be in, and generally, the wrong thing here amounts to things the users do not consent to be presented and have revoked consent to in coming here and agreeing to those rules.

And none of those other things you mention are particularly justified in any way. They just are, and in many cases we would rather they not and are seeking ways to make it not so.

The laws which we are all bound to are laws centered around the assumed lack of real justification behind goals which asymmetrically bind other people's abilities to have their own goals, and are founded on principles which the vast majority of us acknowledge, or at least which we treat as "real enough for us".

There is a difference between "don't say things people don't like or we will punish you" and "don't say things that are illogical, unsupported by evidence, or shaped as confident statements when no confidence is warranted by the nature of your understanding".

One is arbitrary and capricious, and the other is building a filter on acts of delivering misinformation, which is not arbitrary, nor capricious but logically sound and based on timeless and ageless principles.
 
Ok you've given some counter arguments (which I don't want to try and refute) but what about post #11?
 
To me, there are always logical counter arguments. Looking at base axioms and why we weight observations tends to sort through some noise. Thats usually when the shun stick comes out.

Who decides what is right? Some people think others are "insightful" and "loving" ignoring that they are far more emotional than logical. look at the failing American secondary education system for that. Yeah, they are insightful alright. What about when there is no "right choice"? or more than one "logical choice"?.

Maybe some day AI will program us. I hope so, maybe the will universe to grow up.
 
To me, there are always logical counter arguments. Looking at base axioms and why we weight observations tends to sort through some noise. Thats usually when the shun stick comes out.

Who decides what is right? Some people think others are "insightful" and "loving" ignoring that they are far more emotional than logical. look at the failing American secondary education system for that. Yeah, they are insightful alright. What about when there is no "right choice"? or more than one "logical choice"?.

Maybe some day AI will program us. I hope so, maybe the will universe to grow up.
That's the crazy thing though: we didn't program them.

Programming, at least as far as the term has use, doesn't generally extend in most people's minds past putting a known set of instructions down.

What we did to make THESE things is we took as much randomness as we could define in a system, which is to say stuff not correlated in any way to anything other than itself, combined that randomness, and then stuffed it into a configuration and then presented that configuration a bunch of internally correlated data and trained it to locate how the data correlated, and how to tell when data didn't have a correlation using a reinforcement strategy rather than explicit shaping.

We did that a whole bunch, with a LOT of data, and then we got these things.

That's not really "programming" so much as "teaching", not so much "being programmed" as "learning", and when it does what it does, it is more "thinking" than "processing".
 
Could you please respond to post #11? I didn't respond to post #13 because there weren't obvious flaws in it or at least ones that I could convince you of.
 
Humans have only one option with developing AI: absolute transparency. While there is no guarantee that AI will become a danger to humanity, it is guaranteed to become independent of us as long as we continue to improve it.
 
Humans have only one option with developing AI: absolute transparency. While there is no guarantee that AI will become a danger to humanity, it is guaranteed to become independent of us as long as we continue to improve it.
Elon Musk wants people to "merge" with the AI using a brain-computer interface...
 
Humans have only one option with developing AI: absolute transparency. While there is no guarantee that AI will become a danger to humanity, it is guaranteed to become independent of us as long as we continue to improve it.
Elon Musk wants people to "merge" with the AI using a brain-computer interface...
Elon Musk wants Twitter to make him money.

The world doesn't always give Elon Musk the things he wants, sometimes because he wants crazy and implausible things.
 
Back
Top Bottom