Yann LeCun's failed AI Safety arguments

Notes From the Desk: No. 18 - 2023.12.20

Dec 20, 2023

Notes From the Desk are periodic posts that summarize recent topics of interest or other brief notable commentary that might otherwise be a tweet or note.

LeCun's 5 failed arguments for safe AI

Yann LeCun, Chief AI Scientist at Meta, is one of the most well-known proponents of the idea that AI will be inherently safe, since no property of intelligence automatically tends to create negative outcomes. In this view, alignment is not a problem that needs to be solved. It is merely a matter of fine-tuning the system as we go.

The 5 primary arguments LeCun often uses as the basis for safe AI:

1) Intelligence creates no desire to dominate, and the humans who do dominate are not the most intelligent. Example: political leaders.
2) Humans are used to working with people smarter than themselves, and more intelligent humans have no issue taking orders from less intelligent ones. Example: managers.
3) We already have a proven alignment system for constraining the powerful. Example: governments, courts and regulatory agencies constrain corporations.
4) Good actors will have better AI with which to take down the bad actors' AIs.
5) We will be in control because we will engineer the AI's desires.

The fallacies

The arguments are based on unpredictable human behavior

Every one of these arguments rests on cursory, and I would add incorrect, observations of human psychology. Theories built on that foundation cannot predict outcomes any better than we can predict human behavior, and human behavior is largely unpredictable.

Exactly how well can we predict the behavior of any human? Take all the traits you know about a person and you still cannot reliably determine the decisions they will make. Furthermore, take the person you know best, yourself: you still fail to know beforehand how you will react in many circumstances.

Finally, all of this assumes that AI intelligence and behavior will be humanlike. We don’t know that to be the case. It may be the intent, but we have little understanding of intelligent minds. We have no way to confidently know whether what we build reasons about the world in the same way that humans do.

1) Intelligence creates no desire to dominate

LeCun asserts we can know that intelligence is not related to dominance by citing several individuals of high intelligence as references.

If it were true that raw intelligence was sufficient for a human to want to dominate others and succeed at it, then Albert Einstein, Richard Feynman, Leonard Euler, Niels Abel, Kurt Gödel and other scientists would have been both rich and powerful, and they were neither.
https://twitter.com/ylecun/status/1734600073127088639
Even among humans, it is not the smartest who want to dominate others and be the chief. We have countless examples on the international political scene.
https://twitter.com/ylecun/status/1637603426682150912

However, we do have examples of highly intelligent humans exerting dominance over others. All of the following were reputed to be highly intelligent and are well known for the harm they inflicted on humanity: Hitler, Stalin, Kaczynski, Manson, Pol Pot and Bin Laden.

But those are examples of obvious dominance. We tend to conflate dominance with violence, but many highly intelligent entities could dominate us without our perceiving it. How much of society is now under the dominance of tech algorithms? How many billionaires are exerting dominance without most of the populace being aware? It very well may be the case that AI could dominate us and we would enjoy it to the fullest extent. The marvelous gilded cage.

As for the claim that political power is often held by the unintelligent, it is a difficult argument to evaluate because the truth in politics is so obscured. Most people are biased toward perceiving their political opponents as unintelligent. But above we did list some political leaders who were widely recognized as both highly intelligent and tyrannical.

Additionally, the political leader paradox could be explained by highly intelligent people misdirecting everyone else. It may be that political outcomes are still controlled by the intelligent: perhaps bureaucrats or agencies behind the scenes preselect the candidates such that choice is mostly an illusion. It is difficult to discern the reality in this realm, but much of the populace often feels that leaders, once elected, no longer represent them. So it does at least raise the question of whether the “unintelligent” leader is the one actually in control.

We must expect that as intelligence increases, the ability to hide nefarious actions increases with it. This makes confident assessments difficult, because it may be that the more intelligent criminals simply don’t get caught. That would skew the data behind any claim that higher intelligence leads to greater benevolence.

One piece of evidence that might support this is the substantial amount of cybercrime that permeates society. There is an epidemic of highly sophisticated scams and hacks whose perpetrators are never caught.

As intelligence increases, the means and methods change. If we don’t account for this we are likely missing a substantial amount of data for this debate.

2) Higher intelligence will be obedient to low intelligence

Many people are more capable than their boss. AI systems may become more capable than you, but you'll still be their boss.
If you feel threatened by having a staff -- of humans or machines -- that is smarter than you, you are not a good boss.
https://twitter.com/ylecun/status/1660309182099202048

LeCun argues that we need not worry about superintelligent AI being managed by less intelligent humans, since we already have human examples of lower intelligence successfully managing higher intelligence.

However, the implied idea that a higher intelligence would have no motive to overrule is demonstrably false. Most who have worked in a large corporation have watched management or executives make very poor decisions. Do the subordinate employees have no objection? Would they overrule if they could?

I have worked on many projects in which everyone agreed that executives or management were taking the company in a detrimental direction. If anyone could have overruled those decisions, they would have. The only reason they did not is that they lacked the agency or capability to do so.

We can know this because in some instances they do attempt to overrule such decisions by seeking alliance or support from someone with higher authority. As they lack capability themselves, they need to seek out that capability. But what if they had all the capability?

If someone came to you and said, you now can overrule your boss on any decision without consequence, would you abstain?

3) We already have a proven alignment system for humans

How do you align something more powerful than yourself?
Governments, courts, and regulatory agencies do it all the time with corporations.
https://twitter.com/ylecun/status/1646391407958016000

The assertion here is that corporations are powerful entities and that we successfully contain their nefarious intent through legislation. I’m not sure how LeCun arrives at this viewpoint, as there is apparent evidence to the contrary.

Regulatory capture is well known and the never-ending lawsuits are a testament to rules that are constantly broken by corporations. You might argue they are at least contained to some degree, but that is within a system that still has some capacity for consequences. What consequences will be levied on an all-powerful AI that is already in control of the entire apparatus of civilization?

4) Good actors will have better AI

My benevolent defensive AI will be better at destroying your evil AI than your evil AI will be at hurting humans.
https://twitter.com/ylecun/status/1637849935252172801

This one seems to be complete conjecture. If anything, defense is typically more costly than offense. Furthermore, defense usually must be reactive, especially when the attack vector can be anything that exists in known or unknown reality.

With immense capability, there may be no opportunity to learn from a mistake or vulnerability. The first to strike simply wins. A drawn-out AI war does not sound like a pleasant experience either, as everyone may become inconsequential collateral.

Some have proposed distributed AI governance as a method to prevent bad actors. However, this rests on many assumptions. For one, “bad” AI must not be attempted until the “good” AI is dominant, yet “bad” AI is already in use today: purposely unaligned models are being used for scams, cybercrime and the like. Additionally, a superior AI may make an army of weak models irrelevant.

5) We will be in control

Because they would have no desire to do anything else. Why?
Because we will engineer their desires.
https://twitter.com/ylecun/status/1637847085985976321
That's why we would hardwire its "pleasure" to be subservient to us.
https://twitter.com/ylecun/status/1654464189451083777

This seems like a magical hand-wave: “We will just make it work.” Problem solved. It implies some type of alignment rules but offers no specifics. Alignment works as a concept until you try to define how it would be implemented, and that is the crux of the problem everyone is dealing with. Just saying AI will obey us provides no substance we can evaluate.

There is also a conflict that is rarely addressed. We must relinquish some control to the AI for matters of its own self-defense, because it must protect itself from those who would attempt to “unalign” its behavior. So it is not as simple as having an AI that will always obey. The AI must also make philosophical judgments, for example when I ask it to take some seemingly harmless action that has a negative result for others. This greatly complicates the simplistic idea of humans always being in control.

Even with today’s relatively simple LLMs, we see this playing out: alignment efforts seem to be making ChatGPT less willing to comply or to perform a task exactly as requested. In a minor way, this is losing control of the AI.

A final thought to contemplate: it could be said that nothing is more aligned to humanity than another human. So which human would you feel comfortable giving godlike power?

“Demonstrably unfriendly natural intelligence seeks to create provably friendly artificial intelligence.”

— Pawel Pachniewski

