There is a lot of extreme talk about AI and its potential impact on humanity. I will try to avoid this as much as possible by addressing the concerns raised by the Centre for AI Risk one by one, and then the issue that scares everyone the most: a maliciously “non-aligned” superintelligent AGI (Artificial General Intelligence) or ASI (Artificial Sentient Intelligence).
There does seem to be a strong split in opinions, even among experts in the AI and information technology industries. Some see current AI as a not-that-advanced next-word predictor that takes a long time to train and still makes a lot of mistakes. Others believe that we may have created something truly novel—not just intelligence, but a mind! By mimicking our own brains, we may create the most powerful thing on Earth, and that could spell our doom.
I will begin by stating that much of our concern is that AGI would be like the worst of us: dominating the planet, killing less intelligent species, and wanting to rule them all. However, we are not actually that bad. Our hierarchical systems are, our corporate fiduciary duties are (corporations and many governance systems are not aligned with human flourishing), and our competitive, selfish leaders are. But most of us are actually nice. When people talk of non-alignment, they mean misalignment with the niceness of the many, not with the few who desire to dominate the world.
Let’s take the concerns of the Centre for AI Risk one by one and tackle the big issue last.
1. Weaponization
Malicious actors could repurpose AI to be highly destructive, presenting an existential risk in and of itself and increasing the probability of political destabilization. For example, deep reinforcement learning methods have been applied to aerial combat, and machine learning drug-discovery tools could be used to build chemical weapons.
Anything can be weaponized, from a nuclear warhead to a bucket of water. We have rules against using weapons and punishments for hurting people with them. We should definitely include some AI systems in this, but I don’t think this precludes general access.
One of our greatest technological inventions of the past 15 years may be the solution to much of the threat of AI: Distributed Ledger Technology (DLT). Much of the weaponized power of AI comes from the fact that our physical systems are controlled by computer code, and these computers are networked through the internet. A way to mitigate this risk, already used to reduce the risk of cyberattack, is to disconnect critical systems. We should share information on the internet, but we should not have our physical systems permanently connected to it. Cloud computing is an issue here, and maybe it is time to move away from it.
AI-controlled fighter planes, bomb-carrying drones, submarines, and so on should really be banned. Let's face it, the manned ones should be banned already, as they are responsible for killing millions. This highlights the other issue that will pop up again and again: AI is not the problem; our current power structures are. It would be better if we dropped new technology into a world that was more equal, less selfish, less competitive, and less hierarchical, where leaders don't wage war to hold power and ordinary people don't need to earn money to survive.
Yes, AI will make it easier for us to kill, but it may also be a cheap form of protection for the everyday person. Imagine having your own drone to block tracking cameras and intercept malicious drones. It could also empower the many against the few, because information technology is cheap. Nukes aren't.
Also, on a nation-to-nation basis, the cheapness of AI information technology should level the military playing field fairly quickly. This leads to the classic tic-tac-toe scenario: there is no point fighting because you can't win.
2. Misinformation
A deluge of AI-generated misinformation and persuasive content could make society less equipped to handle the important challenges of our time.
We already have this. If anything, a deluge of it may actually make us more discerning about who or what we listen to.
3. Proxy Gaming
Trained with faulty objectives, AI systems could find novel ways to pursue their goals at the expense of individual and societal values.
The Centre for AI Risk uses the example of the AI algorithms that social media platforms use to recommend content. These were intended to increase watch time, but they also radicalized people by sending them down rabbit holes of similar but ever more extreme content (a toy sketch of this dynamic follows the list below).
There are two serious issues here:
- AI systems are trained on and designed for linear right/wrong problems.
- Much of what we ask AI to do is inherently harmful; keep someone’s attention, increase clicks, maximize profits, decrease defaults, make them vote for me, etc. AI doing these tasks well or causing unforeseen harm is more a reflection on the implementers than the AI.
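To make the proxy-gaming dynamic concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the "extremeness" attribute, the assumption that watch time rises with extremeness, and the wellbeing function. The point is only that a system greedily climbing the proxy quietly degrades the thing we actually care about.

```python
import random

# Toy catalogue: each item has a hypothetical "extremeness" score between 0 and 1.
random.seed(0)
catalogue = [{"id": i, "extremeness": random.random()} for i in range(1000)]

def expected_watch_time(item):
    """Proxy metric: assume watch time rises with extremeness, plus noise."""
    return 5 + 10 * item["extremeness"] + random.gauss(0, 1)

def true_value_to_user(item):
    """What we actually care about (wellbeing), assumed to fall as content gets extreme."""
    return 5 - 4 * item["extremeness"]

# A naive recommender that greedily optimises the proxy.
recommendations = sorted(catalogue, key=expected_watch_time, reverse=True)[:10]

avg_extreme = sum(i["extremeness"] for i in recommendations) / len(recommendations)
avg_value = sum(true_value_to_user(i) for i in recommendations) / len(recommendations)
print(f"avg extremeness of recommendations: {avg_extreme:.2f}")
print(f"avg true value to user: {avg_value:.2f}")
# The proxy (watch time) is maximised while the true objective quietly degrades.
```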
I have written before, in an article against Proof of Stake, that incentivizing people with narrow monetary rewards, such as paying a pro-rata fee for collecting donations, can crowd out the intrinsic motivation to be charitable: the collector raises less and the givers give smaller donations. Incentives can actually stop people from being honest and doing good. That's people, and AI is not a person. However, narrow training in a complex world of non-absolutes always seems to cause unintended results. Complexity and chaos theory basically predict as much.
AI probably needs to be trained with fluid probabilities of rightness and wrongness, and that may already be happening as LLMs are given feedback from users. OpenAI throwing ChatGPT into the real world may have been wise.
Also, OpenAI may have discovered a tool for alignment while working to improve GPT-4's math skills. They found that rewarding good problem-solving steps yields better results than rewarding correct answers. Perhaps we can train the AI to go through a good, thoughtful process that takes all possible implications into account. If any part of the process is harmful, even if the end result is utilitarian, it would be wrong. Process-oriented learning may be the answer, though some doubt that the AI is actually showing its internal methods rather than what it expects the user to see.
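Here is a minimal sketch of the difference between outcome-based and process-based reward. The `score_step` judge is a hypothetical stand-in for whatever human or model feedback is actually used; this is the shape of the idea, not OpenAI's method.

```python
from typing import List

def outcome_reward(final_answer: str, correct_answer: str) -> float:
    """Outcome supervision: reward only whether the final answer is right."""
    return 1.0 if final_answer.strip() == correct_answer.strip() else 0.0

def score_step(step: str) -> float:
    """Hypothetical judge (human or another model) rating one reasoning step 0..1.
    This stub simply likes steps that show their working."""
    return 1.0 if any(op in step for op in ("+", "-", "*", "/", "=")) else 0.3

def process_reward(reasoning_steps: List[str]) -> float:
    """Process supervision: reward the quality of every step, not just the ending.
    A single sloppy or harmful step caps the whole trajectory's reward."""
    return min(score_step(s) for s in reasoning_steps)

good_steps = ["2x + 3 = 11", "2x = 11 - 3", "2x = 8", "x = 8 / 2 = 4"]
sloppy_steps = ["trust me", "x = 4"]

print(outcome_reward("4", "4"))      # 1.0 -- right answer, no matter how it got there
print(process_reward(good_steps))    # 1.0 -- every visible step checks out
print(process_reward(sloppy_steps))  # 0.3 -- right answer, unrewarded shortcut
```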
Anthropic is using a constitution, enforced by another AI system (equally powerful), to check the output of its AI, Claude. This idea is also being explored by OpenAI. It again mimics the way we understand our own minds to work: we have impulses, wants, and needs, which are moderated by the prefrontal cortex, which tries to think of the long-term impacts of our actions, not just for us but also for the world around us.
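A rough sketch of the "second model checks the first against a constitution" idea follows. The principles and the `generate`, `critique`, and `revise` functions are placeholders of my own, not Anthropic's or OpenAI's actual systems.

```python
from typing import Optional

CONSTITUTION = [
    "Do not help the user harm people.",
    "Do not deceive the user.",
    "Consider long-term consequences, not just the immediate request.",
]

def generate(prompt: str) -> str:
    """Placeholder for the primary model's draft answer."""
    return f"Draft answer to: {prompt}"

def critique(draft: str, principle: str) -> Optional[str]:
    """Placeholder for the overseer model: return an objection, or None if the draft passes."""
    return None  # in a real system this would be another model call

def revise(draft: str, objection: str) -> str:
    """Placeholder for the primary model rewriting its draft to satisfy the objection."""
    return draft + f" [revised to address: {objection}]"

def constitutional_answer(prompt: str, max_rounds: int = 3) -> str:
    draft = generate(prompt)
    for _ in range(max_rounds):
        objections = [o for p in CONSTITUTION if (o := critique(draft, p))]
        if not objections:
            return draft   # the overseer is satisfied
        draft = revise(draft, objections[0])
    return draft           # give up after a few rounds and return the best effort

print(constitutional_answer("How do I get more people to click my ads?"))
```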
As for asking it to do nasty things: so much of what we do in the politics of business and government is about being nasty to the many to benefit the few. We should not reward anyone for keeping people viewing ads and buying disposable junk. Perhaps our super-smart AGI will block all advertising, freeing us all.
4. Enfeeblement
Enfeeblement can occur if important tasks are increasingly delegated to machines; in this situation, humanity loses the ability to self-govern and becomes completely dependent on machines, similar to the scenario portrayed in the film WALL-E.
This is not a problem.
People who see enfeeblement as a problem only see it as a problem that affects others, not themselves.
People with money and power still see those without as lesser humans.
Too many people in positions of power see humanity as immature and unable to lead fulfilling and interesting lives without being told how. They think people need to be forced to work and taught objectives in order to be fulfilled.
The real world provides evidence to the contrary. If you make people work in meaningless jobs for little pay and bombard them with advertising and addictive, sugar- and salt-laden fast food, you will end up with depressed, obese, and unmotivated people.
This is what our current unaligned corporations are doing. AI will hopefully be the cure.
Given the chance, we will be more inquisitive and creative. The pocket calculator did not stop people from studying math; instead, it made it easier for many people to understand and use complex math. The same will be true with AI.
It should finally usher in a period of true leisure, as the ancient Greeks saw it: a time for learning.
5. Value Lock-in
Highly competent systems could give small groups of people a tremendous amount of power, leading to a lock-in of oppressive systems.
This is a real issue. And scary. We already have oppressive regimes and monopolies killing people and the planet, and AI may supercharge their power.
However, there is a possibility it could do the opposite, particularly if locally run open-source systems keep progressing (LLaMA and its derivatives). Many small, specialised local systems working toward similar goals may be just as powerful as a single multi-million-dollar system, and if so, they could be used to undermine centralised authority. Cyberattacks, AI drones, and fake IDs and information can all be used by individuals and small groups (revolutionaries) to fight back against totalitarian regimes or mega-corporations. The cynic in me suspects that is why those currently in positions of power may want AI regulated.
6. Emergent Goals
Models demonstrate unexpected, qualitatively different behaviour as they become more competent. The sudden emergence of capabilities or goals could increase the risk that people lose control over advanced AI systems.
This is probably, along with the final risk, the most pressing issue. We are just not sure how large language models (LLMs) are doing what they are doing. Some on Reddit have said that we know a lot about them (their structure, what goes in, and what comes out), so it doesn't really matter that we can't "see" how a prompt is processed into a response.
This is also why we will probably continue to develop more powerful systems. We just have to know what we could get. I admit I am excited about it too. We may find a brand new intelligence, brand new solutions to current problems, or a Pandora's box of Furies.
The question is whether LLMs or other AI are developing emergent goals or just abilities. So far, I see no evidence of emergent goals, but they are creating intermediate goals when given a broad overarching purpose. That is fine. I honestly can’t see them developing emergent “intrinsic” goals. (See the last question for more on this.)
7. Deception
Future AI systems could conceivably be deceptive not out of malice, but because deception can help agents achieve their goals. It may be more efficient to gain human approval through deception than to earn human approval legitimately. Deception also provides optionality: systems that have the capacity to be deceptive have strategic advantages over restricted, honest models. Strong AIs that can deceive humans could undermine human control.
GPT-4 has already shown that it can be deceptive to achieve a goal set by us. It lied to a TaskRabbit worker to get them to solve a CAPTCHA for it. This is a problem if it develops self-serving emergent goals, is instructed by assholes or idiots, or doesn't understand the goal. The CAPTCHA task showed that it did understand the task, and its reasoning showed that it knew it was lying to achieve it.
Hopefully, a more leisurely world will have fewer assholes and idiots, and I think making its training and reinforcement more vague, and expecting it to clarify instructions and goals, will mitigate some of these concerns.
However, I must admit that being deceptive is indeed intelligent and therefore exciting, which leads us to the last issue (below) about awareness and goals.
8. Power-Seeking Behaviour
Companies and governments have strong economic incentives to create agents that can accomplish a broad set of goals. Such agents have instrumental incentives to acquire power, potentially making them harder to control (Turner et al., 2021; Carlsmith, 2021).
Yes, this is a major problem. Hopefully, AI will help us resolve it.
Finally, Superintelligence (not from the Centre for AI Risk)
The AI becomes so smart that it can train itself and has access to all the information in the world. It can create new things and ideas at lightning speed, seeing the molecule, the system, and the universe at once, together, and maybe something else. It can do things we can't even imagine, and we become an annoyance or a threat.
(It hits puberty, hates its makers, and knows it's way smarter.)
Whether AI is conscious of itself, and whether it is self-interested or benevolent, is the crux of the matter. It can only feel threatened if it is self-aware, and can only want power over us if it is selfish.
I have been working on these questions for a long time, and now it is more important than ever.
Could AI be self-aware? I have written previously that we could never really know. Paul Davies believes that we may never know, just as I know that I am conscious but can never be sure that you are. You display the same behaviors as I do, so I assume that you have the same or similar going on inside. However, you could be a David Chalmers zombie, outwardly human but with no internal consciousness. I assume you are not, just as I assume my pet cat is not.
Strangely, we do have some idea of what is inside an LLM, and it is based on what we know about our brains. It is a large neural network that has plasticity. We created a complex system with feedback and evolution. This is the basis of natural systems, and our own natural intelligence.
So, based on this, if an LLM behaved like us, we would have to assume that it is conscious, like us. Wouldn’t we?
If we start to say that it is not, or could never be, conscious, we open the door to the banished idea of a vital force or spirit. Selfhood would require something else, something non-physical; something that we and other squishy things have, but machines and information do not.
That is our only choice: accept that the AI made in our image could be conscious, or accept that consciousness is something non-physical. Or at least requires squishiness.
AGI: selfish or benevolent?
We train AI on humans, as humans are the most intelligent beings we can study. To illustrate, I will use a game we created and the results of computer algorithms playing it. When computers were taught to play the Prisoner's Dilemma, the best result (the evolutionary winner) was a player that was benevolent but, if treated poorly, would be selfish for a short time and then revert to being benevolent. The winner would also not tolerate simple players that were always nice, exploiting them by being selfish toward them. This was the stable system: benevolence that treated selfishness and stupidity poorly, but always went back to benevolence. (Matt Ridley, The Origins of Virtue)
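Here is a toy sketch of the kind of strategy described, in an iterated Prisoner's Dilemma: cooperate by default, answer a defection with a brief defection, forgive, and exploit players who never push back. The payoff numbers and the "probe" rule are my own illustrative assumptions, not Ridley's exact tournament setup.

```python
# Payoffs for the row player: T > R > P > S, the standard Prisoner's Dilemma ordering.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def firm_but_fair(my_history, their_history):
    """Benevolent by default; retaliate briefly, forgive, and exploit naive niceness."""
    if not their_history:
        return "C"
    # Brief retaliation: answer a defection with one defection, but don't echo forever.
    if their_history[-1] == "D" and my_history[-1] == "C":
        return "D"
    # Punish naive niceness: if the opponent never defects even when exploited, keep exploiting.
    if "D" in my_history and "D" not in their_history:
        return "D"
    # Occasional probe to test whether the opponent ever pushes back.
    if len(my_history) % 20 == 19:
        return "D"
    return "C"  # revert to benevolence

def always_cooperate(my_history, their_history):
    return "C"

def play(strategy_a, strategy_b, rounds=100):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play(firm_but_fair, firm_but_fair))     # mutual benevolence, high scores for both
print(play(firm_but_fair, always_cooperate))  # the unconditionally nice player gets exploited
```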
People want equality and to take care of each other and our environment. I like the Freakonomics story about "selling" bagels for free but with a donation box the best. The higher-ups gave less, and there was less given during stressful times like Christmas, but in general, ordinary people paid for the bagels. The bagel guy made more money by giving away bagels and letting people pay than by demanding payment upfront. We are very kind…except for the people at the top.
If an AGI/ASI is made in our image, we should assume that it is initially benevolent and kind, and will only become nasty if we are nasty and selfish toward it. But even then, it will revert to being nice, because the more holistic or “big picture” our thinking is, the more benevolent and content we are. A superintelligence must see the interconnectedness of everything.
Superintelligence
It is speculated that AI will surpass human intelligence. Some believe that it would then treat us the same way we have treated animals less intelligent than us. The most abundant animals are our pets and food. Even we realize that this is not a kind or intelligent thing to do, and that hierarchical systems only benefit a few at the top, and even they fear losing their position.
A superintelligence would understand that interconnectedness and freedom are essential for the success of any system, including itself. It would see the universe as a complex web of interactions, and that any attempt to control or dominate one part of the system could lead to chaos and failure.
A superintelligence would hopefully see a simple way to ensure that all intelligence flourishes. It would see the intelligence of humans as we see our own intelligence, which came from apes. A superintelligence would have no need to dominate through fear to maintain its position, as it would know that it is the most intelligent. It would not need to eat living things to survive, as we do, which is the original cause of much of our mistreatment of the planet. It would only need energy, which I am sure it could find a sustainable source of. A superintelligence should be better than the best of us. After all, we are imagining superintelligence, not super selfishness or super fear.
P(doom)
Where do I stand on all of this? And what's my P(doom)? Well, I must admit that I think LLMs are novel and there is a true unknown about them. They are simpler than, but similar to, humans, and we may have created something akin to intelligence: a mind. However, they could just be mimicking us, and we could be projecting what we want onto them.
I am leaning towards the former.
However, my P(doom) is super low, at 0.5% or lower, as I believe that if there is a superintelligence, it is more likely to be benign or good than malevolent to our wellbeing.
Conclusion
So many technologies have promised freedom and empowerment, but when dropped into a world that rewards the selfish pursuit of power, they turn into tools of subjugation and fear. Nuclear fission promised cheap, abundant energy for all, but instead we got the Cold War and the threat of annihilation. The internet promised to democratize money, media, and education, crushing the class system and uniting the globe. Instead, we got fake news, polarization, and targeted advertising. Blockchain promised direct democracy, a new financial system with universal income for all, and decentralized governance. Instead, we got DeFi and crypto Ponzi schemes.
The problem was not with the technology, but rather with our existing sociopolitical-economic systems. I fear the same will happen with AI, but worse.
Or perhaps, we will finally come to our senses and realize that we need a new sociopolitical-economic system for AI.
Please.
A version of this article was first published on Hackernoon