
Why AI Safety is Important

This post is the second in a series on AI Safety. In this post, I argue for the importance of AI Safety, presenting the arguments I find most compelling. The next post in this series will argue the opposing view, presenting arguments against the importance of AI Safety. In the fourth post, I will contrast the two viewpoints, highlight the side I find most compelling, and explain why.

Part 1 – What is AI Safety
Part 2 – Why AI Safety is Important
Part 3 – Why AI Safety is Not Important
Part 4 – Is AI Safety Valuable

Part 1 presented the problems that AI Safety is working to solve. AI Safety proponents make two claims about these problems.

  1. They are important problems that need to be solved.
  2. They should be solved by a combination of technical research and norm/institution/governance development.

I present arguments supporting these claims using a modified version of my preferred Impact Framework.

Problem Importance

Scale (Is AI Safety a big problem?)

AI Safety’s importance is directly proportional to the importance of AI. If AI never reaches massive scale, AI Safety will not become important. Conversely, if AI reaches the scale of the internet, AI Safety will be very important. Because AI is growing quickly, its scale needs to be considered over several timeframes.

When I think about the current state of AI, I don’t think of it as a technology with massive scale. Global spending on AI is in the tens of billions of dollars. Compare that to the tech sector, which contributes roughly $1.2 trillion to US GDP. AI is a percent of a percent of the global economy. Despite its use in many products, it has not reached a level of pervasiveness comparable to the internet or cell phones. AI is used in applications such as fraud detection, image recognition, and natural language processing, but it hasn’t found a killer app that transforms the world as we know it.
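
To put those numbers side by side, here is a rough back-of-envelope sketch in Python. The AI spending and US tech-sector figures are the approximate ones cited above; the global GDP figure of roughly $87 trillion (2019) is an added assumption, and all three are order-of-magnitude estimates rather than precise data.

```python
# Back-of-envelope check: how large is AI spending relative to the economy?
# All figures are rough, order-of-magnitude estimates, not precise data.

ai_spending = 40e9     # global AI spending: "tens of billions" of dollars (assumed midpoint)
us_tech_gdp = 1.2e12   # tech sector contribution to US GDP, as cited above
global_gdp = 87e12     # assumed global GDP, roughly $87 trillion (2019)

print(f"AI spending vs US tech sector: {ai_spending / us_tech_gdp:.1%}")
print(f"AI spending vs global economy: {ai_spending / global_gdp:.3%}")
# With these rough numbers, AI is a few percent of the US tech sector and a few
# hundredths of a percent of the global economy, i.e. "a percent of a percent".
```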

Looking forward, I expect AI to reach impressive scale in the next few years. AI is growing at over 40% per year. Companies are hiring Machine Learning Engineers as quickly as they can. AI will find its first killer app in the form of self-driving cars, with many companies projecting releases in the early 2020s. The next 5 to 10 years will see AI used in more areas of our lives, with an impact that resembles the early days of the internet.

AI’s scale becomes most interesting when I think about it in the long term. Instead of predicting the state of the world at a specific time, I consider AI’s impact in the far future (I address timing later in this article). Given enough time, AI will eventually be capable of any task a human is capable of. I doubt intelligence requires a biological substrate. I also doubt humans have reached an “intelligence peak”, where it is impossible to be more intelligent than we are. I therefore expect AI to meet and exceed our intelligence. I can’t imagine the world that results, but it is clearly one in which AI is the most important technology ever developed.

I’m drawn to the impact of AI in the far future. I can’t help but fixate on the importance of a technology that commodifies intelligence. Sure, AI has a modest impact today; it affects my day-to-day life only in small ways. Sometime between now and the indeterminate future, however, AI will become the most important technology ever invented. Even the internet will pale in comparison. In a world of commodified intelligence, the importance of AI Safety can’t be overstated.

Neglectedness (Is there a lack of investment in AI Safety?)

Investment in the various problems in AI Safety is mixed. Some areas of AI Safety receive significant investment while others are overlooked.

Alignment research, for example, is highly neglected. It has only 50-100 active researchers. Total spending by organizations working explicitly on AI Safety is in the low tens of millions of dollars per year. This seems too low when compared with the tens of billions spent on general AI R&D (though this figure counts only organizations that work explicitly on AI Safety).

In contrast, safety areas such as privacy receive significant investment from the broader AI community. A quick search or two shows that, in 2019 alone, at least a few dozen papers were published on privacy in Machine Learning. Anecdotally, I hear privacy discussed frequently in mainstream AI discussions, such as this machine learning podcast. Companies are naturally incentivized to think about privacy, fairness, transparency, and security, as there are already norms and laws governing those areas in the broader tech ecosystem.

My impression is that some areas of AI Safety are highly neglected. Ethics, alignment, policy, and misuse do not receive sufficient attention, while privacy, fairness, transparency, and security naturally receive moderate investment.

Three things drive this divide. The items receiving attention are problems that:

  1. Are near term and need to be solved now.
  2. Lend themselves to purely technical solutions.
  3. Have norms and laws that can be applied from the broader tech ecosystem.

The neglected items are long term, non-technical, or lack an analog whose norms can be easily borrowed. Ethics, alignment, policy, and misuse will not be solved without a group focusing specifically on them.

Tractability (Is AI Safety solvable?)

AI Safety problems span a range of difficulties. Some areas appear quite difficult while others are at least partially solved in other areas of technology.

AI security, for example, can borrow from all of the work that has been done in securing regular computing systems. Computer security receives billions in investment each year and AI Safety can build upon that research and experience to create secure AI.

Deciding the ethics that AI should embody, in contrast, is a very difficult problem. The world is full of differing ethical frameworks with none being obviously better than all the others. How can we decide what ethics AI should embody if humanity hasn’t agreed on an ethical system? Still, there are clever ideas for having AI learn from human preferences, which could circumvent the need to explicitly select an ethical system for the AI to embody. Despite the apparent difficulty of AI ethics, progress is already being made after only a few years of research.

Value alignment is a problem of unknown difficulty because there is no existing body of research to build upon. Alignment can borrow from game theory and some parts of economics, but there are places where alignment research is laying the foundations of a new discipline. Still, Concrete Problems in AI Safety has shown that there are areas of value alignment that are ready for research right now. The field is new, but it is making progress.

Each AI Safety area is at least moderately tractable, as shown by progress in each despite only a few years of research in some areas. Most of the safety areas (alignment, privacy, fairness, transparency, and security) are technical problems that need to be solved to ensure the systems are not harmful. The remainder (ethics, policy, and misuse) require some technical work but are mostly about establishing norms, laws, and good governance for AI. Humanity has managed to solve both types of problems in the past; we can do so again for AI Safety.

Timing (When should AI Safety be solved?)

In Scale, I noted that the impact of AI is most interesting in the “far future”. It is clear that there are very important, neglected, long term problems in AI Safety. The question is, when should they be worked on? It could be premature to work on problems with AI that will not be significant for decades.

Many disagreements about the importance of AI Safety stem from nebulous predictions about human-level AI timelines. These predictions are then used to either justify or condemn AI Safety research. I find these discussions largely unproductive. AGI (Artificial General Intelligence) timelines are highly uncertain. Much disagreement about AI Safety arises from varied opinions about how to handle uncertainty. I address this uncertainty directly by considering AI Safety across short, medium, and long time horizons.

First, AI has many problems that need to be solved now. AI is being used in privacy-sensitive environments such as healthcare. Large companies use AI to “drive engagement” and their reliance on it has addicted users to their services. Courts have used AI to predict recidivism, but the predictions do not come with an explanation. There is no uncertainty about what to do in the short term. All of these problems need to be worked on now.

Next, AI will become more disruptive over the medium term. Self-driving vehicles are the poster child for potential problems with AI. They:

  • Need to be impossible to trick into crashing.
  • Will impact the livelihoods of people who drive for work; their needs should be considered.
  • Will get into accidents; uncertainty about liability needs to be resolved.

This one area has dozens of difficult problems to solve. We can’t predict all the effects self-driving vehicles will have, but that doesn’t mean we shouldn’t work to mitigate the downsides we do foresee. Consider the privacy impacts of AI in healthcare or the potential for misuse of autonomous drones. There are dozens of areas with dozens of problems in each. We should solve the most important problems now before something bad happens. Uncertainty means some work will be thrown away. It means that some important problems will be missed while other, less important problems receive too much investment. The alternative is solving no problems upfront and hoping for the best. This isn’t how software is or should be built. Smart organizations do premortems on major projects and solve the biggest concerns upfront. Working on medium-term AI Safety problems is the equivalent of fixing problems that arose in a premortem. It’s not perfect, but it’s better than nothing.

Most disagreement about AI Safety is disagreement about what to do with the long term problems. It is not clear when powerful AI will arise. It could arrive in the next decade or still not be here in a century. Powerful AI might look nothing like the AI that exists today or it might be built with techniques that are already well known. The impact of AI Safety research focused on the long-term is highly uncertain. It might all be thrown away or it might be essential to ensuring safe AI.

My answer to this uncertainty is that we should proceed cautiously. If we do no work on AI Safety, the potential upside of AI is large, but the potential downside is just as big. If we choose to invest in AI Safety, the potential downside is limited and the potential upside is minimally affected. Papers such as Concrete Problems in AI Safety present broad problems with AI that can be worked on now. As AI becomes more powerful and as known problems are solved, new problems will be discovered, prioritized, and worked on in turn.

AI Safety is overflowing with problems that can be worked on right now. There are well-defined problems across all timespans. Some work will be thrown away, but that is the case with all research. Arguments about timelines are mostly arguments about uncertainty. The best way to act in such an uncertain situation is to invest some resources in AI Safety, mitigating the potential downside as much as possible. (Aside: this is a bit like The Worst-Case Fallacy or Pascal’s Mugging, but I believe the possibility of AGI in our lifetimes is non-negligible and therefore support investment in AI Safety.)
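
To make that reasoning concrete, here is a toy expected-value sketch in Python. Every number in it (the probability of a serious AI failure, the cost of that failure, the cost of safety work, and the fraction of risk it removes) is a hypothetical placeholder chosen purely for illustration; the point is the shape of the comparison, not the values.

```python
# Toy expected-value comparison of "invest in AI Safety" vs "do nothing".
# Every number below is a hypothetical placeholder, not an estimate.

p_failure = 0.05       # assumed probability of a serious AI failure
failure_cost = 1e12    # assumed cost of that failure (arbitrary units)
safety_cost = 1e9      # assumed cost of the safety work itself
risk_reduction = 0.5   # assumed fraction of the risk that the work removes

loss_without_safety = p_failure * failure_cost
loss_with_safety = p_failure * (1 - risk_reduction) * failure_cost + safety_cost

print(f"Expected loss without safety work: {loss_without_safety:,.0f}")
print(f"Expected loss with safety work:    {loss_with_safety:,.0f}")
# With these placeholders, a comparatively small investment roughly halves the
# expected loss: the downside is limited while the cost of the investment is
# small relative to what is at stake.
```

The same arithmetic is why the specific numbers matter: if the assumed probability were astronomically small, the comparison would flip. That is the Pascal’s Mugging concern I return to in the postscript.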

Impact

AI Safety is valuable only if the solutions it proposes make a significant improvement over the status quo. AI Safety proposes two categories of solutions that I find compelling:

  1. Technical Research: Solve technical problems in ethics, alignment, security, etc.
  2. Institution Building: Establish norms, standards, policies, and governance.

I can imagine other “solutions”. For example, governments could decide that developing AI is too risky and ban research entirely. This has been done in the past for areas such as bioweapons research. I think AI will bring massive benefits to humanity and therefore do not want it banned.

Historical Impact (Will AI Safety have a large impact compared to the next best solution?)

Normally, I would look at the impact of previous solutions to a problem and compare that with the proposed solution. In this case, AI is nascent and AI Safety is even more so. There is not a long history of attempted solutions to compare with the proposed solutions. Instead, I look at instructive historical analogies which give insight into the impact AI Safety research might have.

Computer Security

Early operating systems (UNIX/Windows) were not designed with security in mind. The internet suffers from the same problem: it wasn’t designed with security first. The result: cybercrime costs more than $600 billion per year. Computer security didn’t have to be this bad.

Research into provably secure operating systems was performed as early as the 1970s. Today, some small operating systems claim to be provably secure. If the world ran these operating systems and they were even a single order of magnitude more secure, hundreds of billions of dollars would be saved each year. Nearly every computer in the world runs a variant of Windows or UNIX. Switching over billions of computers now would cost far more than dealing with the expense of cybercrime today.

The computer security analogy yields three insights.

First, it is possible to foresee major problems, research them, and solve them before they become widespread. AI Safety can follow the lead of the security researchers of the 1970s and work on problems with AI before they are too large to be solved. Fixing the problems now and building robust AI will save hundreds of billions of dollars in future expenses.

Second, now might be our only opportunity to solve problems with AI. As with computer security, it will be too expensive to replace ubiquitous unsafe systems with safe ones.

Third, research alone is insufficient. Research only matters if it is actually used. Researchers designed provably secure operating systems in the ’70s, but the operating systems that went mainstream did not implement that research. If AI Safety research is not implemented when powerful AI is built, unsafe AI will be widely distributed. Establishing norms and governance is every bit as essential as research and must be a priority for the AI Safety community.

Nuclear Technology

The first nuclear bomb was tested in 1945. The nuclear chain reaction, the key insight that enabled the bomb, was first envisioned in 1933. By 1947, the Cold War had started. Shortly thereafter, in 1949, the Soviet Union tested its first nuclear bomb and the nuclear arms race began.

Imagine a world where, in the years from 1933 to 1945, a group of politicians, scientists, and others had been building norms, institutions, and governance aimed at preventing a nuclear arms race. At the time, this group would have been ridiculed. Had they been successful, humanity would have avoided the precarious position it was in during the Cold War and is still in today.

The nuclear analogy is instructive. Anyone predicting doom is likely to be shamed as a Luddite and a scaremonger. In 1933, virtually no one could have predicted a nuclear arms race; sixteen years later, it was a reality. Right now, it seems unlikely that an AI arms race will start in a mere sixteen years, but predicting so far into the future is extremely difficult. Today, we would praise anyone who fought to prevent a nuclear arms race. In the future, we might do the same for those who kept AI safe.

These analogies make me hopeful. Research and norm building can solve problems with AI. From a technical perspective, operating system security is largely a solved problem; likewise, we can solve the technical problems in AI Safety. From a norms and institutions perspective, the risk of nuclear war is significantly lower than it was during the Cold War. We can build upon that success in establishing norms for nuclear weapons and create better norms and institutions for governing AI. Working on these problems now will save trillions of dollars and significantly reduce the risk of a catastrophe involving AI. The potential impact is massive.

Learning Value

Starting AI Safety work now will reveal what more can be done. It is the fastest way to discover the value of the proposed solutions. Early work will build expertise in AI Safety and enable those experts to keep up with changes in the broader AI community. It is the best way to discover alternative solutions to research and governance. The potential learning value is at least moderately large.

Confidence

My confidence in the conclusions above is low. There is too little AI Safety research to look back at and determine its impact. Most of the impact is focused on the future. Analogies to other fields are instructive but not conclusive.

My biggest concerns with the conclusions above are:

  1. Mainstream AI researchers have a natural incentive to solve some of these problems. They are likely to work on items such as privacy without prompting from the AI Safety community. AI researchers do not have “build unsafe AI” as a goal, so it could be that these issues don’t require a separate AI Safety community.
  2. I may be overestimating the size of the problems. Is AI actually an extinction-level risk? Will AI be as important as the invention of computers or the internet? It’s not clear yet.
  3. Are research and norm building good solutions to the problems? Research is easy to ignore. Norms can be violated. The only sure-fire solution to the problems of AI is to not build AI, but that would mean forgoing a massively beneficial technology.

Conclusion

The conclusion I draw from all of these arguments is that society should give AI Safety research at least a moderately high priority. I would summarize my view as:

  • AI and AI Safety are going to be very important in the future. They are moderately important now.
  • Alignment, ethics, policy, and misuse do not receive sufficient attention. Privacy, fairness, transparency, and security receive some attention (but maybe not enough).
  • The problems AI Safety targets are probably solvable through research and norm building, but the best way to find out is to work on them.
  • The risk from AI is very uncertain with a large potential downside. We should proceed cautiously and make at least a moderate investment into mitigating the large downside.
  • Starting now is the best way to learn.
  • I am not confident in my conclusions.

This is not a satisfying conclusion. I would prefer to have a well-formulated argument either for or against AI Safety. The most important takeaway is that there is a lot of uncertainty in AI Safety and proceeding cautiously is wise.

P.S. Arguments I Do Not Find Compelling

There are more arguments for why AI Safety is important than the ones I presented above. I won’t go into much detail but some arguments I do not find compelling include:

  • AI Safety is like home insurance.
    • I don’t think it is like insurance. It is contingency planning, a technique common in military and corporate environments where you plan for the unexpected to reduce the impact of various risks.
  • The value of the future is nearly infinite. A tiny probability of extinction justifies an infinite investment mitigating that risk.
    • This is a bad way to assess risk. The specific numbers matter. If the probability of extinction from AI is 10^-50, it won’t happen before the heat death of the universe, and investing against it is a waste of resources.
  • Superintelligent AI can think faster and can, therefore, make discoveries much faster than we can. For example, superintelligent AI will cure diseases because it can reason faster than we can.
    • We are limited more by the rate at which we can do experiments than by the rate at which we can come up with new theories, especially in healthcare. It doesn’t matter how fast you can think if the bottleneck is experimentation. A benign superintelligent AI will have to go through the FDA (or a similar body) just like human researchers. Even if it doesn’t, experiments on disease run at the rate of human lifetimes; they can’t be sped up.
