What is AI Safety?

Preamble

Artificial Intelligence may be the most important technology of our time, with the potential to be extraordinarily good or exceptionally bad. First, imagine a world without scarcity, disease, or possibly even aging. It sounds too good to be true, but that's the world the most optimistic futurists think AI will enable. Now imagine a world of unprecedented inequality, unbreakable authoritarian regimes, or even an extinct human race. That's the pessimists' view of how bad AI might be.

There is an emerging AI Safety community whose members hold some combination of these views. The community is composed of scientists, philanthropists, technologists, philosophers, and others who share a goal: ensuring that AI brings prosperity, not catastrophe.

This article is the first in a series exploring AI Safety. In it, I take a shallow look at the problems the AI Safety community is trying to solve. Future posts will take a deeper look at questions such as:

  • Why is AI Safety an important problem? Why does there need to be a dedicated community?
  • For each problem area, what are the arguments for or against it being an AI Safety focus area?

Part 1 – What is AI Safety?
Part 2 – Why AI Safety is Important
Part 3 – Why AI Safety is Not Important
Part 4 – Is AI Safety Valuable?

What is AI Safety?

AI Safety starts from the premise that AI can be beneficial or detrimental and that, without deliberate work to make it beneficial, it will be detrimental by default. In the same way that a poorly designed building might collapse and harm thousands, a poorly designed self-driving car might cause many crashes and harm thousands. AI Safety is the work and research that ensures AI is beneficial, not harmful.

Safety Areas

AI can be harmful in a variety of ways. The challenges associated with making AI beneficial fall into the categories below, which are the focus areas within the AI Safety community.

Privacy
AI relies on data to learn. This reliance on data comes with the potential for privacy violations.

Facial recognition can be used to track people in public. Other Machine Learning (ML) techniques can be used to track people online. ML can be used to de-anonymize apparently anonymous data. In each of these cases, a person is being tracked without consent.

Machine Learning models learn from data. If that data is medical records, a clever user may be able to get the system to reveal intimate details about another person's health.
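
To make the medical-records example concrete, here is a minimal sketch of a membership-inference attack. It assumes scikit-learn, and the data, model, and confidence threshold are all illustrative. The point is that an overfit model is measurably more confident on the exact records it memorized, and that difference leaks who was in the training set.

```python
# Toy membership-inference sketch (illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 5))     # stand-in for private medical records
y_train = rng.integers(0, 2, size=50)  # e.g. diagnosis labels
X_other = rng.normal(size=(50, 5))     # records never seen during training

# Small dataset + flexible model = heavy overfitting (memorization).
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

def looks_like_a_member(record, threshold=0.75):
    # Suspiciously high confidence on one exact record suggests the
    # model memorized it, leaking that it was in the training data.
    return model.predict_proba([record]).max() > threshold

members = sum(looks_like_a_member(r) for r in X_train)
others = sum(looks_like_a_member(r) for r in X_other)
print(f"flagged {members}/50 training records vs {others}/50 unseen records")
```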

AI Safety asks: How do we build AI that doesn't violate individuals' privacy?

Fairness
Machine Learning is often used in decision making. If fairness is not prioritized, the algorithm can learn to be biased.

AI that calculates a person's creditworthiness might discriminate against minority groups, systematically denying credit to members of those groups who are in fact creditworthy.
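
One simple check from the fairness toolbox, sketched below with made-up numbers, is demographic parity: compare the model's approval rate across groups. The 0.8 cutoff echoes the "four-fifths rule" from US employment guidance; it's a rough heuristic, not a complete fairness test.

```python
# Toy demographic-parity check (illustrative numbers).
def approval_rate(decisions):
    return sum(decisions) / len(decisions)

# Hypothetical credit decisions (1 = approved), split by group.
group_a = [1, 1, 0, 1, 1, 1, 0, 1]   # approval rate 0.75
group_b = [1, 0, 0, 0, 1, 0, 0, 1]   # approval rate 0.375

ratio = approval_rate(group_b) / approval_rate(group_a)
print(f"disparate impact ratio: {ratio:.2f}")
if ratio < 0.8:  # the "four-fifths rule" heuristic
    print("the model may be systematically disadvantaging group B")
```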

ML tools developed to assist judges with sentencing decisions have already been shown to be biased. These biases perpetuate an unfair status quo.

AI Safety asks: How do we build AI that is unbiased and doesn’t systematically discriminate against underprivileged groups?

Transparency
Machine Learning systems are difficult to interrogate. In many cases, it’s impossible to determine how a machine learning algorithm made a decision or why it took an action.

When used in decision making, it’s often unclear why AI made a decision. A model that decides who can be released on bail can’t explain why it decided yes or no.

When AI fails to complete a task as expected, it’s often unclear why. Facial recognition that fails to recognize a user because they are wearing a hat can’t explain that the hat is the problem.
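
One basic way to interrogate a black-box model is permutation importance: shuffle a single input feature and measure how much accuracy drops. The scikit-learn model below is a toy and this is only a sketch; dedicated explanation tools (LIME, SHAP, and others) go much further.

```python
# Toy permutation-importance sketch (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)    # only feature 0 actually matters
model = LogisticRegression().fit(X, y)

baseline = model.score(X, y)
for feature in range(3):
    X_shuffled = X.copy()
    rng.shuffle(X_shuffled[:, feature])  # destroy this feature's signal
    drop = baseline - model.score(X_shuffled, y)
    print(f"feature {feature}: accuracy drop {drop:.2f}")
# Feature 0 shows a large drop; features 1 and 2 barely matter.
```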

AI Safety asks: How do we build AI that can explain its decisions? How do we build AI that can explain why it made the wrong decision?

Misuse
As with any tool, AI can be misused by malicious people. (See https://wyaber.com/ai-tools-tradeoff/)

A system that can write news articles could be repurposed to write fake news.

Machine Learning might be used in the development of autonomous weapons. These weapons would pose a significant risk in the hands of terrorists. If such systems are developed, what stops them from being used by someone other than the intended user?

AI Safety asks: How do we ensure that AI is only used for good causes?

Security
As with any software, AI is vulnerable to security attacks. A sophisticated, malicious actor can trick AI into failing its assigned task or taking unintended actions.

Existing object recognition models can be tricked into misclassifying images. A bus might be recognized as an ostrich. A security system that is supposed to detect intruders could be tricked into ignoring a person in the video feed.
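
The sketch below shows the core idea behind such attacks, in the spirit of the fast gradient sign method (Goodfellow et al., 2014): move each input feature a small step in the direction that most increases the model's error. The linear "classifier," labels, and step size are toys standing in for a real vision model.

```python
# Toy adversarial-perturbation sketch (illustrative only).
import numpy as np

w = np.array([1.0, -2.0, 0.5])   # weights of a toy linear classifier
b = 0.1

def predict(x):
    return "bus" if x @ w + b > 0 else "ostrich"

x = np.array([2.0, 0.5, 1.0])
print(predict(x))                # -> bus (the correct label)

# For a linear model, the gradient of the score w.r.t. the input is just
# w, so stepping each feature against sign(w) flips the decision. The
# step is exaggerated here; attacks on real models use imperceptible ones.
epsilon = 0.8
x_adv = x - epsilon * np.sign(w)
print(predict(x_adv))            # -> ostrich
```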

A Q&A messaging bot at a bank might be tricked into transferring money into an attacker's account.

AI Safety asks: How do we prevent malicious actors from abusing imperfect AI Systems?

Policy
AI is already having a large impact on our society, and it will raise a plethora of policy questions in the coming years.

AI might increase concentrations of power and undermine democracy. Large corporations with strong AI systems gain power by amassing vast datasets upon which AI is trained. This benefits the corporations but not the people the data was collected from. 

AI may put millions of people out of work. Trucking is among the largest professions in the U.S., but most of that work will be replaced by self-driving trucks.

AI Safety asks: How do we ensure that AI benefits all, not only a few? How do we handle the disruptions that will be caused by its development?

Ethics
Powerful AI Systems will be given broad goals that require knowledge of human values. We need to know what our values are before they can be taught to the AI.

AI will be asked to evaluate who gets access to medical treatment. How should it weigh the value of saving a life against adding 1 year of life to 50 people?

AI will make tradeoffs between the expense of completing a task and the amount of pollution that is created. How should it weigh the value of money against negative environmental impact?

AI Safety asks: How do we decide the values that AI promotes?

Control/Alignment
AI which is not aligned with our values will fail to understand our intentions and may be surprisingly destructive.

AI might “do too much of something”. A demolition AI asked to “clear some space” might try to level an entire city. It’s clearing space, just far more than intended.

AI might “do the right thing but in the wrong way”. A medical AI asked to eliminate cancer might decide to murder all humans. It did the right thing (no more cancer), but it did it in a terrible way.
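
The “clear some space” failure above can be made concrete with a toy objective, sketched below with made-up plans and numbers: when the stated reward omits something we care about, the highest-scoring plan is the destructive one.

```python
# Toy reward-misspecification sketch (illustrative only).
plans = {
    "demolish the old warehouse": {"space_cleared": 100, "homes_destroyed": 0},
    "level the entire city": {"space_cleared": 10_000, "homes_destroyed": 50_000},
}

def stated_reward(outcome):
    return outcome["space_cleared"]   # what we asked for

def intended_reward(outcome):
    # What we meant: clear space, but destroying homes is unacceptable.
    return outcome["space_cleared"] - 1_000 * outcome["homes_destroyed"]

print(max(plans, key=lambda p: stated_reward(plans[p])))    # level the entire city
print(max(plans, key=lambda p: intended_reward(plans[p])))  # demolish the old warehouse
```

The gap between those two reward functions is the alignment problem in miniature: we rarely manage to state everything we actually intend.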

AI Safety asks: How do we align AI with our values so that it does what we intend, not what we ask?

A Large Problem Set

As you can see, the AI Safety community has its work cut out for it. How exactly do we teach AI our values when humanity hasn’t agreed on a single set?

The Safety community asked what was missing from AI development and found the large set of hard problems above. As AI moves from the research lab into systems that affect real people, there needs to be greater investment in ensuring AI is not harmful. The problem set above is the focus of that work.

Afterword

This post was an attempt to understand what AI Safety is. In the next post, I will try to answer why AI Safety matters. I’ll look at questions like:

  • If a small number of people are impacted by the problems above, solving them may not be worth the investment.
  • If the broader AI community will solve them without the focus provided by the Safety community, there’s no need for a Safety community.
  • If these problems won’t arise in the near future, it might be better to leave them unsolved for now.

References

Some of the resources I found most helpful as I wrote this:
https://futureoflife.org/ai-principles/
https://80000hours.org/problem-profiles/positively-shaping-artificial-intelligence/
https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html
https://ai-alignment.com/ai-safety-vs-control-vs-alignment-2a4b42a863cc
https://www.technologyreview.com/s/612775/algorithms-criminal-justice-ai/
https://arxiv.org/abs/1606.06565
https://www.wired.com/story/machine-learning-identify-anonymous-code/
https://arxiv.org/abs/1809.06061
https://arxiv.org/abs/1802.08195
https://wyaber.com/ai-tools-tradeoff/
https://medium.com/josh-cowls/ai-and-the-trolley-problem-problem-ef48582b49bf
https://www.amazon.com/Superintelligence-Dangers-Strategies-Nick-Bostrom/dp/1501227742
