Workshop on Reliable Artificial Intelligence 2017


With recent rapid progress in machine learning and artificial intelligence, expert attention is increasingly turning to the impact of these fields on society. Awareness is growing of the potential for serious incidents caused by, for example, design error, badly specified objectives, or outright misapplication of AI.

Multiple institutions, both public and private, are therefore working to make existing AI systems more robust against such failure modes, or on foundational extensions of the field of AI that allow the construction of agents that are safe by design. These institutions include OpenAI, DeepMind, the Future of Humanity Institute at the University of Oxford, and the Machine Intelligence Research Institute.

Bringing a taste of this research to Switzerland, the Workshop on Reliable Artificial Intelligence was held at ETH Zürich on 28 October 2017. Featuring talks by

  • Max Daniel (Effective Altruism Foundation),
  • Victoria Krakovna (Google DeepMind),
  • Owain Evans (University of Oxford),
  • Felix Berkenkamp (ETH Zürich),
  • Will Sawin (ETH Zürich) and
  • Bas Steunebrink (NNAISENSE),

the workshop brought together students and researchers for a day of discussion on technical aspects of building safe artificial intelligence.

See below for selected talks and slides.


An Overview of the AI Safety Landscape

Max Daniel

Recent years have seen a surge of interest in the societal consequences of increasingly autonomous and capable artificial intelligence (AI) systems. Alongside work on the potential economic and legal challenges associated with AI, a growing body of technical AI and machine learning research aims to provide the engineering and design foundations for ensuring that future AI systems remain safe and beneficial. This talk gives an overview of this thriving field of AI safety, with a focus on the following questions: What are the technical problems addressed by AI safety research, and how do they relate to both short-term and long-term risks and benefits of AI? Who are the key actors funding and conducting AI safety research? And how can interested students and researchers get involved?

Slides. See also the slides from the closing remarks given by Max Daniel, containing information on groups doing technical AI safety research and other useful resources.

Growing Robust & Safe AI: Let's be Realistic

Bas Steunebrink

Learning values and ethics under pragmatic real-world constraints requires a new, developmental approach to training AIs. This approach places significantly more responsibility on the "teachers" of an AI than on its designers and builders. For an AI to move from a brittle to a robust understanding of what the referents of (possibly evolving) ethics specifications really mean, it must properly ground its knowledge in the pragmatics of the world. The developmental approach therefore requires that we figure out not only which things to teach in which order, but also how to measure progress, so that it ultimately becomes feasible to certify an AI's ability to reliably predict ethics violations and to report and act on them accordingly.


Reinforcement Learning with a Corrupted Reward Channel

Victoria Krakovna

No real-world reward function is perfect. Reward misspecifications, sensory errors, and software bugs may result in RL agents observing higher (or lower) rewards in some states than they should, which can lead to undesirable or dangerous behavior. We formalize this problem as a generalized Markov Decision Problem, and show that traditional RL methods fare poorly in this setting, even under strong simplifying assumptions and when trying to compensate for the possibly corrupt rewards. We develop an abstract framework for giving the agent richer data by cross-checking reward information between different states. This framework encompasses inverse reinforcement learning and semi-supervised reinforcement learning, and helps the agent overcome reward corruption under some assumptions.

Slides; paper.

Safe Reinforcement Learning in Robotics with Bayesian Models

Felix Berkenkamp

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, most reinforcement learning algorithms rely on random exploration to find optimal policies, which may be harmful in real-world systems such as robots. As a consequence, learning algorithms are rarely applied to safety-critical systems in the real world. In this talk, we show how the uncertainty information in Bayesian models can be used to make safe and informed decisions in both policy search and model-based reinforcement learning. Moreover, we show how these algorithms can be applied to physical quadrotor vehicles.



The Workshop on Reliable Artificial Intelligence was organized by MIRIxZürich. Contact Marko Thiel, Michal Pokorný, or Matthew Rahtz at