
Project

A StarAI Approach to Safety in Learning Systems

Artificial intelligence has become ubiquitous in daily life. However, numerous accidents in AI research and applications have demonstrated its risks, ranging from language models producing abusive output to robotic systems causing physical harm. As AI systems are deployed more widely, the concern that they may cause harm grows with them. How to design AI systems that align with the intentions of their human designers while avoiding undesirable side effects remains an open question.

This thesis approaches this question from a logical standpoint: we formally specify desirable safety behavior using logic formulae and incorporate these specifications into existing learning systems. Our focus is on providing safety guarantees for learning systems in relational, stochastic and partially observable environments. We specifically explore the integration of machine learning and verification from the perspective of statistical relational AI (StarAI), a research field that focuses on the design of intelligent agents with imperfect sensors that act in a stochastic, relational world. Although safety has not traditionally been a focus of StarAI, the field has developed expressive inference and learning frameworks that can be used to provide safety guarantees for learning systems. We explore how StarAI representations and techniques can be applied and extended to this end.
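To make this concrete, a safety requirement can be written as a formula of probabilistic computation tree logic (PCTL), the property language used by probabilistic model checkers. The property below is only a hypothetical illustration; it states that the probability of ever reaching a collision state is at most 0.01:

$\mathrm{P}_{\leq 0.01}\,[\,\mathrm{F}\ \mathit{collision}\,]$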

The first contribution is a new probabilistic model checking framework for relational models called PCTL-REBEL. By operating at the relational level instead of the propositional level, PCTL-REBEL is more efficient than existing model checkers and can verify infinite models. The second contribution is Probabilistic Logic Policy Gradient (PLPG), a probabilistic logic framework for safe reinforcement learning. PLPG can be seamlessly combined with any policy gradient algorithm while preserving its convergence guarantees, and yields safer and more rewarding policies than other state-of-the-art shielding techniques. The third contribution is an efficient parameter learning technique for probabilistic logic programs. It speeds up EM learning and enables parameter learning for multi-valued random variables, which traditional parameter learning algorithms cannot handle directly.
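The following is a minimal, illustrative sketch of the general idea behind probabilistic shielding, not the actual PLPG implementation: a base policy's action distribution is reweighted by per-action safety probabilities, which could for instance be obtained by querying a probabilistic logic program encoding the safety constraint, and then renormalized. All names and numbers are hypothetical.

import numpy as np

def shielded_policy(action_probs, safety_probs):
    # action_probs: base policy pi(a|s) over the available actions (sums to 1)
    # safety_probs: P(safe | s, a) per action, e.g. obtained by querying a
    #               probabilistic logic model of the safety constraint
    weighted = np.asarray(action_probs, dtype=float) * np.asarray(safety_probs, dtype=float)
    total = weighted.sum()
    if total == 0.0:
        # every action is judged unsafe; fall back to the unshielded policy
        return np.asarray(action_probs, dtype=float)
    return weighted / total

# Hypothetical example: the second action is very likely unsafe,
# so the shielded distribution shifts probability mass away from it.
print(shielded_policy([0.5, 0.3, 0.2], [0.99, 0.05, 0.90]))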

For all approaches, we present experimental results that demonstrate their practical relevance and computational feasibility, and that show the benefit of exploiting first-order logic and probabilistic semantics for safety. The implementations are available at https://github.com/wenchiyang.

Date: 24 Sep 2018 → 20 Jun 2023
Keywords: verification, artificial intelligence
Disciplines: Applied mathematics in specific fields, Computer architecture and networks, Distributed computing, Information sciences, Information systems, Programming languages, Scientific computing, Theoretical computer science, Visual computing, Other information and computing sciences
Project type: PhD project