
Project

Leveraging environment structure to learn reusable skills for autonomous agents (FWOTM821)

A Reinforcement Learning agent learns how best to behave in its environment by repeatedly performing actions and observing the results. The task to be solved is expressed as a reward signal, and the agent learns which actions to perform under which conditions in order to maximize it. Reinforcement Learning agents can learn even when there is a delay between an action and its effect on the reward signal, but learning becomes much slower as this delay increases. This delay usually arises because the task is complex and only succeeds or fails (thus producing a positive or negative reward) at the very end.

Hierarchical Reinforcement Learning divides a task into simpler sub-tasks that are easier to learn. This divide-and-conquer approach yields a more informative reward signal, with less delay, as the agent receives a reward whenever it completes a sub-task. This greatly speeds up learning, much as identifying intermediate goals helps people grasp a complex task.

We propose to apply Hierarchical RL to complex but structured problems in order to speed up learning, and to allow the agent to quickly adapt to changes in its environment by reusing skills it already masters. We will design original algorithms that allow an agent to discover structure and intermediate goals in a problem, identify similar sub-tasks in order to generalize its knowledge, and learn how best to accomplish new tasks.
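The idea of rewarding sub-task completion can be illustrated with a minimal sketch. The toy task below (a corridor where the agent must first visit a "key" state and then reach a "door" state), the state layout, and the reward values are all illustrative assumptions, not part of the project itself; the sketch just shows how an intermediate reward for completing a sub-goal shortens the delay between action and feedback, here with plain tabular Q-learning.

```python
import random

random.seed(0)

# Illustrative toy task: a corridor of N positions; the agent must first
# visit the KEY position, then reach the DOOR. State = (position, has_key).
N, KEY, DOOR = 10, 4, 9
ACTIONS = (-1, +1)  # move left / move right

def step(pos, has_key, action):
    nxt = min(max(pos + action, 0), N - 1)
    got_key = has_key or nxt == KEY
    done = got_key and nxt == DOOR
    return nxt, got_key, done

def hierarchical_reward(prev_key, got_key, done):
    # Completing the sub-task (picking up the key) gives an immediate
    # reward, so the agent is not limited to the sparse end-of-task signal.
    if not prev_key and got_key:
        return 0.5
    return 1.0 if done else 0.0

# Tabular Q-learning over the augmented state (position, has_key).
Q = {(p, k, a): 0.0 for p in range(N) for k in (False, True) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.95, 0.1

for _ in range(2000):
    pos, has_key, done = 0, False, False
    for _ in range(100):
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(pos, has_key, b)])
        nxt, got_key, done = step(pos, has_key, a)
        r = hierarchical_reward(has_key, got_key, done)
        best_next = 0.0 if done else max(Q[(nxt, got_key, b)] for b in ACTIONS)
        Q[(pos, has_key, a)] += alpha * (r + gamma * best_next - Q[(pos, has_key, a)])
        pos, has_key = nxt, got_key
        if done:
            break

# After training, the greedy policy from the start heads right, toward the key.
print(max(ACTIONS, key=lambda b: Q[(0, False, b)]))  # 1 = move right
```

With only the sparse end-of-episode reward, the same learner needs far more episodes to propagate value back from the door; the sub-task bonus at the key is what makes the signal informative early on.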
Date: 1 Oct 2016 → 31 Dec 2020
Keywords: computer science
Disciplines: Numerical analysis