Active Learning for Reward Estimation in Inverse Reinforcement Learning
Manuel Lopes (22), Francisco Melo (23) and Luis Montesano (24)
(22) Instituto de Sistemas e Robótica - Instituto Superior Técnico, Lisboa, Portugal
(23) Carnegie Mellon University, Pittsburgh, PA, USA
(24) Universidad de Zaragoza, Zaragoza, Spain
Abstract
Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, we introduce active learning for inverse reinforcement learning. We propose an algorithm that allows the agent to query the demonstrator for samples at specific states, instead of relying only on samples provided at “arbitrary” states. The purpose of our algorithm is to estimate the reward function with accuracy similar to that of other methods from the literature while reducing the number of policy samples required from the expert. We also discuss the use of our algorithm in higher-dimensional problems, using both Monte Carlo and gradient methods. We present illustrative results of our algorithm in several simulated examples of different complexities.
Work partially supported by the ICTI and FCT, under the CMU-Portugal Program, the (POS_C) program that includes FEDER funds
and the projects ptdc/eea-acr/70174/2006, (FP6-IST-004370) RobotCub and (FP7-231640) Handle.