Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving
Z. Cao, E. Biyik, W. Z. Wang, A. Raventos, A. Gaidon, G. Rosman, D. Sadigh
Published in RSS 2020 - July 2020
Links: RSS 2020 page, video, arXiv, bibtex
Abstract
Autonomous driving has achieved significant progress in recent years, but autonomous cars are still unable to tackle high-risk situations where a potential accident is likely. In such near-accident scenarios, even a minor change in the vehicle's actions may result in drastically different consequences. To avoid unsafe actions in near-accident scenarios, we need to fully explore the environment. However, reinforcement learning (RL) and imitation learning (IL), two widely used policy learning methods, cannot model rapid phase transitions and do not scale to cover all the states. To address driving in near-accident scenarios, we propose a hierarchical reinforcement and imitation learning (H-ReIL) approach that consists of low-level policies learned by IL for discrete driving modes and a high-level policy learned by RL that switches between the driving modes. Our approach exploits the advantages of both IL and RL by integrating them into a unified learning framework. Experimental results and user studies suggest that our approach achieves higher efficiency and safety compared to other methods. Analyses of the learned policies demonstrate that the high-level policy appropriately switches between the low-level policies in near-accident driving situations.
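The H-ReIL architecture described above is naturally expressed as a two-level control loop: at each step, the RL-trained high-level policy selects one of the discrete driving modes, and the IL-trained low-level policy for that mode produces the actual control action. The sketch below illustrates this structure in Python/PyTorch; it is a minimal illustration under stated assumptions, not the authors' implementation, and all class names, network sizes, and the two-mode setup (e.g., a cautious mode and an aggressive mode) are placeholders.

```python
# Minimal sketch of the H-ReIL control loop described in the abstract.
# All names and architectures here are illustrative assumptions,
# not the authors' released code.

import torch
import torch.nn as nn


class LowLevelImitativePolicy(nn.Module):
    """One IL-trained policy per driving mode: observation -> control action."""

    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class HighLevelSwitchingPolicy(nn.Module):
    """RL-trained policy: observation -> distribution over driving modes."""

    def __init__(self, obs_dim: int, num_modes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, num_modes)
        )

    def forward(self, obs: torch.Tensor) -> torch.distributions.Categorical:
        return torch.distributions.Categorical(logits=self.net(obs))


def hreil_step(obs, high_level, low_level_policies):
    """One control step: pick a mode with RL, act with that mode's IL policy."""
    mode = high_level(obs).sample().item()
    action = low_level_policies[mode](obs)
    return mode, action


if __name__ == "__main__":
    obs_dim, act_dim, num_modes = 10, 2, 2  # e.g., cautious vs. aggressive
    high_level = HighLevelSwitchingPolicy(obs_dim, num_modes)
    low_level_policies = [
        LowLevelImitativePolicy(obs_dim, act_dim) for _ in range(num_modes)
    ]
    obs = torch.randn(obs_dim)
    mode, action = hreil_step(obs, high_level, low_level_policies)
    print(f"mode={mode}, action={action.detach().numpy()}")
```

The division of labor mirrors the reasoning in the abstract: keeping the mode set small and discrete lets RL explore the safety-critical switching decision efficiently, while IL handles continuous low-level control within each mode, where demonstrations are easy to collect.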
Video
RSS 2020 spotlight talk
Supplementary video
Bibtex
@inproceedings{cao2020reinforcement,
  title     = {Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving},
  author    = {Cao, Zhangjie and Biyik, Erdem and Wang, Woodrow Z. and Raventos, Allan and Gaidon, Adrien and Rosman, Guy and Sadigh, Dorsa},
  booktitle = {Robotics: Science and Systems XVI},
  year      = {2020},
  month     = {Jul},
  url       = {http://dx.doi.org/10.15607/rss.2020.xvi.039},
  doi       = {10.15607/rss.2020.xvi.039}
}