Reinforcement Learning AI commonly uses reward/penalty signals that are objective and explicit in an environment -- e.g. game score, completion time, etc. -- in order to learn the optimal strategy for task performance. However, Human-AI interaction for such AI agents should include additional reinforcement that is implicit and subjective -- e.g. human preferences for certain AI behavior -- in order to adapt the AI behavior to idiosyncratic human preferences. Such adaptations would mirror naturally occurring processes that increase trust and comfort during social interactions. Here, we show how a hybrid brain-computer-interface (hBCI), which detects an individual's level of interest in objects/events in a virtual environment, can be used to adapt the behavior of a Deep Reinforcement Learning AI agent that is controlling a virtual autonomous vehicle. Specifically, we show that the AI learns a driving strategy that maintains a safe distance from a lead vehicle, and most novelly, preferentially slows the vehicle when the human passengers of the vehicle encounter objects of interest. This adaptation affords an additional 20% viewing time for subjectively interesting objects. This is the first demonstration of how an hBCI can be used to provide implicit reinforcement to an AI agent in a way that incorporates user preferences into the control system.
Towards personalized human AI interaction - adapting the behavior of AI agents using neural signatures of subjective interest
Reinforcement Learning AI commonly uses reward/penalty signals that are objective and explicit in an environment -- e.g. game score, completion time, etc. -- in order to learn the optimal strategy for task performance.
- Year
- 2017
- Hosting
- Abstract onlyARXIV-DEFAULT
Cite
Notes
Only stored in your browser.
Attribution
- Abstract & full text
- arxiv.org/abs/1709.04574ARXIV-DEFAULT
- TL;DR
- Semantic Scholar