An LLM-Explainable DRL Framework for Passenger-Directed Autonomous Driving

Autonomous vehicles promise safer and more efficient mobility, yet public trust remains limited because their decision-making lacks transparency. This work addresses that gap by combining deep reinforcement learning (DRL) for adaptive driving control with large language model (LLM)-based explainability modules that communicate agent behavior to passengers. DRL agents were trained in simulation with a Dueling Double Deep Q-Network to follow three distinct driving requests: fast, comfort, and stop. The trained agents demonstrated stable learning, safe compliance with traffic rules, and reliable switching between modes within a single trip. In parallel, LLM modules were introduced to interpret passenger requests, determine when explanations were needed, and generate concise, safety-oriented justifications. Results show that the framework, which serves as a proof of concept for integrating RL decision-making with LLMs, balances safety, adaptability, and explainability, and is most effective when requests are delayed or overridden due to safety constraints.
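
For readers unfamiliar with the Dueling Double Deep Q-Network named above, the following is a minimal sketch of its two standard ingredients: the dueling aggregation of a state value and per-action advantages into Q-values, and the Double DQN bootstrap target. It is not the implementation used in this work. The function names, the three-action example, and all numeric values are hypothetical, and the mean-advantage baseline is assumed as the aggregation rule.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_dqn_target(
    reward: float,
    done: bool,
    gamma: float,
    next_q_online: List[float],
    next_q_target: List[float],
) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical example with three discrete actions.
q_online_next = dueling_q_values(1.0, [0.5, -0.5, 0.0])
q_target_next = [2.0, 3.0, 1.0]
target = double_dqn_target(1.0, False, 0.5, q_online_next, q_target_next)
```

In this sketch the online estimate picks the first action, so the target bootstraps from the target network's value for that action rather than from its own maximum. This decoupling is what Double DQN uses to reduce overestimation of Q-values.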