Multimodal Large Language Model Driven Scenario Testing For Autonomous Vehicles

Lu Qiujing, Wang Xuanhan, Jiang Yiwei, Zhao Guangming, Ma Mingyue, Feng Shuo. Arxiv 2024

Multimodal Models, Prompting, RAG, Reinforcement Learning, Tools

The generation of corner cases has become increasingly crucial for efficiently testing autonomous vehicles prior to road deployment. However, existing methods struggle to accommodate diverse testing requirements and often lack the ability to generalize to unseen situations, thereby reducing the convenience and usability of the generated scenarios. A method that facilitates easily controllable scenario generation for efficient autonomous vehicle (AV) testing with realistic and challenging situations is greatly needed. To address this, we propose OmniTester: a multimodal Large Language Model (LLM) based framework that fully leverages the extensive world knowledge and reasoning capabilities of LLMs. OmniTester is designed to generate realistic and diverse scenarios within a simulation environment, offering a robust solution for testing and evaluating AVs. In addition to prompt engineering, we employ tools from Simulation of Urban Mobility (SUMO) to reduce the complexity of the code generated by LLMs. Furthermore, we incorporate Retrieval-Augmented Generation (RAG) and a self-improvement mechanism to enhance the LLM’s understanding of scenarios, thereby increasing its ability to produce more realistic scenes. In the experiments, we demonstrate the controllability and realism of our approach in generating three types of challenging and complex scenarios. Additionally, we showcase its effectiveness in reconstructing new scenarios described in crash reports, driven by the generalization capability of LLMs.
