Li Auto unveils next-gen autonomous driving architecture MindVLA

Gabriella, Gasgoo | March 18, 2025 17:22 BJT

Beijing (Gasgoo) - On March 18, 2025, Jia Peng, Li Auto's head of autonomous driving technology R&D, delivered a keynote speech at NVIDIA GTC 2025, sharing the company's latest progress on MindVLA, its next-generation autonomous driving technology.

Photo credit: Li Auto

MindVLA is an innovative autonomous driving model based on a dual-system architecture integrating end-to-end learning and Vision-Language Models (VLM). As a new paradigm in large-scale robotic models, MindVLA endows autonomous vehicles with enhanced 3D spatial comprehension, logical reasoning, and behavior generation capabilities, allowing them to perceive, think, and adapt to dynamic environments.  

Unlike a simple combination of end-to-end and VLM models, MindVLA features an entirely new design. Its 3D spatial encoder is integrated with language models and logical reasoning to generate driving decisions, which are output as action tokens, a representation of the surrounding environment and the vehicle's driving behavior. A diffusion model then further optimizes these tokens to determine the optimal driving trajectory in real time, with all processing done on the vehicle.
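For readers who want to picture that data flow, the stages described above can be sketched as a toy Python pipeline. This is only an illustration under loud assumptions: every function name, token name, and number below is hypothetical, the "encoder" and "reasoning" steps are trivial rules, and the refinement loop is a simple iterative denoising stand-in rather than a real diffusion model. Nothing here reflects Li Auto's actual implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneFeatures:
    nearest_obstacle_m: float  # distance to the closest obstacle ahead


def encode_scene(obstacle_distances_m):
    """Hypothetical stand-in for the 3D spatial encoder."""
    return SceneFeatures(nearest_obstacle_m=min(obstacle_distances_m))


def reason_to_action_tokens(features, instruction):
    """Hypothetical stand-in for language-model reasoning that emits action tokens."""
    text = instruction.lower()
    slow = "slow down" in text or features.nearest_obstacle_m < 20.0
    tokens = ["SLOW_DOWN" if slow else "KEEP_SPEED"]
    tokens.append("LANE_LEFT" if "left lane" in text else "KEEP_LANE")
    return tokens


def refine_trajectory(tokens, n_waypoints=5, n_steps=20, seed=0):
    """Toy denoising loop (not a real diffusion model): start from random
    waypoints and repeatedly pull them toward a token-conditioned target."""
    rng = random.Random(seed)
    spacing = 5.0 if "SLOW_DOWN" in tokens else 10.0  # metres between waypoints
    lateral = 3.5 if "LANE_LEFT" in tokens else 0.0   # assumed lane width in metres
    target = [
        (spacing * (i + 1), lateral * (i + 1) / n_waypoints)
        for i in range(n_waypoints)
    ]
    traj = [(rng.gauss(0, 10), rng.gauss(0, 10)) for _ in range(n_waypoints)]
    for _ in range(n_steps):
        # Each step halves the remaining distance to the target waypoints.
        traj = [
            (x + 0.5 * (tx - x), y + 0.5 * (ty - y))
            for (x, y), (tx, ty) in zip(traj, target)
        ]
    return traj


if __name__ == "__main__":
    features = encode_scene([42.0, 15.0, 80.0])
    tokens = reason_to_action_tokens(features, "Take the left lane")
    print(tokens)
    print(refine_trajectory(tokens))
```

The point of the sketch is the ordering: scene encoding feeds reasoning, reasoning emits compact action tokens, and a separate refinement stage turns those tokens into a concrete sequence of waypoints.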

Leveraging a self-developed, unified cloud-based world model, MindVLA integrates 3D scenario reconstruction, generative view completion, and unseen perspective prediction to create a highly realistic simulation environment. This enables large-scale closed-loop reinforcement learning, allowing the model to continuously improve through experience. Li Auto said it has significantly optimized its world model over the past year, speeding up 3D Gaussian Splatting (3D GS) training more than sevenfold.

MindVLA redefines the autonomous driving experience by enabling vehicles to understand and respond to voice commands in real time. Users can issue natural language instructions, such as "Find me a supermarket," in an unfamiliar area without predefined navigation, and the vehicle will autonomously explore and locate the destination. Drivers can also make real-time adjustments, such as "Slow down" or "Take the left lane," which the system understands and executes.
