Gasgoo Munich - Dobot has unveiled its self-developed world action model, the Kongyi DobotWAM Embodied Large Model.
On the LIBERO benchmark for embodied intelligence, the Kongyi DobotWAM Embodied Large Model cleared four standard task suites: LIBERO-Spatial, LIBERO-Object, LIBERO-Goal, and LIBERO-10. Across these suites, which cover spatial understanding, object generalization, goal-instruction following, and long-horizon task execution, the model achieved an average success rate of 99.25%. That performance puts it ahead of public models such as π0.5, π0, GR00T-N1.5, and π0+FAST, as well as other industry baselines with published results.

Image Credit: Dobot
Notably, the model achieved a perfect 100/100 score on LIBERO-Object, while hitting 99/100 in the Spatial, Goal, and LIBERO-10 suites.
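The headline average can be checked against the per-suite figures. The short Python sketch below assumes the 99.25% figure is an unweighted mean over the four suites, with each score read as a percentage; the article does not state how the average was computed, and the names `suite_scores` and `mean_success_rate` are illustrative only.

```python
# Per-suite success rates (percent) as reported in the article.
suite_scores = {
    "LIBERO-Spatial": 99,
    "LIBERO-Object": 100,
    "LIBERO-Goal": 99,
    "LIBERO-10": 99,
}


def mean_success_rate(scores):
    """Unweighted mean across suites (an assumption, not confirmed by Dobot)."""
    return sum(scores.values()) / len(scores)


average = mean_success_rate(suite_scores)
print(f"{average:.2f}%")  # prints "99.25%"
```

Under that equal-weighting assumption, the four reported scores reproduce the published 99.25% average.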
Vision-language-action (VLA) models have emerged as a key paradigm for embodied intelligence over the past two years. By unifying visual observation, language instructions, and robot actions, they learn efficiently and excel in well-defined scenarios with ample data. Yet an over-reliance on 2D image patterns or offline trajectory imitation can lead to action drift, goal loss, or success on individual steps without completing the broader task, especially when the model faces spatial disturbances, object variations, complex workflows, or real-world contact feedback.
Dobot’s approach builds on standard vision-language-action modeling by incorporating 3D spatial understanding, robotic motion geometric constraints, and a real-world data closed-loop mechanism. This allows the model to go beyond merely “mimicking actions” to actually “understanding the reasoning behind them.”









