Gasgoo Munich - Gasgoo has learned that X Square Robot has officially launched its next-generation home robotics initiative. In a month, the first batch of robots powered by WALL-B, the company's new proprietary embodied intelligence foundation model, will move into real homes and begin learning to handle domestic tasks.
Prior to this, X Square Robot partnered with 58.com to deploy robots running its WALL-AS model in actual households. Working alongside domestic cleaners, the robots assisted with complex daily tasks inside homes, a deployment described as a global first and as the industry's first large-scale rollout of robots in complex consumer environments.

Image Source: X Square Robot
The standout feature of the newly released WALL-B is its architecture: it integrates vision, language, action, and physical prediction into a single network, trained jointly from scratch. Because there are no separate modules, there are no boundaries between them and no latency from passing data from one module to the next.
Built on this architecture, WALL-B delivers three core technical features that distinguish it from existing industry models:
First, native multimodality. From day one of training, WALL-B processed visual, auditory, language, tactile, and action data through synchronous annotation and joint training, achieving a "multimodal-in, multimodal-out" capability. This means the model doesn't need to translate information across different modules like a game of telephone. When it sees a cup, it prepares to reach out; when it senses weight, it adjusts its grip strength instantly.
Furthermore, WALL-B possesses an intrinsic sense of its own spatial dimensions—height, width, and reach—without needing to constantly monitor its entire body or rely on a dense array of external sensors. It can determine whether it can navigate a space or grasp an object. This is an innate spatial awareness, rather than one derived from external measurement or modeling.
Second, a physical "worldview." WALL-B can perceive and predict fundamental physical laws like gravity, inertia, friction, and velocity. In novel scenarios—such as a plate teetering half-off a table edge—the model can infer that the plate will fall and shatter, triggering preventive maneuvers.
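To make the plate example concrete, the snippet below is a minimal hand-written sketch of the physical fact involved: a uniform plate tips once its center of mass sits beyond the table edge. It is not how WALL-B works, since the article describes a learned prediction rather than an explicit rule. The function names, the uniform-plate assumption, and the measurements are illustrative assumptions.

```python
def stability_margin(plate_diameter_m: float, overhang_m: float) -> float:
    """Distance (in meters) from the plate's center to the table edge.

    Assumes a uniform circular plate, so the center of mass is at the
    geometric center. Positive means the center is still over the table;
    negative means it is past the edge.
    """
    if not 0 <= overhang_m <= plate_diameter_m:
        raise ValueError("overhang must be between 0 and the plate diameter")
    return plate_diameter_m / 2 - overhang_m


def will_tip(plate_diameter_m: float, overhang_m: float) -> bool:
    """True if the center of mass lies beyond the table edge."""
    return stability_margin(plate_diameter_m, overhang_m) < 0


# A 24 cm plate hanging 13 cm over the edge has its center 1 cm past the edge.
print(will_tip(0.24, 0.13))
# The same plate hanging 5 cm over the edge is stable.
print(will_tip(0.24, 0.05))
```

A plate that is exactly half-off sits at the balance point of this rule, which is why any small disturbance sends it over.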
Third, interactive self-evolution. Mainstream robots typically halt and return an error code when a task fails, and they cannot learn from the mistake. WALL-B operates differently: after a failure, it adjusts its strategy and tries again. If the retry succeeds, it updates its model parameters with that experience in real time.
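The fail, adjust, retry, and update cycle described above can be sketched as a simple loop. This is a toy illustration under stated assumptions, not X Square Robot's method: a single number (grip force) stands in for the model's parameters, and the class name `RetryPolicy`, its fields, and the step size are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RetryPolicy:
    """Toy stand-in for a model that adapts after failure (hypothetical)."""
    grip_force: float = 1.0      # starting strategy, arbitrary units
    force_step: float = 0.5      # how much to adjust after each failure
    max_attempts: int = 5
    successes: List[float] = field(default_factory=list)

    def run(self, attempt: Callable[[float], bool]) -> bool:
        """Try the task, adjusting the strategy after each failure."""
        force = self.grip_force
        for _ in range(self.max_attempts):
            if attempt(force):
                # On success, keep what worked so the next run starts there.
                self.successes.append(force)
                self.grip_force = force
                return True
            # On failure, adjust the strategy instead of halting with an error.
            force += self.force_step
        return False


# Hypothetical task: the grasp only holds at a force of 2.0 or more.
policy = RetryPolicy()
print(policy.run(lambda force: force >= 2.0), policy.grip_force)
```

After the first successful run, the policy starts from the force that worked, so a repeat of the same task succeeds on the first attempt.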
As robots enter the home, privacy concerns cannot be ignored. X Square Robot says it has addressed them with three measures:
Visual desensitization: The robot processes raw images in real time directly on the device, masking personal details. Raw images never leave the unit; the robot only "sees" scene data stripped of personal identifiers.
Transparent authorization: The robot only powers on after the user actively presses a consent button. There is no "default consent"—no approval means no activation.
Restricted usage: Data is never shared with third parties. The robot recognizes only one master and immediately locks down if it detects suspicious commands.
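The three safeguards above can be pictured with a short sketch. Everything in it is an assumption made for illustration, not the company's implementation: the names `mask_regions` and `ConsentGate`, the representation of an image as a grid of integers, and the idea that some separate detector supplies the regions to mask and flags a command as suspicious.

```python
from typing import List, Tuple

Frame = List[List[int]]            # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]    # top, left, bottom, right (exclusive)


def mask_regions(frame: Frame, boxes: List[Box]) -> Frame:
    """Return a copy of the frame with each box zeroed out.

    The boxes are assumed to come from an on-device detector; the raw
    frame itself is left unmodified and never needs to leave the device.
    """
    masked = [row[:] for row in frame]
    for top, left, bottom, right in boxes:
        for r in range(max(top, 0), min(bottom, len(masked))):
            for c in range(max(left, 0), min(right, len(masked[r]))):
                masked[r][c] = 0
    return masked


class ConsentGate:
    """Hypothetical gate: no consent means no activation."""

    def __init__(self, owner_id: str) -> None:
        self.owner_id = owner_id
        self.consented = False
        self.locked = False

    def press_consent(self) -> None:
        self.consented = True

    def handle(self, user_id: str, command: str, suspicious: bool = False) -> str:
        if self.locked:
            return "locked"
        if not self.consented:
            return "inactive"
        if suspicious:
            # Lock down as soon as a command is flagged as suspicious.
            self.locked = True
            return "locked"
        if user_id != self.owner_id:
            return "refused"
        return f"executing: {command}"


gate = ConsentGate(owner_id="owner")
print(gate.handle("owner", "tidy the table"))   # no consent given yet
gate.press_consent()
print(gate.handle("owner", "tidy the table"))
```

In this sketch a command from anyone other than the registered owner is refused without locking the robot; whether the real system locks in that case is not stated in the source.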