A latent world model built from an equivariant encoder $E$ and an equivariant predictor $f$ inherits a provable symmetry of its training loss: when the world's dynamics genuinely carry a group $G$ acting on latents by an orthogonal representation $\rho(g)$, the one-step prediction relMSE is exactly invariant across the whole group, so fitting the dynamics on a restricted slice of orientations mathematically determines them on the entire orbit (jǔ yī fǎn sān, "from one instance, infer three"). We verify this end-to-end at laptop scale (CPU/MPS, fully seeded). [A] The symmetry survives a real Muon/AdamW + EMA + VICReg run: the composed encode-then-predict residual is $\sim 10^{-6}$ after optimisation, not just at initialisation, and under any optimiser. [B] One-step error is flat to five digits across the group, while a same-hypothesis-class non-equivariant baseline fits the slice but breaks out-of-distribution (VN $\times 1.00$ vs baseline $\times 13.8$ in 2D, $\times 17.2$ in 3D, $\times 157$ over the full SE(3) ladder), with the equivariant model $4.5$-$7.4\times$ smaller. [C] The same isometry argument lifts to closed loop: under a matching equivariant planner the control trajectory at orientation $g$ is exactly $\rho(g)$ applied to the seen one, so closed-loop error is invariant across the group: float-floor-exact in 2D/SO(2) on real PushT and statistically flat in 3D/SE(3) (disjoint 95% CIs). We stress-test the prior against Sutton's Bitter Lesson: augmentation, brute-force scale, and soft-equivariance each close at most the gap in the across-group task metric and never reach float-floor exactness. Because equivariance is closed under composition, the $H$-fold rollout stays flat ($\times 1.00$, $\le 2\times 10^{-7}$) at every horizon, while the baseline's residual compounds with $H$. Out of scope: task-success sweeps, planner-free invariance, and scaling.
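The isometry argument in the abstract can be checked numerically on a toy case. The sketch below is not the paper's implementation: it assumes SO(2) acting on a latent made of 2-D vector channels, a hypothetical vector-neuron-style layer (`vn_layer`), stand-in equivariant ground-truth dynamics (`world_step`), and randomly initialised (untrained) weights. It only illustrates that an equivariant predictor gives an orientation-independent relMSE at one step and under an H-fold rollout, whereas a plain MLP on the flattened latent (`f_plain`, also hypothetical) does not.

```python
import numpy as np

rng = np.random.default_rng(0)
C = 8  # number of 2-D vector channels in the latent (hypothetical size)

def rot(theta):
    """2x2 rotation matrix, an orthogonal representation of SO(2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def act(Z, R):
    """Group action on a latent of shape (C, 2): rotate every channel."""
    return Z @ R.T

def vn_layer(Z, W):
    """Toy equivariant layer: channel mixing commutes with rotations,
    and the gain depends only on rotation-invariant channel norms."""
    Y = W @ Z
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    return Y / (1.0 + norms)

W_true, W1, W2 = (rng.normal(size=(C, C)) / np.sqrt(C) for _ in range(3))
V = rng.normal(size=(2 * C, 2 * C)) / np.sqrt(2 * C)

def world_step(Z):
    """Stand-in ground-truth dynamics, equivariant by construction."""
    return vn_layer(Z, W_true)

def f_equi(Z):
    """Equivariant predictor: a composition of equivariant layers."""
    return vn_layer(vn_layer(Z, W1), W2)

def f_plain(Z):
    """Non-equivariant baseline: an MLP layer on the flattened latent."""
    return np.tanh(Z.reshape(-1) @ V).reshape(C, 2)

def rel_mse(pred, target):
    return np.sum((pred - target) ** 2) / np.sum(target ** 2)

Z = rng.normal(size=(C, 2))
thetas = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)

def sweep(f, H=1):
    """relMSE of an H-step rollout at each orientation of the start state."""
    errs = []
    for t in thetas:
        pred = target = act(Z, rot(t))
        for _ in range(H):
            pred, target = f(pred), world_step(target)
        errs.append(rel_mse(pred, target))
    return np.array(errs)

def spread(errs):
    """Relative variation of the error across the sampled orientations."""
    return (errs.max() - errs.min()) / errs.mean()

equi_1, equi_5, plain_1 = sweep(f_equi), sweep(f_equi, H=5), sweep(f_plain)
print("equivariant, H=1:", spread(equi_1))
print("equivariant, H=5:", spread(equi_5))
print("plain MLP,   H=1:", spread(plain_1))
```

Because the predictor and the dynamics both commute with the rotation, the residual at orientation $g$ is the rotated residual of the reference orientation, and an orthogonal map preserves its norm. The spread for `f_equi` should therefore sit at floating-point round-off for any horizon, while the spread for `f_plain` is generically far larger.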
Exact equivariance, kept through training, buys zero-shot generalisation across the symmetry group
- Year: 2026
- Hosting: full text hosted, CC-BY-4.0
Attribution
- Abstract & full text: arxiv.org/abs/2606.03003 (CC-BY-4.0)
- TL;DR: Semantic Scholar