Who reported this story?

This story was reported by arXiv cs.RO.

DiM-WAM: World-Action Modeling with Diverse Historical Event Memory

Robos News Newsroom

Editorial Desk

Published June 29, 2026 · Category: Robotics

Overview

arXiv:2606.27677v1 (announce type: new)

Abstract: World-action models have shown promising robot-manipulation performance by jointly predicting future visual states and actions. However, existing methods mainly rely on short-term history and short-horizon future prediction, which is insufficient for long-horizon tasks whose correct execution depends on earlier observations and task progress. Such temporally dependent tasks require effective use of complementary temporal information, including recent local context, cross-stage historical events, immediate future dynamics, and global task progress.

To address long-term forgetting and poor awareness of the global task state, we introduce DiM-WAM, a memory-augmented world-action model that integrates multi-scale historical context, local future dynamics, and global task progress. The memory extracts compact visual event information from real observations, updates multiple memory banks through independent similarity-based merging, and then reads the bank-identity- and time-embedded long-term context to condition video and action denoising. A progress-supervision objective further encourages memory tokens to encode not only completed historical events but also the current task stage and its implications for the remaining task.

On RMBench, DiM-WAM raises average success from 28.4% with LingBot-VA to 69.8%, exceeding the explicit-memory Mem-0 baseline at 42.0%. On four real-world Franka tasks, it improves average stage success from 70.7% to 91.5% and full-task success from 52.5% to 80.0%.

Project page: https://wangkai-casia.github.io/dim-wam/
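The abstract describes the memory only at a high level, so the following is a minimal Python sketch of one way similarity-based merging into several independently updated banks could work. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `MemoryBank` class, the `read` helper, cosine similarity as the measure, merging only adjacent slots, count-weighted averaging, the bank capacities, and the toy 2-D vectors standing in for visual event tokens. Bank identity and time are carried as plain tags rather than learned embeddings.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class MemoryBank:
    """Hypothetical fixed-capacity bank; each slot is [vector, time, count]."""
    capacity: int
    slots: list = field(default_factory=list)

    def __post_init__(self):
        if self.capacity < 1:
            raise ValueError("capacity must be at least 1")

    def write(self, vec, t):
        # Append the new event token, then merge if the bank overflows.
        self.slots.append([list(vec), float(t), 1])
        if len(self.slots) > self.capacity:
            self._merge_most_similar()

    def _merge_most_similar(self):
        # Assumption: only temporally adjacent slots merge, so order is preserved.
        i = max(range(len(self.slots) - 1),
                key=lambda k: cosine(self.slots[k][0], self.slots[k + 1][0]))
        (va, ta, ca), (vb, tb, cb) = self.slots[i], self.slots[i + 1]
        n = ca + cb
        merged = [(x * ca + y * cb) / n for x, y in zip(va, vb)]
        # Count-weighted mean keeps earlier merges from being washed out.
        self.slots[i:i + 2] = [[merged, (ta * ca + tb * cb) / n, n]]


def read(banks):
    """Flatten all banks into (bank_id, time, vector) records for conditioning."""
    return [(b, t, v) for b, bank in enumerate(banks) for v, t, _ in bank.slots]


# Two banks with different capacities see the same stream but merge independently.
banks = [MemoryBank(capacity=2), MemoryBank(capacity=4)]
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.0]]
for t, frame in enumerate(frames):
    for bank in banks:
        bank.write(frame, t)

for bank_id, time, vec in read(banks):
    print(bank_id, round(time, 2), [round(x, 2) for x in vec])
```

In this toy setup the smaller bank is forced into a coarse summary of the whole stream, while the larger bank keeps finer-grained events, which is one plausible reading of "multi-scale historical context". The actual extraction, merging rule, and embeddings used by DiM-WAM may differ.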

Source

Originally published at arxiv.org.

Source: https://arxiv.org/abs/2606.27677

