Beyond Monotonic Progress: Retry-Supervised Value Learning for Robot Imitation

Robos News Newsroom

Editorial Desk

Published June 24, 2026 · Category: Robotics

Overview

arXiv:2606.24633v1 (announce type: new)

Abstract: Human demonstrations for robot imitation learning often contain mistakes and corrective behaviors, such as imprecise grasps, object misalignment, unstable contact, and repeated attempts. While these segments are commonly treated as noisy or suboptimal data, they provide valuable evidence about when execution deviates from a desirable path and how task feasibility can be restored. However, existing reward and value models often rely on monotonic progress assumptions, which capture coarse task advancement but may overlook local execution errors and corrective behaviors in imperfect demonstrations. In this work, we propose ReTVL (ReTry-Supervised Value Learning), a framework for learning mistake-sensitive value functions from mixed-quality robot demonstrations by leveraging retry events as sparse supervision. ReTVL captures the local degradation-and-recovery structure around mistakes by combining global progress calibration with local pairwise preference learning induced by sparsely annotated retry keypoints. The learned value model is then used to reweight demonstration chunks for downstream behavior cloning, reducing the influence of harmful execution errors while preserving useful corrective behaviors. Experiments on real-robot manipulation tasks show that ReTVL produces more fine-grained value estimates than progress-based baselines and improves imitation learning from imperfect demonstrations.
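To make the two ideas in the abstract concrete, here is a minimal Python sketch of (1) a loss that combines a global progress term with a local pairwise preference term around retry keypoints, and (2) value-based reweighting of demonstration chunks. The abstract does not give the actual loss forms, so everything specific here is an assumption made for illustration: the linear progress target, the Bradley-Terry style logistic preference loss, the exponential weighting rule, and the hypothetical names `retry_pairs`, `combined_loss`, `chunk_weights`, `window`, `lam`, and `temperature`. It should not be read as the authors' implementation.

```python
import numpy as np

def retry_pairs(retry_idx, T, window=3):
    """Build (better, worse) frame-index pairs around each retry keypoint.

    Assumption: frames just before the mistake and just after recovery
    should be valued higher than the annotated retry frame itself.
    """
    pairs = []
    for k in retry_idx:
        for d in range(1, window + 1):
            if k - d >= 0:
                pairs.append((k - d, k))
            if k + d < T:
                pairs.append((k + d, k))
    return pairs

def combined_loss(values, retry_idx, window=3, lam=1.0):
    """Global progress calibration plus local pairwise preference loss."""
    values = np.asarray(values, dtype=float)
    T = len(values)
    # Global term (assumed): squared error to a linear progress label in [0, 1].
    global_term = np.mean((values - np.linspace(0.0, 1.0, T)) ** 2)
    pairs = retry_pairs(retry_idx, T, window)
    if not pairs:
        return global_term
    better, worse = np.array(pairs).T
    # Local term (assumed): -log sigmoid(v_better - v_worse), computed stably.
    local_term = np.mean(np.logaddexp(0.0, -(values[better] - values[worse])))
    return global_term + lam * local_term

def chunk_weights(values, chunk_len, temperature=0.1):
    """Weight each chunk by exp(value gain / temperature), normalised to mean 1."""
    values = np.asarray(values, dtype=float)
    starts = np.arange(0, len(values) - chunk_len, chunk_len)
    gains = values[starts + chunk_len] - values[starts]
    w = np.exp(gains / temperature)
    return w / w.mean()

# A trace that dips at the retry frame fits the combined objective better
# than a purely monotonic one.
T, k = 11, 5
monotonic = np.linspace(0.0, 1.0, T)
dipped = monotonic.copy()
dipped[k] -= 0.5
print(combined_loss(dipped, [k]) < combined_loss(monotonic, [k]))

# Chunks over which the estimated value drops receive smaller weights.
print(chunk_weights([0.0, 0.2, 0.4, 0.1, 0.3, 0.5, 0.7], chunk_len=2))
```

Under these assumptions, a chunk's weight would multiply its per-chunk behavior cloning loss, so segments where the estimated value falls contribute less to the policy update while segments where it recovers keep their influence.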

Source

Originally published at arxiv.org.

Source: https://arxiv.org/abs/2606.24633

