Research
Frontier robotics research: arXiv cs.RO papers, embodied AI, VLA models, manipulation, navigation and learning systems.
Latest in Research
30 stories
EWAM: An Enhanced World Action Model for Closed-Loop Online Adaptation in Embodied Intelligence
arXiv:2606.12690v1 Announce Type: new Abstract: In this paper, we propose the Enhanced World Action Model (EWAM), a closed-loop online adaptation architecture b…
EquiDexFlow: Contact-Grounded SE(3)-Equivariant Dexterous Grasp Generative Flows
arXiv:2606.12728v1 Announce Type: new Abstract: Most learned dexterous grasp generators relegate contact forces to a downstream verification step, so a kinemati…
Sparse2Act: Learning Action-Aligned Sparse 3D Representations for Cross-Domain Robot Manipulation
arXiv:2606.12759v1 Announce Type: new Abstract: Explicit 3D representations are attractive for manipulation because they expose object shape, workspace geometry…
EmbodiSteer: Steering Embodiment-Agnostic Visuomotor Policies with Joint-Space Guidance for Zero-Shot Cross-Embodiment Deployment
arXiv:2606.12965v1 Announce Type: new Abstract: Scalable robot imitation learning relies on large-scale heterogeneous data from diverse robots or body-free data…
Y-BotFrame: An Extensible Embodied Agent Framework for Quadruped Robot Assistants
arXiv:2606.13049v1 Announce Type: new Abstract: Quadruped robots are capable of traversing a wide range of complex terrains with high flexibility. As highly mob…
EA-WM: Event-Aware World Models with Task-Specification Grounding for Long-Horizon Manipulation
arXiv:2606.13053v1 Announce Type: new Abstract: Pretrained-feature world models provide a useful substrate for robot imagination, but visual or latent predictio…
Redesigning Regularization for Effective Policy Smoothing
arXiv:2606.13169v1 Announce Type: new Abstract: This paper proposes a novel regularization design to effectively smooth policy functions in reinforcement learni…
WT-UMI: Tactile-based Whole-Body Manipulation via Force-Supervised Contact-Aware Planning
arXiv:2606.13232v1 Announce Type: new Abstract: Whole-body humanoid manipulation of bulky, deformable, and shared-load objects requires distributed contact sens…
MCR-Bionic Hand: Anatomical Structural Priors for Dexterous Manipulation
arXiv:2606.13601v1 Announce Type: new Abstract: Dexterous robotic hands are usually formulated as high dimensional active control systems governed by degrees of…
Scale Buys Interpolation, Structure Buys a Horizon: Certified Predictability for Equivariant World Models
arXiv:2606.13092v1 Announce Type: cross Abstract: Scale buys interpolation; structure buys a certified horizon. A world model's average error says nothing about…
MaskWAM: Unifying Mask Prompting and Prediction for World-Action Models
arXiv:2606.13515v1 Announce Type: cross Abstract: World Action Models (WAMs) present a promising paradigm for robotic control via video prediction. However, cur…
LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories
arXiv:2606.13578v1 Announce Type: cross Abstract: Scientific laboratories increasingly rely on AI systems to reason about experiments, but the physical act of d…
Learning Robot Safety from Sparse Human Feedback using Conformal Prediction
arXiv:2501.04823v2 Announce Type: replace Abstract: Ensuring robot safety can be challenging; user-defined constraints can miss edge cases, policies can become …
WEAVER, Better, Faster, Longer: An Effective World Model for Robotic Manipulation
arXiv:2606.13672v1 Announce Type: new Abstract: The potential impacts of world models (WMs, i.e., learned simulators) on robotics are far-reaching -- policy eva…
NavWAM: A Navigation World Action Model for Goal-Conditioned Visual Navigation
arXiv:2606.13494v1 Announce Type: new Abstract: Goal-conditioned visual navigation requires a robot to act under partial observability by anticipating how its m…
Humor Style Drives Laughter, Topic Shapes Acceptability: Evaluating Bilingual Personal and Political Robot-Delivered AI Jokes
arXiv:2606.13256v1 Announce Type: new Abstract: Humor plays a central role in human social relationships, and recent advances in computational humor create new …
Embedding ISO 10218 Safety Compliance in Robots via Control Barrier Functions for Human-Robot Collaboration
arXiv:2606.13203v1 Announce Type: new Abstract: Human-Robot Collaboration (HRC) requires strict adherence to safety standards, such as ISO 10218, to prevent har…
Comparing Commercial Depth Sensor Accuracy for Medical Applications
arXiv:2606.13028v1 Announce Type: new Abstract: Depth estimation has numerous medical and surgical applications. We benchmark four depth sensors on a porcine bo…
SERF: Spatiotemporal Environment and Robot Feature Map for Long-Horizon Mobile Manipulation
arXiv:2606.12956v1 Announce Type: new Abstract: Long-horizon robot mobile manipulation requires continual reasoning about localization, environment changes, and…
Bounding Boxes as Goals: Language-Conditioned Grasping via Neuro-Symbolic Planning
arXiv:2606.12910v1 Announce Type: new Abstract: For robotics to be effectively integrated into household or industrial environments, machines must adapt to natu…
Learning to Adapt: Representation-Based Reinforcement Learning for Multi-Task Skill Transfer
arXiv:2606.12890v1 Announce Type: new Abstract: Reinforcement learning has achieved remarkable success in learning complex control policies, yet its applicabili…
AIR-VLA+: Decoupling Movement and Manipulation via Cascaded Dual-Action Decoders with Asymmetric MoE for Aerial Robots
arXiv:2606.12859v1 Announce Type: new Abstract: Aerial manipulation systems have long suffered from representation coupling in end-to-end control, as platform-l…
EgoEngine: From Egocentric Human Videos to High-Fidelity Dexterous Robot Demonstrations
arXiv:2606.12604v1 Announce Type: new Abstract: Dexterous manipulation is limited by the cost of collecting large-scale robot demonstrations. Egocentric human v…
Active Semantic Perception
arXiv:2510.05430v2 Announce Type: replace Abstract: We develop an approach for active semantic perception, which refers to using the semantics of the scene for …
ReactEMG Stroke: Healthy-to-Stroke Few-shot Adaptation for sEMG-Based Intent Detection
arXiv:2601.22090v2 Announce Type: replace Abstract: Surface electromyography (sEMG) is a promising control signal for assist-as-needed hand rehabilitation after…
SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models
arXiv:2602.04208v2 Announce Type: replace Abstract: Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robotic control…
Adaptive-Horizon Conflict-Based Search for Closed-Loop Multi-Agent Path Finding
arXiv:2602.12024v2 Announce Type: replace Abstract: MAPF is a core coordination problem for large robot fleets in automated warehouses and logistics. Existing a…
AssemLM: A Spatial Reasoning Multimodal Large Language Model for Robotic Assembly
arXiv:2604.08983v2 Announce Type: replace Abstract: Spatial reasoning is a fundamental capability for embodied intelligence, especially for fine-grained manipul…
From Digital to Physical: Digital Agents as Autonomous Coaches for Physical Intelligence
arXiv:2601.21570v2 Announce Type: replace-cross Abstract: The field of Embodied AI is witnessing a rapid evolution toward general-purpose robotic systems, fuele…
Triangle Splatting SLAM
arXiv:2605.31419v2 Announce Type: replace-cross Abstract: We present a dense RGB-D SLAM system using differentiable triangles as the 3D map representation. Whil…