AIR-VLA+: Decoupling Movement and Manipulation via Cascaded Dual-Action Decoders with Asymmetric MoE for Aerial Robots

Robos News Newsroom

Editorial Desk

Published June 12, 2026 · Category: Robotics

Overview

Abstract (arXiv:2606.12859v1):

Aerial manipulation systems have long suffered from representation coupling in end-to-end control, because platform-level Unmanned Aerial Vehicle (UAV) movement and end-effector-level arm manipulation differ substantially in action scale, dynamics, and control objectives. In this paper, we propose AIR-VLA+, a flow matching action generation architecture designed specifically for aerial manipulation. It features cascaded dual-action decoders and an asymmetric feature-level Mixture of Experts (MoE).

We construct cascaded manipulation and movement decoders. The cascade lets the UAV unidirectionally observe the manipulator's intent during movement, which achieves workflow coordination while isolating arm manipulation stability from the backpropagation of UAV movement information. Because UAV movement in aerial manipulation depends heavily on high-level semantics and is responsible for task state transitions, we design an input feature enhancement module for the UAV movement decoder. This module introduces an implicit visual grasp projector to perceive the interaction state between the gripper and the object, and injects compressed global semantic features.

Within the UAV movement decoder, we deploy an implicit MoE architecture, in which different movement experts spontaneously develop capacity inclinations for different task stages during training. Dense soft blending on the feature manifold gives the UAV movement stronger task-stage adaptability.

Experiments on the standardized AIR-VLA benchmark show that our method surpasses all baselines, with an overall average score of 48.0. The overall task completion score improves by 80.2% compared to the single-head $\pi_{0.5}$ policy, effectively mitigating the heterogeneous coordinated control conflicts of composite robots.
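To make "dense soft blending" concrete, here is a minimal Python sketch of a dense (non-top-k) mixture of experts: every expert runs on the input feature, and the outputs are averaged with softmax gate weights. The abstract does not specify the gating function, expert structure, or dimensions, so the softmax gate, the function names (`dense_soft_moe`, `approach_expert`, `hover_expert`), and the toy numbers are all illustrative assumptions, not the authors' implementation.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of gate logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights: Sequence[Sequence[float]], x: Sequence[float]) -> Vector:
    """Matrix-vector product; each row of `weights` yields one output value."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def dense_soft_moe(
    feature: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    experts: Sequence[Callable[[Sequence[float]], Vector]],
) -> Vector:
    """Blend the outputs of ALL experts using softmax gate weights.

    Unlike top-k routing, no expert is dropped: every expert runs and
    contributes in proportion to its gate weight.
    """
    gates = softmax(linear(gate_weights, feature))
    outputs = [expert(feature) for expert in experts]
    dim = len(outputs[0])
    return [sum(g * out[i] for g, out in zip(gates, outputs)) for i in range(dim)]


# Hypothetical toy experts standing in for learned movement experts.
def approach_expert(x: Sequence[float]) -> Vector:
    return [2.0 * v for v in x]


def hover_expert(x: Sequence[float]) -> Vector:
    return [0.0 for _ in x]


feature = [1.0, -1.0]
# Assumed gate: all-zero weights give equal logits, hence uniform gates.
gate_weights = [[0.0, 0.0], [0.0, 0.0]]
blended = dense_soft_moe(feature, gate_weights, [approach_expert, hover_expert])
print(blended)
```

With uniform gates the result is the plain average of the two expert outputs. In a trained model the gate weights would be learned, so different experts could dominate at different task stages, which is the behavior the abstract describes.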

Source

Originally published at arxiv.org.

Source: https://arxiv.org/abs/2606.12859
