🤖 Humanoid 🦾 Industrial & Cobot 🚚 AGV / AMR 🐕 Quadruped ⚙️ Reducers · Servos · Sensors 🚁 Drones & Autonomy 🧠 Embodied AI
Robos News
Robotics

Self-Driving Negotiator: An interactive, verifiable benchmark for social negotiation and theory of mind under hidden intent

arXiv:2606.15139v1 Announce Type: cross Abstract: Autonomous driving is full of tiny social negotiations: a driver presses forward, another yields, a pedestrian fakes toward the curb, or a lane vehicle chooses whether to open a merge gap. Such interactions require inferring hidden intent from behavior under partial observability and then acting safely and efficiently. Existing autonomous-driving language benchmarks mostly focus on perception, visual question answering, or open-loop planning, wh

Self-Driving Negotiator: An interactive, verifiable benchmark for social negotiation and theory of mind under hidden intent

Published June 16, 2026 · Category: Robotics

Overview

arXiv:2606.15139v1 Announce Type: cross Abstract: Autonomous driving is full of tiny social negotiations: a driver presses forward, another yields, a pedestrian fakes toward the curb, or a lane vehicle chooses whether to open a merge gap. Such interactions require inferring hidden intent from behavior under partial observability and then acting safely and efficiently. Existing autonomous-driving language benchmarks mostly focus on perception, visual question answering, or open-loop planning, while existing language-agent negotiation benchmarks typically make the negotiation explicit in text. Self-Driving Negotiator bridges the gap between the two: a text-only, multi-turn, procedurally generated environment for measuring implicit social coordination in driving. Agents generate specific driving actions. Reward and diagnostics are computed from the privileged simulator state, not from the explanation of the model. This report covers task design, reward and anti-gaming invariants, validated scenarios, non-LLM baselines, and a six-model inference leaderboard. Current models are far removed from the scripted expert. The best average success rate across three scenarios is 0.68; contested merge is statistically flat across models; and difficulty tiers separate cue-following from true wait-for-commitment behavior.

Source

Originally published at arxiv.org.

Related Articles

CD
Robos News Newsroom

Robos News covers markets, crypto and commodities for Asia & the Middle East — tier-1 desk research, AI-driven analysis, institutional-grade data. Tip our newsroom: [email protected]

Email the newsroom →
Disclaimer: This article is for informational purposes only and does not constitute investment advice. Data may be delayed up to 15 minutes. Past performance is not indicative of future results. Consult a licensed financial advisor before making investment decisions.

Related Stories

More from News →