[Paper] Contact-Anchored Policies: Contact Conditioning Creates Strong Robot Utility Models

Published: February 9, 2026 at 01:58 PM EST

Source: arXiv - 2602.09017v1

Overview

The paper introduces Contact‑Anchored Policies (CAP), a way to teach robots to manipulate objects by conditioning policies on where the robot makes contact rather than on abstract language commands. By treating each contact point as a modular “utility model,” the authors can rapidly prototype and debug in a lightweight simulator (EgoGym) before deploying to real hardware, achieving strong generalization from only 23 hours of demonstrations.

Key Contributions

Contact‑based conditioning: Replaces language prompts with explicit 3‑D contact points, giving the robot a concrete physical reference for action planning.
Modular utility‑model library: Decomposes a monolithic policy into reusable sub‑models that predict the utility of a contact configuration, enabling easier debugging and transfer.
Real‑to‑sim iteration loop: Introduces EgoGym, a fast‑to‑run simulation benchmark that mirrors the real‑world setup, allowing rapid identification of failure modes and data augmentation.
Data efficiency: Demonstrates strong performance on three core manipulation skills using only 23 h of human‑provided demonstrations.
Zero‑shot superiority: Outperforms large vision‑language‑action models (VLAs) by ≈56 % in relative success rate in zero‑shot tests on unseen environments and robot embodiments.
Open‑source release: All code, simulation assets, hardware designs, and datasets will be publicly available, lowering the barrier for reproducible robot learning research.

Methodology

Contact Representation:
- Each demonstration is annotated with a set of 3‑D points where the robot’s end‑effector (or other links) touches the environment or object.
- These points are fed to a neural utility model that predicts the expected “task success” for that contact configuration.
Utility Model Library:
- Instead of a single end‑to‑end policy, CAP builds a collection of lightweight models (one per skill or contact primitive).
- At runtime, the system selects and composes the relevant utilities to generate a full action sequence.
EgoGym Simulation Loop:
- A stripped‑down physics simulator that mirrors the real robot’s kinematics and sensor suite.
- Researchers run large‑scale sweeps of contact configurations, automatically flagging those that lead to failures (e.g., slippage, unreachable poses).
- Failure cases are fed back into the data collection pipeline, either by augmenting the simulation or by guiding additional real‑world demos.
Training & Deployment:
- The utility models are trained with supervised learning on the 23 h of demonstrations, using a simple binary success label.
- During deployment, a planner samples candidate contacts, queries the utility models, and executes the highest‑scoring plan on the physical robot.
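
The deployment step described above (sample candidate contacts, score them, execute the best one) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `sample_contacts`, `toy_utility`, and `select_best_contact` are hypothetical names, the workspace bounds are made up, and the hand-written scoring function merely stands in for a learned utility model.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: one 3-D contact point (x, y, z) in metres.
Contact = Tuple[float, ...]


def sample_contacts(n: int, bounds: Sequence[Tuple[float, float]],
                    rng: random.Random) -> List[Contact]:
    """Uniformly sample n candidate contact points inside an axis-aligned box."""
    return [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(n)]


def toy_utility(contact: Contact,
                target: Contact = (0.4, 0.0, 0.1)) -> float:
    """Stand-in for a learned utility model (assumption, not from the paper).

    Returns a score in (0, 1] that decays with distance from a fixed target.
    """
    return math.exp(-10.0 * math.dist(contact, target))


def select_best_contact(candidates: List[Contact],
                        utility: Callable[[Contact], float]) -> Contact:
    """Return the candidate with the highest predicted utility."""
    if not candidates:
        raise ValueError("no candidate contacts to score")
    return max(candidates, key=utility)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
workspace = [(0.2, 0.6), (-0.3, 0.3), (0.0, 0.3)]  # made-up x, y, z bounds
candidates = sample_contacts(256, workspace, rng)
best = select_best_contact(candidates, toy_utility)
print("best contact:", best, "score:", round(toy_utility(best), 3))
```

In a real system the scoring function would be the trained utility model and the chosen contact would be handed to a motion planner; both are outside the scope of this sketch.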

Results & Findings

Skill	Zero‑shot success (CAP)	Zero‑shot success (state‑of‑the‑art VLA)	Relative gain
Pick‑and‑Place	78 %	50 %	+56 %
Drawer Opening	71 %	45 %	+58 %
Tool Use (lever)	64 %	41 %	+56 %
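
The "Relative gain" column is consistent with computing (CAP − VLA) / VLA from the two success-rate columns. The short check below assumes that formula (the summary does not state it explicitly) and also prints the absolute difference in percentage points for comparison; `relative_gain` is a helper name introduced here.

```python
def relative_gain(cap: float, baseline: float) -> float:
    """Relative improvement of cap over baseline, in percent."""
    return (cap - baseline) / baseline * 100.0


# Success rates (percent) copied from the table above: (CAP, VLA).
results = {
    "Pick-and-Place": (78, 50),
    "Drawer Opening": (71, 45),
    "Tool Use (lever)": (64, 41),
}

for skill, (cap, vla) in results.items():
    gain = relative_gain(cap, vla)
    print(f"{skill}: +{gain:.0f} % relative, +{cap - vla} points absolute")
```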

Generalization: CAP transferred to new robot arms (different kinematic chains) and novel tabletop layouts without any fine‑tuning.
Sample Efficiency: Language‑conditioned baselines would need >200 h of data to reach the same performance level, versus 23 h for CAP.
Simulation‑Real Gap: The EgoGym loop reduced the sim‑to‑real discrepancy to <5 % in success rates, a dramatic improvement over naïve sim‑only training.

Practical Implications

Faster Prototyping: Engineers can iterate on manipulation pipelines in seconds using EgoGym, dramatically cutting down hardware testing cycles.
Robust Deployment: By anchoring policies to physical contacts, robots become less prone to misinterpretations of ambiguous language commands, leading to safer operation in unstructured environments (e.g., warehouses, homes).
Modular Skill Libraries: Companies can build a catalog of reusable contact utilities (grasp, push, slide) that can be mixed‑and‑matched for new tasks, reducing the need for task‑specific retraining.
Lower Data Costs: Small‑scale data collection (a few dozen hours) is sufficient, making robot learning feasible for startups and research labs without massive data‑labeling budgets.
Hardware‑agnostic Solutions: Because contact points are expressed in world coordinates, the same utility models can be deployed on different robot platforms with minimal calibration.
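
To illustrate the hardware-agnostic point, the sketch below converts one world-frame contact point into the base frame of two differently placed robots. It is only an assumption about what the "minimal calibration" could look like: each robot's base pose is taken to be a known position plus a yaw angle, and `world_to_base` is a hypothetical helper, not part of the released code.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def world_to_base(p_world: Point, base_xyz: Point, base_yaw: float) -> Point:
    """Express a world-frame point in a robot's base frame.

    Assumes the base pose is a translation plus a rotation about the world
    z-axis (yaw, radians). Computes R(yaw)^T @ (p_world - base_xyz).
    """
    dx, dy, dz = (p - b for p, b in zip(p_world, base_xyz))
    c, s = math.cos(base_yaw), math.sin(base_yaw)
    return (c * dx + s * dy, -s * dx + c * dy, dz)


contact = (1.0, 0.5, 0.2)  # made-up contact point in world coordinates

# Robot A sits at the world origin; robot B is shifted and rotated 90 degrees.
in_robot_a = world_to_base(contact, (0.0, 0.0, 0.0), 0.0)
in_robot_b = world_to_base(contact, (1.0, 0.0, 0.0), math.pi / 2)
print("robot A frame:", in_robot_a)
print("robot B frame:", in_robot_b)
```

The same world-frame contact yields different base-frame targets for each robot, so only the base pose needs to change per platform under these assumptions.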

Limitations & Future Work

Contact Annotation Overhead: The current pipeline still requires manual labeling of contact points in demonstrations, which may not scale to highly complex tasks.
Limited Skill Set: The study focuses on three fundamental manipulation primitives; extending CAP to high‑dimensional tasks like assembly or deformable‑object handling remains open.
Simulation Fidelity: While EgoGym is lightweight, it abstracts away fine‑grained dynamics (e.g., friction variations) that could affect performance on highly tactile tasks.
Real‑World Perception: The approach assumes accurate 3‑D perception of contact locations; noisy depth sensors could degrade utility predictions.

Future research directions include automated contact extraction from raw video, expanding the utility library to cover a broader taxonomy of contacts, and integrating tactile feedback to refine contact‑conditioned policies on‑the‑fly.

Authors

Zichen Jeff Cui
Omar Rayyan
Haritheja Etukuru
Bowen Tan
Zavier Andrianarivo
Zicheng Teng
Yihang Zhou
Krish Mehta
Nicholas Wojno
Kevin Yuanbo Wu
Manan H Anjaria
Ziyuan Wu
Manrong Mao
Guangxun Zhang
Binit Shah
Yejin Kim
Soumith Chintala
Lerrel Pinto
Nur Muhammad Mahi Shafiullah

Paper Information

arXiv ID: 2602.09017v1
Categories: cs.RO, cs.LG
Published: February 9, 2026
