[Paper] Unsupervised Skill Discovery for Agentic Data Analysis

Published: June 4, 2026 at 01:20 PM EDT

Source: arXiv - 2606.06416v1

Overview

Inference-time skill augmentation provides a lightweight way to improve data‑analytic agents by injecting reusable procedural knowledge without updating model parameters. However, discovering effective skills for data analysis remains challenging, as reliable supervision is expensive and success criteria vary across analytical formats. This raises the key question of how to discover reusable data‑analysis skills from unlabeled exploration alone.

We propose DataCOPE, an unsupervised verifier‑guided skill discovery framework for data‑analytic agents. DataCOPE derives verifier signals from the exploration trajectories and uses them to characterize relative quality or agreement among trajectories. It iteratively coordinates:

a Data‑Analytic Agent for trajectory generation,
an Unsupervised Verifier for signal extraction, and
a Skill Manager for contrastive skill distillation.
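The coordination among these three components can be pictured as a simple loop. The sketch below is only an illustration under stated assumptions, not the authors' implementation: `discover_skills`, `run_agent`, `verify`, and `distill` are hypothetical names, trajectories are reduced to plain strings, and "contrastive" is approximated by pairing the best-scored and worst-scored trajectory for each task.

```python
from typing import Callable, List

Trajectory = str  # assumption: a real trajectory would hold actions and outputs


def discover_skills(
    tasks: List[str],
    run_agent: Callable[[str, List[str]], List[Trajectory]],
    verify: Callable[[List[Trajectory]], List[float]],
    distill: Callable[[Trajectory, Trajectory], str],
    rounds: int = 2,
) -> List[str]:
    """Hypothetical loop: generate, score without labels, distill by contrast."""
    skills: List[str] = []
    for _ in range(rounds):
        for task in tasks:
            # Data-Analytic Agent: explore the task with the current skills.
            trajectories = run_agent(task, skills)
            # Unsupervised Verifier: one label-free score per trajectory.
            scores = verify(trajectories)
            if len(trajectories) < 2 or max(scores) == min(scores):
                continue  # no quality contrast to learn from
            ranked = sorted(zip(scores, trajectories), key=lambda pair: pair[0])
            worst, best = ranked[0][1], ranked[-1][1]
            # Skill Manager: turn the best-vs-worst contrast into a skill.
            skill = distill(best, worst)
            if skill not in skills:
                skills.append(skill)
    return skills
```

In this sketch the skills discovered in earlier rounds are passed back to the agent, which is one plausible reading of "iteratively coordinates"; the paper itself should be consulted for how skills are actually stored and updated.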

For report‑style analysis, we instantiate the verifier as an Adaptive Checklist Verifier that derives task‑specific criteria, scores reports by verifiable coverage, and iteratively refines the checklist. For reasoning‑style analysis, we instantiate it as an Answer Agreement Verifier that groups trajectories by answer agreement and uses self‑consistency as an auxiliary signal.
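To make the two verifier signals concrete, here is a minimal sketch under loud assumptions. `checklist_coverage` and `group_by_answer` are hypothetical helpers, not functions from the paper: coverage is approximated by naive substring matching (the paper's verifier presumably judges criteria more robustly and also refines the checklist, which is omitted here), and answer agreement is approximated by grouping on a lowercased, whitespace-stripped answer string.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def checklist_coverage(report: str, checklist: List[str]) -> float:
    """Fraction of checklist items found in the report (naive substring match)."""
    if not checklist:
        return 0.0
    text = report.lower()
    hits = sum(1 for item in checklist if item.lower() in text)
    return hits / len(checklist)


def group_by_answer(answers: List[str]) -> Tuple[Dict[str, List[int]], str, float]:
    """Group trajectory indices by normalized final answer.

    Returns the groups, the majority answer, and the self-consistency rate
    (the share of trajectories that agree with the majority answer).
    """
    if not answers:
        raise ValueError("need at least one answer")
    groups: Dict[str, List[int]] = defaultdict(list)
    for idx, answer in enumerate(answers):
        groups[answer.strip().lower()].append(idx)
    # Ties resolve to the answer that appeared first.
    majority = max(groups, key=lambda a: len(groups[a]))
    rate = len(groups[majority]) / len(answers)
    return dict(groups), majority, rate
```

For example, `group_by_answer(["42", " 42 ", "41", "42"])` places trajectories 0, 1, and 3 in one group and trajectory 2 in another, with a self-consistency rate of 0.75. Either signal could then serve as the per-trajectory score that separates stronger from weaker trajectories.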

We evaluate DataCOPE on report‑style analysis from Deep Data Research and reasoning‑style analysis from DABStep. Across both settings, DataCOPE consistently improves held‑out performance over baselines. Averaged across four model settings, DataCOPE improves the mean score by 9.71% on report‑style tasks and 32.30% on reasoning‑style tasks.

Key Contributions

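DataCOPE, an unsupervised, verifier‑guided skill discovery framework that coordinates a Data‑Analytic Agent, an Unsupervised Verifier, and a Skill Manager.
An Adaptive Checklist Verifier for report‑style analysis, which derives task‑specific criteria, scores reports by verifiable coverage, and iteratively refines the checklist.
An Answer Agreement Verifier for reasoning‑style analysis, which groups trajectories by answer agreement and uses self‑consistency as an auxiliary signal.
An evaluation on Deep Data Research and DABStep showing mean‑score improvements of 9.71% on report‑style tasks and 32.30% on reasoning‑style tasks, averaged across four model settings.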

Methodology

Please refer to the full paper for detailed methodology.

Practical Implications

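Because skills are injected at inference time, data‑analytic agents can be improved without updating model parameters. Because the skills are discovered from unlabeled exploration, the approach applies where reliable supervision is expensive or where success criteria vary across analytical formats.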

Authors

Zhisong Qiu
Kangqi Song
Shengwei Tang
Shuofei Qiao
Lei Liang
Huajun Chen
Shumin Deng

Paper Information

arXiv ID: 2606.06416v1
Categories: cs.AI, cs.CL, cs.LG, cs.MA
Published: June 4, 2026
