[Paper] Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset

Published: February 26, 2026 at 01:40 PM EST
5 min read
Source: arXiv

Overview

The paper introduces the Asta Interaction Dataset, a massive, anonymized log of how researchers actually use AI‑powered literature discovery and question‑answering tools. By analyzing more than 200 K queries and interaction traces from a real‑world retrieval‑augmented generation (RAG) platform, the authors reveal how scientists treat these systems as collaborative partners rather than simple search engines. The findings give developers concrete clues about designing more useful AI research assistants.

Key Contributions

  • Large‑scale, real‑world dataset: >200 K user queries and interaction logs from two deployed AI research tools, released publicly for the community.
  • Query‑intent taxonomy: A fine‑grained classification (e.g., “drafting”, “gap‑identification”, “citation‑verification”) that captures the diverse purposes of AI‑assisted research.
  • Behavioral insights: Empirical evidence that researchers issue longer, more complex queries, treat generated text as persistent artifacts, and navigate citations in non‑linear ways.
  • Experience curve analysis: Demonstrates how query specificity and citation engagement evolve as users become more familiar with the tool.
  • Design recommendations: Concrete guidelines for building AI research assistants that support drafting, iterative refinement, and citation management.
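The query-intent taxonomy can be illustrated with a toy classifier. This is a hypothetical keyword-rule sketch, not the authors' method (they derived the 12 categories from manual annotation plus embedding clustering); only the three category names above come from the paper, and the `INTENT_KEYWORDS` rules are invented for illustration.

```python
# Toy intent classifier over three of the paper's taxonomy labels.
# The keyword rules are illustrative stand-ins for the embedding-based
# clustering the authors actually used.
INTENT_KEYWORDS = {
    "drafting": ["write", "draft", "abstract", "paragraph"],
    "gap-identification": ["gap", "missing", "open problem", "underexplored"],
    "citation-verification": ["cite", "citation", "reference", "verify"],
}

def classify_intent(query: str) -> str:
    """Return the first taxonomy label whose keywords match, else 'other'."""
    q = query.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(kw in q for kw in keywords):
            return intent
    return "other"

print(classify_intent("Write a related-work paragraph on RAG evaluation"))
print(classify_intent("What papers compare method X vs Y on dataset Z?"))
```

A real system would replace the keyword rules with the semantic-embedding clustering described in the methodology, but the interface (query in, taxonomy label out) stays the same.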

Methodology

  1. Data collection – The authors instrumented two production tools (a literature discovery UI and a scientific QA interface) built on an LLM‑backed RAG architecture. All user interactions (queries, clicks, scrolls, citation expansions, and session timestamps) were logged over several months.
  2. Anonymization & preprocessing – Personal identifiers and sensitive content were stripped; queries were tokenized and normalized.
  3. Taxonomy development – A mixed‑methods approach combined manual annotation of a random query sample with clustering of semantic embeddings to derive a 12‑category intent schema.
  4. Quantitative analysis – Metrics such as query length, token diversity, session depth, citation click‑through rate, and “artifact revisitation” frequency were computed. Temporal trends were examined by segmenting users into novice, intermediate, and expert cohorts based on session count.
  5. Statistical validation – Differences across cohorts and tool types were tested with ANOVA and post‑hoc Tukey tests, ensuring results are not artifacts of random variation.
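Steps 4–5 can be sketched in miniature. The function below computes a one-way ANOVA F statistic by hand over synthetic cohort data; the cohort values and the "targeted-query rate" metric are invented for illustration, and the paper's actual pipeline additionally ran post-hoc Tukey tests.

```python
# Minimal one-way ANOVA sketch for comparing a per-user metric across
# experience cohorts. Data and cohort cutoffs are synthetic.
def one_way_anova(groups):
    """Return the F statistic for a list of sample groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: variation of cohort means.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: variation inside each cohort.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Synthetic "targeted-query rate" per user, split by session count.
novice = [0.30, 0.35, 0.28, 0.33]
intermediate = [0.40, 0.42, 0.38, 0.41]
expert = [0.52, 0.55, 0.50, 0.54]
print(round(one_way_anova([novice, intermediate, expert]), 1))
```

A large F (relative to the F distribution with k−1 and n−k degrees of freedom) indicates the cohort means differ more than within-cohort noise would explain, which is the check behind "results are not artifacts of random variation."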

Results & Findings

| Finding | What it means |
| --- | --- |
| Average query length = 12.4 tokens (vs. ~5 tokens in traditional web search) | Researchers ask more detailed, multi‑sentence questions, expecting richer context from the AI. |
| ~38 % of sessions involve “drafting” intents (e.g., asking the model to write an abstract or related‑work paragraph) | The AI is used as a writing collaborator, not just a retrieval engine. |
| Citation click‑through rate = 62 %; 27 % of users revisit the same generated answer across multiple sessions | Generated responses become “sticky” artifacts that users treat as reference material worth revisiting. |
| Experienced users (≥10 sessions) issue 22 % more targeted queries (e.g., “compare method X vs Y on dataset Z”) | Familiarity leads to more precise prompting, though keyword‑style queries persist. |
| Non‑linear navigation: 45 % of sessions jump between answer sections and cited papers, then back to the answer | Users iteratively refine their understanding, using the AI as a hub linking to primary sources. |
| Persistent “gap‑identification” queries: 15 % of all queries ask the model to highlight missing literature or open problems | The AI is leveraged for research planning and hypothesis generation. |
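Two of these metrics are simple to compute once interaction logs exist. The sketch below is a hedged illustration: the record schema and field names (`citation_clicks`, `citations_shown`) are assumptions, not the released dataset's actual format.

```python
# Hypothetical log schema: each record is one query interaction.
# Field names are illustrative, not the Asta dataset's real schema.
sessions = [
    {"query": "compare method X vs Y on dataset Z",
     "citation_clicks": 2, "citations_shown": 3},
    {"query": "open problems in long-context retrieval evaluation",
     "citation_clicks": 0, "citations_shown": 4},
    {"query": "draft a related-work paragraph on RAG",
     "citation_clicks": 1, "citations_shown": 2},
]

# Mean query length in whitespace tokens (the paper reports 12.4).
avg_len = sum(len(s["query"].split()) for s in sessions) / len(sessions)

# Citation click-through rate: fraction of shown citations clicked.
ctr = sum(s["citation_clicks"] for s in sessions) / sum(s["citations_shown"] for s in sessions)

print(round(avg_len, 2), round(ctr, 2))
```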

Practical Implications

  • Design for Drafting: UI should expose easy ways to export, edit, and version‑control AI‑generated text (e.g., markdown export, Git integration).
  • Citation Management Integration: Embed citation metadata directly into the answer UI, with one‑click import into reference managers (Zotero, Mendeley).
  • Session Persistence: Treat each answer as a first‑class artifact—allow bookmarking, tagging, and linking between answers to support the observed non‑linear workflow.
  • Prompt Guidance: Offer dynamic prompt templates that evolve with user expertise, nudging novices toward more targeted queries while still supporting exploratory, keyword‑style searches.
  • Evaluation Benchmarks: The released taxonomy and dataset give developers a realistic testbed for measuring “research‑assistant” performance beyond standard QA metrics (e.g., include citation relevance, draft quality, and user engagement).
  • Privacy‑by‑Design: Since the dataset required thorough anonymization, any production system should adopt similar safeguards when logging researcher interactions.
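The drafting and citation-management recommendations can be sketched together. The `export_answer` helper below is hypothetical (not part of either deployed tool): it renders a generated answer as a Markdown artifact with an appended BibTeX section that reference managers such as Zotero can import.

```python
# Hypothetical export helper: turn a generated answer plus its citation
# metadata into a Markdown document with a BibTeX reference section.
# The answer/citation structure is an assumption for illustration.
def export_answer(answer_md: str, citations: list) -> str:
    bibtex = "\n".join(
        "@article{{{key},\n  title={{{title}}},\n  year={{{year}}}\n}}".format(**c)
        for c in citations
    )
    return f"{answer_md}\n\n## References\n\n{bibtex}\n"

doc = export_answer(
    "RAG systems couple retrieval with generation ...",
    [{"key": "lewis2020rag", "title": "Retrieval-Augmented Generation", "year": 2020}],
)
print(doc)
```

Treating the exported file as the versioned artifact (e.g., committing it to Git) matches the observed behavior of users revisiting generated answers across sessions.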

Limitations & Future Work

  • Domain bias: The data comes from a single RAG platform focused on life‑science literature, so patterns may differ in other fields (e.g., CS, humanities).
  • Self‑selection: Users who opted into the tool may be more tech‑savvy, potentially inflating the prevalence of advanced prompting behaviors.
  • Static analysis: The study captures snapshots of interaction; longitudinal studies over years could reveal deeper learning curves.
  • Future directions suggested by the authors include expanding the dataset to multi‑disciplinary corpora, incorporating eye‑tracking or think‑aloud protocols to better understand cognitive load, and testing adaptive UI components that respond to the identified usage phases (exploration → drafting → citation verification).

Authors

  • Dany Haddad
  • Dan Bareket
  • Joseph Chee Chang
  • Jay DeYoung
  • Jena D. Hwang
  • Uri Katz
  • Mark Polak
  • Sangho Suh
  • Harshit Surana
  • Aryeh Tiktinsky
  • Shriya Atmakuri
  • Jonathan Bragg
  • Mike D’Arcy
  • Sergey Feldman
  • Amal Hassan-Ali
  • Rubén Lozano
  • Bodhisattwa Prasad Majumder
  • Charles McGrady
  • Amanpreet Singh
  • Brooke Vlahos
  • Yoav Goldberg
  • Doug Downey

Paper Information

  • arXiv ID: 2602.23335v1
  • Categories: cs.HC, cs.AI, cs.IR
  • Published: February 26, 2026