[Paper] Bug Detective and Quality Coach: Developers' Mental Models of AI-Assisted IDE Tools
Source: arXiv - 2511.21197v1
Overview
The paper Bug Detective and Quality Coach investigates how developers think about AI‑assisted features inside their IDEs—specifically tools that flag bugs and assess code readability. By surfacing developers’ mental models, the authors reveal why trust, control, and adoption of these helpers often hinge on subtle design choices rather than raw technical performance.
Key Contributions
- Empirical insight: Six co‑design workshops with 58 professional developers uncovered two dominant mental models—bug detectives (critical‑issue alerts) and quality coaches (personalized readability guidance).
- Design taxonomy: A set of concrete design principles for Human‑Centered AI in IDEs, balancing disruption vs. support, brevity vs. depth, and automation vs. agency.
- Trust factors: Identification of the three pillars that drive trust for both tool types—clear explanations, appropriate timing, and user‑controlled interaction.
- Methodological blueprint: Demonstrates a scalable workshop‑based approach for eliciting mental models of AI tools from practitioners.
Methodology
The researchers ran six co‑design workshops (≈2 hours each) with developers spanning a range of industries and experience levels. Participants were asked to:
- Sketch how they imagined an ideal AI bug‑detector or readability coach.
- Discuss scenarios where such tools would help or hinder their workflow.
- Prioritize features (e.g., explanation detail, notification timing, configurability).
The sessions were recorded, transcribed, and analyzed using thematic coding to surface recurring concepts and divergent expectations. This qualitative approach kept the focus on mental models—the internal representations developers hold about how the AI works and what it should do.
Results & Findings
| Aspect | Bug‑Detection Tools (“Bug Detectives”) | Readability Tools (“Quality Coaches”) |
|---|---|---|
| Core role | Warn only about critical defects; act as a safety net. | Offer continuous, contextual advice to improve style and maintainability. |
| Desired output | Concise, actionable alerts with confidence scores. | Progressive, personalized suggestions that adapt to the developer’s style. |
| Trust drivers | Transparent reasoning, clear severity ranking, ability to dismiss or snooze alerts. | Explainable rationale, timing that aligns with coding flow, fine‑grained control over suggestion granularity. |
| User control | “Turn on/off” per file/project; set severity thresholds. | Configurable coaching style (e.g., strict vs. lenient), ability to accept/reject suggestions individually. |
| Feedback loop | Immediate feedback on false positives improves trust. | Long‑term metrics (e.g., reduced cyclomatic complexity) reinforce perceived value. |
The authors distilled seven design principles, such as “Explain before you act,” “Let the developer stay in the driver’s seat,” and “Surface only what matters now.” These principles aim to prevent AI from becoming a noisy distraction while still delivering high‑value assistance.
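To make these principles concrete, here is a minimal sketch of how a "bug detective" extension might gate its alerts against user‑set thresholds. All type and function names are hypothetical illustrations, not an API from the paper:

```typescript
// Hypothetical types; the paper prescribes design principles, not an API.
type Severity = "info" | "warning" | "critical";

interface Finding {
  id: string;
  message: string;    // concise, actionable alert text
  rationale: string;  // "Explain before you act": reasoning shown with the alert
  severity: Severity;
  confidence: number; // 0..1, surfaced so the developer can judge the alert
}

interface DetectivePrefs {
  minSeverity: Severity; // user-set threshold: "Surface only what matters now"
  minConfidence: number; // suppress low-confidence guesses
  snoozed: Set<string>;  // alerts the developer has dismissed or snoozed
}

const rank: Record<Severity, number> = { info: 0, warning: 1, critical: 2 };

// Keep the developer in the driver's seat: the tool filters, the user decides.
function visibleFindings(all: Finding[], prefs: DetectivePrefs): Finding[] {
  return all.filter(
    (f) =>
      rank[f.severity] >= rank[prefs.minSeverity] &&
      f.confidence >= prefs.minConfidence &&
      !prefs.snoozed.has(f.id)
  );
}
```

Filtering here happens entirely against the developer's own preferences, so dismissing or snoozing an alert is a local, reversible act of control rather than a change to the underlying model.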
Practical Implications
- IDE vendors can redesign their AI extensions to adopt the detective/coach metaphors, making UI language and visual cues align with developers’ expectations.
- Tool builders should prioritize explainability (e.g., inline rationale, confidence levels) and configurability (severity thresholds, coaching intensity) to boost adoption.
- Team leads can set policies that let developers calibrate AI assistance per project (see the settings sketch after this list), reducing the “one‑size‑fits‑all” friction that often leads to tool abandonment.
- Continuous integration pipelines could integrate the “bug detective” mode to surface only show‑stopper issues, while the “quality coach” could be hooked into code‑review bots that provide style suggestions over time.
- Developer onboarding: New hires can be introduced to AI helpers as mentors rather than gatekeepers, smoothing the learning curve and fostering trust early on.
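The per‑project calibration suggested above might be expressed as a settings schema like the following. The field names and values are illustrative assumptions, not a schema from the paper or any specific IDE:

```typescript
// Hypothetical per-project settings; the paper argues for this kind of
// calibration but does not define a concrete configuration format.
interface AssistantConfig {
  bugDetective: {
    enabled: boolean; // "turn on/off" per file or project
    minSeverity: "info" | "warning" | "critical";
  };
  qualityCoach: {
    enabled: boolean;
    style: "strict" | "lenient";   // configurable coaching intensity
    maxSuggestionsPerFile: number; // keep advice from becoming noise
  };
}

// Example: a team policy that surfaces only show-stopper issues
// while the coach gives gentle, low-volume style feedback.
const projectDefaults: AssistantConfig = {
  bugDetective: { enabled: true, minSeverity: "critical" },
  qualityCoach: { enabled: true, style: "lenient", maxSuggestionsPerFile: 3 },
};
```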
Limitations & Future Work
- Sample bias: All participants were recruited from a limited set of companies and may not represent the full spectrum of developer cultures (e.g., open‑source contributors, junior programmers).
- Workshop scope: The co‑design setting captures idealized expectations; real‑world usage may reveal additional friction points.
- Tool diversity: The study focused on generic bug‑detection and readability features; extending the framework to other AI‑assisted tasks (e.g., test generation, refactoring) remains open.
Future research directions include longitudinal field studies to validate whether the proposed design principles actually improve trust and productivity, and extensions of the mental‑model framework to cover emerging AI capabilities such as code synthesis and automated debugging.
Authors
- Paolo Buono
- Mary Cerullo
- Stefano Cirillo
- Giuseppe Desolda
- Francesco Greco
- Emanuela Guglielmi
- Grazia Margarella
- Giuseppe Polese
- Simone Scalabrino
- Cesare Tucci
Paper Information
- arXiv ID: 2511.21197v1
- Categories: cs.SE, cs.HC
- Published: November 26, 2025
- PDF: https://arxiv.org/pdf/2511.21197v1