[Paper] Investigating Conversational Agents to Support Secondary School Students Learning CSP

Published: April 17, 2026 at 12:22 PM EDT
4 min read
Source: arXiv


Overview

A recent study by Frazier, Damevski, and Pollock investigates how conversational agents—both off‑the‑shelf models like ChatGPT and purpose‑built chatbots—can help high‑school students tackle the AP Computer Science Principles (CSP) curriculum. By testing these agents in real classrooms, the authors shed light on whether AI‑driven dialogue can make the often‑overwhelming search for programming concepts more focused and engaging.

Key Contributions

  • Empirical comparison of general‑purpose generative chatbots vs. custom, fixed‑response bots for CSP learning.
  • In‑situ classroom deployment with 45 students across six sections, providing authentic usage data.
  • Framework for “exploratory search” in educational settings, outlining how students phrase queries, iterate, and refine understanding.
  • Metrics of effectiveness and engagement, including task success rates, time‑on‑task, and self‑reported satisfaction.
  • Design guidelines for building conversational agents that align with secondary‑school programming curricula.

Methodology

Toolset

  • General‑purpose agent: Access to a state‑of‑the‑art generative model (ChatGPT‑style) via a web interface.
  • Custom agent: A rule‑based chatbot pre‑loaded with CSP‑specific FAQs, code snippets, and concept explanations.

Procedure

  1. Students were given a series of typical CSP learning tasks (e.g., “Explain the difference between abstraction and encapsulation,” “Write a simple loop that counts to 10”).
  2. Each student interacted with one of the two agents for a fixed 20‑minute window, then switched to the other agent on a different task set.
  3. Interaction logs captured query phrasing, follow‑up questions, and time spent.
  4. Post‑session surveys measured perceived usefulness, clarity, and overall enjoyment.
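The second sample task above has a very small intended solution; a minimal sketch in Python (the paper does not prescribe a language, so this choice is an assumption) might look like:

```python
# Sample solution to the study task "Write a simple loop that counts to 10".
# Python is assumed here; the paper does not specify a language.
counts = []
for i in range(1, 11):  # range(1, 11) yields 1 through 10 inclusive
    counts.append(i)

print(counts)
```

A task at this level lets the study compare how each bot explains basic iteration, not just whether it can produce working code.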

Analysis

  • Quantitative: success rate (correct answer), number of clarification turns, and average time to solution.
  • Qualitative: thematic coding of open‑ended feedback to surface strengths and pain points of each bot type.

Results & Findings

| Metric | General‑Purpose Bot (ChatGPT) | Custom Fixed‑Response Bot |
| --- | --- | --- |
| Correct answer rate | 78 % | 62 % |
| Avg. turns per query | 3.4 | 2.1 |
| Avg. time to solution | 4.2 min | 5.1 min |
| Student satisfaction (1–5) | 4.2 | 3.6 |
  • Higher accuracy: The generative model produced more correct explanations, especially for open‑ended conceptual questions.
  • More dialogue: Students asked more follow‑up questions to the generative bot, indicating deeper engagement rather than a “one‑shot” answer.
  • Speed vs. depth trade‑off: Fixed‑response bots were quicker for straightforward fact‑lookup (e.g., syntax) but struggled with nuanced reasoning.
  • Engagement boost: Over 80 % of participants reported that talking to a chatbot felt “more interactive” than scrolling through static web pages.

Practical Implications

  1. Supplementary tutoring tool – Schools can integrate a generative chatbot into LMS platforms to give students instant, conversational help without waiting for teacher office hours.
  2. Personalized scaffolding – Because the model can adapt its explanations to a student’s prior queries, it can serve as a low‑cost “personal tutor” for diverse learning styles.
  3. Curriculum‑aware bot design – The study suggests a hybrid approach—pair a robust generative core with a curated knowledge base of CSP‑specific resources—to balance accuracy and relevance.
  4. Reduced search friction – By turning vague “I don’t understand loops” prompts into guided dialogues, developers can lower the cognitive load associated with traditional web search.
  5. Data‑driven content updates – Interaction logs provide teachers insight into which concepts cause the most confusion, informing targeted lesson planning.
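The hybrid approach in point 3 can be sketched as a curated lookup with a generative fallback. Everything below (the `CSP_KB` contents, the `answer` function, the stand-in model) is an illustrative assumption, not the authors' implementation:

```python
# Sketch of the hybrid design the study suggests: serve a curated,
# curriculum-aligned answer when the query matches a known CSP topic,
# and fall back to a generative model otherwise. All names are
# illustrative assumptions, not the authors' implementation.
CSP_KB = {
    "for loop": "A for loop repeats a block of code a fixed number of times.",
    "abstraction": "Abstraction hides detail so you can reason at a higher level.",
}

def answer(query, generative_model):
    q = query.lower()
    for topic, explanation in CSP_KB.items():
        if topic in q:
            return explanation        # curated, curriculum-aligned answer
    return generative_model(query)    # open-ended generative fallback

# Usage with a stand-in model:
print(answer("What is a for loop?", lambda q: "(generated answer)"))
```

The curated path keeps common curriculum questions accurate and on-syllabus, while the fallback preserves the generative model's strength on open-ended conceptual questions.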

Limitations & Future Work

  • Sample size & diversity – The study involved 45 students from a single school district; broader demographics may reveal different usage patterns.
  • Reliance on internet connectivity – Real‑time generative models need stable bandwidth, which can be a barrier in under‑resourced classrooms.
  • Potential for misinformation – While generative bots performed well, they occasionally produced plausible‑but‑incorrect answers, necessitating teacher oversight.
  • Future directions – The authors propose extending the evaluation to longitudinal studies (tracking semester‑long performance), exploring multimodal agents (voice + text), and integrating automated correctness checks to flag erroneous responses in real time.
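The automated correctness checks proposed as future work could, for code-producing tasks, be as simple as executing a bot-suggested answer and comparing its output against the task's expected result. This is purely a sketch under that assumption; the paper does not describe an implementation:

```python
import contextlib
import io

def check_answer(code, expected_output):
    """Run bot-suggested code and compare its stdout to the expected output.

    Illustrative only: a real deployment would need sandboxing and
    timeouts before executing model-generated code.
    """
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
    except Exception:
        return False  # crashing code is flagged as incorrect
    return buf.getvalue().strip() == expected_output.strip()

# Checking a suggested answer to the "count to 10" task:
suggested = "for i in range(1, 11):\n    print(i)"
print(check_answer(suggested, "\n".join(str(i) for i in range(1, 11))))  # True
```

A check like this could run in real time on each code answer and surface a warning to the student (and teacher) when a response does not behave as claimed.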

Authors

  • Matthew Frazier
  • Kostadin Damevski
  • Lori Pollock

Paper Information

  • arXiv ID: 2604.16213v1
  • Categories: cs.HC, cs.SE
  • Published: April 17, 2026
