[Paper] Learning Factors in AI-Augmented Education: A Comparative Study of Middle and High School Students
Source: arXiv - 2512.21246v1
Overview
A new study by Ebli, Raimondi, and Gabbrielli examines how middle‑school and high‑school students perceive AI‑driven learning tools during programming lessons. By measuring four “learning factors” – experience, clarity, comfort, and motivation – the researchers uncover striking age‑related differences in how these factors interrelate, offering fresh guidance for educators and product teams building AI‑augmented classroom tech.
Key Contributions
- Cross‑age comparative analysis of AI‑mediated learning, filling a gap left by prior work that focused almost exclusively on university settings.
- Identification of four core perception dimensions (experience, clarity, comfort, motivation) that shape students’ overall evaluation of AI tools.
- Empirical evidence of divergent dimensional structures: middle‑schoolers show tightly coupled perceptions, while high‑schoolers treat each factor independently.
- Multimethod quantitative pipeline that blends correlation matrices with text‑mining of open‑ended student feedback, demonstrating a reproducible approach for educational data mining.
- Actionable framework for tailoring AI integration strategies to developmental stages, paving the way for age‑aware adaptive learning systems.
Methodology
- Setting & Participants – Real‑world programming classes in two schools: one cohort of 7th‑8th graders (≈200 students) and one cohort of 10th‑12th graders (≈180 students).
- AI Tool – A conversational coding assistant that offers hints, error explanations, and code suggestions during lab exercises.
- Data Collection – After each session, students completed a short Likert‑scale survey covering the four factors plus a free‑text comment box.
- Analysis Pipeline
- Correlation analysis: Pearson’s r computed between every pair of factors within each age group.
- Text mining: Tokenization, TF‑IDF weighting, and topic modeling (LDA) applied to the open‑ended responses to validate the quantitative patterns.
- Statistical testing: Fisher’s r‑to‑z transformation used to compare correlation strengths across groups.
The approach is deliberately lightweight: no deep neural models, just classic statistics and natural‑language processing that any data‑savvy developer can replicate. The two sketches below illustrate the correlation and text‑mining steps.
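As a concrete illustration of the correlation step, here is a minimal Python sketch. The DataFrame layout and column names (`experience`, `clarity`, `comfort`, `motivation`, `group`) are assumptions for illustration, not the authors' actual schema.

```python
from itertools import combinations

import pandas as pd
from scipy.stats import pearsonr

# The four learning factors measured in the survey.
FACTORS = ["experience", "clarity", "comfort", "motivation"]

def pairwise_correlations(df: pd.DataFrame) -> dict:
    """Pearson r and p-value for every pair of learning factors."""
    return {(a, b): pearsonr(df[a], df[b]) for a, b in combinations(FACTORS, 2)}

# Usage (hypothetical file and column names):
# df = pd.read_csv("survey_responses.csv")
# for group, sub in df.groupby("group"):        # e.g., "middle" vs. "high"
#     for pair, (r, p) in pairwise_correlations(sub).items():
#         print(group, pair, f"r={r:.2f}", f"p={p:.3g}")
```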
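The text‑mining step can be sketched similarly with scikit‑learn. One hedge: LDA is conventionally fit on raw term counts, so this sketch uses TF‑IDF weights only to surface a group's characteristic vocabulary and a separate count matrix for topic modeling; the paper does not specify its exact configuration, and the comments below are invented placeholders.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Placeholder comments standing in for one group's open-ended responses.
comments = [
    "the hints were fun and easy to follow",
    "debug suggestions and syntax highlighting saved me time",
    "error explanations made the exercise less confusing",
]

# Characteristic vocabulary: highest-weighted TF-IDF terms for the group.
tfidf = TfidfVectorizer(stop_words="english")
weights = tfidf.fit_transform(comments)
order = weights.sum(axis=0).A1.argsort()[::-1]
print([tfidf.get_feature_names_out()[i] for i in order[:5]])

# Topic modeling: LDA fit on raw term counts.
counts = CountVectorizer(stop_words="english").fit_transform(comments)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # rows: comments, cols: topic weights
```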
Results & Findings
| Factor Pair | Middle School (r) | High School (r) |
|---|---|---|
| Experience ↔ Clarity | 0.71 (p < 0.001) | 0.12 (ns) |
| Experience ↔ Comfort | 0.68 (p < 0.001) | 0.05 (ns) |
| Experience ↔ Motivation | 0.64 (p < 0.001) | 0.09 (ns) |
| Clarity ↔ Comfort | 0.73 (p < 0.001) | 0.08 (ns) |
| Clarity ↔ Motivation | 0.66 (p < 0.001) | 0.11 (ns) |
| Comfort ↔ Motivation | 0.70 (p < 0.001) | 0.07 (ns) |

Pairwise Pearson correlations by age group (ns = not statistically significant).
- Middle‑schoolers: All six pairwise correlations are strong and statistically significant, indicating a holistic perception—if a student feels comfortable, they also tend to rate clarity, experience, and motivation highly.
- High‑schoolers: Correlations hover around zero, suggesting independent evaluation of each dimension. Text‑mining revealed distinct vocabularies: younger students used generic positive adjectives (“fun”, “easy”), while older students mentioned specific tool features (“debug suggestions”, “syntax highlighting”) tied to individual factors.
These patterns persisted across multiple class sessions, reinforcing the robustness of the age‑related split.
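To make the size of that split concrete, here is the Fisher r‑to‑z comparison from the methodology applied to the Experience ↔ Clarity pair, using the correlations reported above and the approximate cohort sizes (≈200 and ≈180). Since the exact sample sizes are approximate, the resulting z value is illustrative.

```python
from math import atanh, sqrt

from scipy.stats import norm

def compare_correlations(r1: float, n1: int, r2: float, n2: int):
    """Two-sided test that two independent Pearson correlations differ."""
    z1, z2 = atanh(r1), atanh(r2)            # Fisher r-to-z transform
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # SE of the difference
    z = (z1 - z2) / se
    p = 2 * norm.sf(abs(z))                  # two-sided p-value
    return z, p

z, p = compare_correlations(0.71, 200, 0.12, 180)
print(f"z = {z:.2f}, p = {p:.1e}")  # z ≈ 7.4: far beyond chance variation
```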
Practical Implications
- Adaptive UI/UX design – For younger learners, a single “satisfaction” gauge may suffice; for older students, dashboards should expose granular metrics (e.g., separate sliders for clarity vs. motivation) to capture nuanced feedback.
- Personalized tutoring bots – Middle‑school bots can safely assume that boosting one factor (e.g., clearer explanations) will lift overall satisfaction, whereas high‑school bots should target each factor individually (e.g., motivational nudges separate from comfort adjustments).
- Teacher analytics – Educators can use aggregated factor scores to spot specific pain points. A dip in “comfort” among high‑schoolers, for instance, signals a need to streamline the AI’s interaction flow without assuming it will automatically improve motivation (see the aggregation sketch after this list).
- Product road‑mapping – Development teams can prioritize features based on age group: for K‑12 platforms, invest early in holistic onboarding experiences for younger cohorts, then shift to fine‑grained, domain‑specific enhancements for older students.
- Policy & curriculum – School districts can tailor AI‑integration guidelines, recommending broader, confidence‑building activities for middle schools and more autonomous, self‑regulation tools for high schools.
Limitations & Future Work
- Sample scope – The study involved only two schools in a single geographic region; broader cultural contexts may exhibit different factor dynamics.
- Single AI tool – Findings are tied to a conversational coding assistant; other AI modalities (e.g., adaptive quizzes, visual tutors) might produce alternative correlation structures.
- Cross‑sectional design – Longitudinal tracking could reveal how individual students transition from holistic to differentiated perception as they age.
- Future directions – Extending the framework to include affective signals (e.g., facial expression, physiological data) and testing intervention strategies that deliberately manipulate one factor to observe spill‑over effects on the others.
By spotlighting the developmental nuances of AI‑augmented learning, this research equips developers, educators, and policymakers with evidence‑based levers to build more responsive, age‑appropriate educational technologies.
Authors
- Gaia Ebli
- Bianca Raimondi
- Maurizio Gabbrielli
Paper Information
- arXiv ID: 2512.21246v1
- Categories: cs.HC, cs.AI
- Published: December 24, 2025
- PDF: https://arxiv.org/pdf/2512.21246v1