[Paper] How Software Engineering Students Use LLMs to Write Research Papers: An Experience Report

Published: June 3, 2026 at 01:19 PM EDT

Source: arXiv - 2606.05114v1

Overview

A recent experience report investigates how third‑year software‑engineering students employ large language models (LLMs) when writing short research papers for an empirical methods assignment. By requiring students to disclose their LLM usage, the authors were able to capture real‑world practices, challenges, and learning outcomes, offering a rare glimpse into AI‑augmented academic work in a software‑engineering curriculum.

Key Contributions

Empirical data on student LLM usage – 146 disclosure statements were collected and systematically analyzed.
A mixed categorization pipeline – LLM‑assisted initial tagging combined with manual verification to produce a reliable taxonomy of usage patterns.
Identification of common LLM‑supported activities – brainstorming, clarifying methodology, structuring findings, and polishing prose.
Insights into student concerns – misinformation, hallucinations, and the need for verification were repeatedly highlighted.
Pedagogical recommendations – concrete guidelines for integrating reflective LLM use into empirical software‑engineering courses.

Methodology

Course context – The study took place in a third‑year software architecture class where each student had to produce a short research paper using either a rapid review or a gray‑literature review approach.
Reflective disclosure – Students submitted a brief statement describing how they used an LLM (e.g., ChatGPT, Claude) throughout the assignment.
Data collection – 146 statements were gathered over a semester.
Cross‑analysis pipeline
- An LLM first auto‑tagged each statement with preliminary categories (e.g., “idea generation”, “citation checking”).
- Researchers manually inspected and refined the tags, merging overlapping categories and eliminating noise.
Thematic synthesis – The refined tags were grouped into higher‑level themes that describe the role of LLMs in the research‑writing workflow.

Results & Findings

Theme	Typical Student Activity	Notable Observations
Idea & Topic Generation	Prompting the LLM for possible research questions or keywords.	Students valued the speed of brainstorming but noted that suggestions often needed domain‑specific filtering.
Methodology Clarification	Asking the LLM to explain rapid‑review steps, inclusion criteria, or gray‑literature search strategies.	LLM explanations helped novices grasp concepts, yet some students discovered inaccuracies that required manual correction.
Organization & Synthesis	Using the LLM to outline sections, draft tables, or summarize extracted findings.	The AI accelerated structuring, but students reported occasional “hallucinated” data that had to be cross‑checked.
Writing & Polishing	Grammar fixes, rephrasing sentences, and improving readability.	Most students found this the most reliable use case and reported improvements in perceived writing quality.
Verification & Validation	Manually checking facts, citations, and generated code snippets.	A recurring concern: the need for a verification loop to avoid propagating false information.

Overall, students perceived LLMs as productivity boosters for low‑level writing tasks, while still relying heavily on their own expertise for critical analysis and validation.

Practical Implications

Tool‑Enhanced Curriculum Design – Instructors can embed structured LLM reflection activities (e.g., mandatory disclosure statements) to teach responsible AI use while harvesting valuable data for continuous improvement.
Rapid Prototyping of Literature Reviews – Teams can leverage LLMs for initial scoping and outline generation, cutting down the time spent on repetitive drafting.
Quality Assurance Workflows – The study underscores the necessity of a verification step; developers building LLM‑assisted authoring tools should integrate citation checking, source linking, and hallucination detection mechanisms.
Skill Development for Future Engineers – Familiarity with prompt engineering and critical evaluation of AI‑generated content becomes a marketable competency in data‑driven software research and documentation.
Policy & Ethics Guidance – The disclosed concerns provide a baseline for institutional policies on AI‑assisted academic work, balancing innovation with academic integrity.
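
As one possible shape for the verification step mentioned under Quality Assurance Workflows, the sketch below flags citation keys in a draft that nobody has checked yet. Everything here is an assumption for illustration: the bracketed `[Name2020]` key format, the `unverified_citations` helper, and the placeholder keys are hypothetical and do not come from the paper.

```python
import re

# Assumed citation format: a bracketed key made of letters followed by a
# four-digit year, e.g. [KeyA2020]. Real drafts may use another style.
CITATION_PATTERN = re.compile(r"\[([A-Za-z]+\d{4}[a-z]?)\]")


def unverified_citations(draft: str, verified_keys: set[str]) -> list[str]:
    """Return cited keys missing from the manually verified set.

    Keys are reported once each, in order of first appearance.
    """
    seen: set[str] = set()
    missing: list[str] = []
    for key in CITATION_PATTERN.findall(draft):
        if key not in verified_keys and key not in seen:
            seen.add(key)
            missing.append(key)
    return missing


# Placeholder draft and keys, invented for this example.
draft = (
    "Rapid reviews shorten the review process [KeyA2020]. "
    "Gray literature includes blog posts [KeyB2021]. "
    "See also [KeyA2020]."
)
flagged = unverified_citations(draft, verified_keys={"KeyA2020"})
```

A check like this only confirms that a human has signed off on each key. It does not confirm that the cited source exists or supports the claim, so that part of the verification loop stays manual.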

Limitations & Future Work

Single‑course, single‑institution scope – Findings may not generalize across different curricula, cultural contexts, or experience levels.
Self‑reported data – Students’ disclosures could be incomplete or biased toward socially desirable usage.
LLM version dependency – The study was conducted with specific LLMs available at the time; rapid model evolution may change usage patterns.

Future research directions suggested by the authors include longitudinal studies across multiple courses, automated detection of AI‑generated text in student submissions, and the design of scaffolded prompts that explicitly guide students toward verifiable, high‑quality outputs.

Authors

Ronnie de Souza Santos
Maria Teresa Baldassarre
Cleyton Magalhaes
Italo Santos

Paper Information

arXiv ID: 2606.05114v1
Categories: cs.SE
Published: June 3, 2026
