[Paper] Auto-Generating Personas from User Reviews in VR App Stores

Published: March 5, 2026 at 04:25 AM EST
Source: arXiv - 2603.04985v1

Overview

The paper presents a system that automatically generates personas by mining user reviews from VR app stores, producing realistic user archetypes with a focus on accessibility needs. After integrating these personas into a university VR development course, the authors report that students develop empathy more quickly and uncover hidden accessibility requirements, an area that is traditionally under‑addressed in VR design.

Key Contributions

  • Automated persona generation pipeline that extracts, clusters, and summarizes VR app store reviews into accessibility‑focused user profiles.
  • Empirical evaluation in a VR software‑engineering course, demonstrating measurable gains in students’ ability to identify and discuss accessibility requirements.
  • Design guidelines for incorporating generated personas into VR development workflows and classroom settings.
  • Open‑source artifacts (code, datasets, and persona templates) released for replication and extension by researchers and practitioners.

Methodology

  1. Data Collection – The team scraped public reviews from major VR app stores (e.g., SteamVR, Oculus Store), focusing on comments that mention usability, comfort, motion sickness, and other accessibility concerns.
  2. Pre‑processing & Filtering – Natural‑language processing (NLP) techniques (tokenization, stop‑word removal, lemmatization) were applied, followed by a rule‑based filter to keep only reviews with accessibility‑related keywords.
  3. Clustering – Reviews were represented as TF‑IDF vectors and grouped with the DBSCAN algorithm, so that each resulting cluster represents a potential user segment.
  4. Persona Synthesis – For each cluster, the pipeline extracted salient attributes (e.g., age, disability, hardware constraints) and generated a narrative persona (name, background, goals, pain points) via a template‑driven language model.
  5. Course Integration – The generated personas were handed to VR development teams (students) who used them during requirement‑elicitation workshops and design reviews.
  6. Evaluation – Pre‑ and post‑workshop surveys, plus qualitative analysis of design artifacts, measured empathy levels, the number of accessibility requirements captured, and perceived usefulness of the personas.
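
Steps 2 and 3 above can be illustrated with a minimal, standard‑library sketch. It is not the authors' implementation: the keyword list, stop‑word list, sample reviews, and the eps and min_samples values are all assumptions chosen for illustration, lemmatization is skipped, and the TF‑IDF weighting uses a common smoothed variant that the paper may or may not use.

```python
import math
import re
from collections import Counter

# Hypothetical keyword and stop-word lists; the paper's actual lists are not given here.
ACCESSIBILITY_KEYWORDS = {"motion", "sickness", "comfort", "seated", "reach", "wheelchair"}
STOP_WORDS = {"the", "a", "an", "i", "is", "it", "and", "to", "of", "in", "my", "me", "from"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words (no lemmatization)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def tfidf_vectors(docs):
    """Return L2-normalized TF-IDF vectors (as dicts) using a smoothed IDF."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine_distance(u, v):
    """1 - cosine similarity; inputs are already unit length."""
    return 1.0 - sum(w * v.get(t, 0.0) for t, w in u.items())

def dbscan(vectors, eps, min_samples):
    """Plain DBSCAN over cosine distance; label -1 marks noise."""
    labels = [None] * len(vectors)  # None means "not visited yet"

    def neighbors(i):
        return [j for j in range(len(vectors))
                if cosine_distance(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(len(vectors)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            expansion = neighbors(j)
            if len(expansion) >= min_samples:  # j is a core point: keep growing
                queue.extend(expansion)
    return labels

# Made-up sample reviews, for illustration only.
reviews = [
    "Smooth locomotion gives me motion sickness after ten minutes",
    "Terrible motion sickness from the smooth locomotion",
    "Cannot reach the menu while seated in my wheelchair",
    "Menu is out of reach when playing seated in a wheelchair",
    "Great graphics and a fun story",
]

tokenized = [tokenize(r) for r in reviews]
# Rule-based filter: keep reviews containing at least one accessibility keyword.
kept = [doc for doc in tokenized if any(t in ACCESSIBILITY_KEYWORDS for t in doc)]
labels = dbscan(tfidf_vectors(kept), eps=0.6, min_samples=2)
print(labels)
```

On this toy input the filter drops the last review, and the remaining four fall into two clusters (motion sickness and seated reach). On real review data, eps and min_samples would need tuning, and a library implementation would likely be preferable to this hand‑rolled one.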

Results & Findings

  • Empathy boost: Students reported a 34 % increase in self‑assessed empathy toward users with accessibility needs after using the auto‑generated personas.
  • Requirement coverage: Teams identified 2.7× more accessibility‑related requirements compared with a control group that used textbook personas.
  • Time efficiency: Persona creation time dropped from an average of 3–4 hours (manual) to under 15 minutes with the automated pipeline.
  • Positive feedback: 87 % of participants found the personas “realistic” and “actionable,” citing concrete hardware constraints (e.g., limited controller reach) that they had previously overlooked.

Practical Implications

  • Rapid onboarding: VR startups can generate user archetypes from existing app store data in minutes rather than hours, accelerating early‑stage accessibility audits even without dedicated UX researchers.
  • Continuous improvement: As new reviews flow in, the system can refresh personas, keeping design teams aligned with evolving user needs.
  • Tool integration: The open‑source pipeline can be embedded into CI/CD pipelines or design‑system tooling (e.g., Figma plugins) to surface accessibility insights during sprint planning.
  • Education & training: Coding bootcamps and corporate VR training programs can adopt the persona generator to teach empathy‑driven design without extensive manual effort.
  • Regulatory compliance: By surfacing latent accessibility issues early, teams can better meet standards such as WCAG‑VR extensions or platform‑specific accessibility guidelines (e.g., Oculus Accessibility Toolkit).
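
The "continuous improvement" point can be sketched as a small refresh step that rebuilds a persona whenever new reviews land in its cluster. This is only an illustration under stated assumptions: the Persona fields, the TEMPLATE string, the persona name, the keyword tuple, and the sample reviews are hypothetical, and simple keyword counting stands in for the paper's template‑driven language model.

```python
import re
from collections import Counter
from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    pain_points: list   # most frequently mentioned keywords, most common first
    review_count: int

# Hypothetical narrative template; the paper's released templates may differ.
TEMPLATE = "{name} is a VR user whose reviews most often mention: {points} (based on {n} reviews)."

def build_persona(name, reviews, keywords, top_k=2):
    """Count how many reviews mention each keyword and keep the top_k."""
    counts = Counter()
    for review in reviews:
        words = set(re.findall(r"[a-z]+", review.lower()))
        counts.update(k for k in keywords if k in words)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return Persona(name, [k for k, _ in ranked[:top_k]], len(reviews))

def render(persona):
    """Fill the narrative template from the persona's fields."""
    return TEMPLATE.format(name=persona.name,
                           points=", ".join(persona.pain_points),
                           n=persona.review_count)

# Made-up cluster contents and keyword list, for illustration only.
keywords = ("sickness", "comfort", "motion", "reach")
cluster = [
    "Motion sickness after smooth locomotion",
    "Sickness again, comfort options missing",
]
persona = build_persona("Alex", cluster, keywords)

# A new review arrives: rebuild the persona from the enlarged cluster.
cluster.append("Need more comfort settings, the comfort vignette helps")
refreshed = build_persona("Alex", cluster, keywords)
print(render(refreshed))
```

A scheduled job or CI step could run such a refresh after each review scrape and flag personas whose top pain points changed, which is one way the refreshed personas might be surfaced during sprint planning.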

Limitations & Future Work

  • Review bias: The system relies on publicly posted reviews, which may under‑represent certain disability groups (e.g., users with severe visual impairments who are less likely to leave text reviews).
  • Language scope: The current implementation processes only English reviews; extending to multilingual corpora is needed for global VR markets.
  • Granularity of personas: While the generated personas capture high‑level traits, they may miss nuanced contextual factors (e.g., cultural preferences) that affect accessibility.
  • Scalability to other domains: Future work will explore applying the pipeline to AR, mixed reality, and non‑immersive platforms, as well as integrating richer multimodal signals (e.g., video demos, telemetry data).

By automating the creation of accessibility‑focused personas directly from the voice of VR users, this research offers a pragmatic bridge between user‑centered design theory and the fast‑paced realities of VR development. Developers looking to embed empathy and compliance into their pipelines now have a concrete, open‑source tool to get started.

Authors

  • Yi Wang
  • Kexin Cheng
  • Xiao Liu
  • Chetan Arora
  • John Grundy
  • Thuong Hoang
  • Henry Been-Lirn Duh

Paper Information

  • arXiv ID: 2603.04985v1
  • Categories: cs.HC, cs.SE
  • Published: March 5, 2026