[Paper] Real-Time Multimodal Data Collection Using Smartwatches and Its Visualization in Education

Published: December 2, 2025 at 06:12 AM EST
4 min read
Source: arXiv - 2512.02651v1

Overview

The paper introduces Watch‑DMLT, a real‑time data‑capture app for Fitbit Sense 2 smartwatches, and ViSeDOPS, a web‑based dashboard that visualizes the synchronized multimodal streams collected during classroom activities. By combining physiological, motion, gaze, video, and annotation data from up to 16 devices simultaneously, the authors demonstrate a scalable pipeline for Multimodal Learning Analytics (MLA) that can be deployed in everyday educational settings.

Key Contributions

  • Watch‑DMLT: Open‑source Android app that streams heart‑rate, accelerometer, gyroscope, and optional gaze data from multiple smartwatches to a central server with sub‑second latency.
  • ViSeDOPS: Interactive visualization suite (timeline, heat‑maps, video overlay) that aligns multimodal streams with teacher‑provided annotations, enabling rapid exploratory analysis.
  • End‑to‑end deployment: Field trial with 65 university students (up to 16 concurrent watches) during oral presentations, proving the system’s robustness in a real classroom.
  • Data schema & synchronization protocol: A lightweight JSON‑based format and NTP‑based clock alignment that keep all streams temporally coherent without heavy infrastructure (a hypothetical packet shape is sketched after this list).
  • Open resources: Code, documentation, and a sample dataset released under an MIT‑compatible license to encourage replication and extension.
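
The schema itself is not reproduced in this summary, but the description above (JSON packets, per-device streams, NTP-aligned timestamps) suggests a shape roughly like the sketch below. The field names, units, and optionality are illustrative assumptions, not the authors' published format.

```typescript
// Hypothetical Watch-DMLT packet shape; field names and units are assumptions,
// not the schema released with the paper.
interface SensorSample {
  t: number;                        // NTP-aligned Unix timestamp in milliseconds
  hr?: number;                      // heart rate (bpm) from the PPG sensor, if present
  acc?: [number, number, number];   // 3-axis accelerometer reading
  gyro?: [number, number, number];  // 3-axis gyroscope reading
  gaze?: [number, number];          // gaze coordinates, if an external eye-tracker is paired
}

interface WatchPacket {
  deviceId: string;        // anonymized watch identifier
  seq: number;             // sequence number, useful for measuring packet loss
  samples: SensorSample[]; // readings batched over a ~200 ms window
}
```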

Methodology

  1. Hardware & Sensors – Participants wore Fitbit Sense 2 devices that expose heart‑rate (PPG), 3‑axis accelerometer and gyroscope readings, and, when paired with an external eye‑tracker, gaze coordinates.
  2. Software Stack
    • Watch‑DMLT runs on the watch, batches sensor readings into 200 ms packets, and pushes them over Bluetooth to a companion Android phone.
    • The phone forwards packets via WebSocket to a cloud‑hosted Node.js server that timestamps each packet using NTP‑synchronized clocks (a minimal relay sketch follows this list).
  3. Annotation Layer – Instructors use a simple web form to tag events (e.g., “question asked”, “feedback given”). These timestamps are merged with the sensor streams on the server (see the annotation‑merge sketch after this list).
  4. Visualization – ViSeDOPS consumes the unified JSON log, rendering synchronized timelines, per‑student heat‑maps of motion intensity (a possible aggregation is sketched after this list), and a split‑screen view that overlays video of the presentation with live physiological traces.
  5. Evaluation – The system’s latency, packet loss, and battery impact were measured across multiple sessions; qualitative feedback was gathered from students and the instructor about usability and insightfulness.
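
To make step 2 of the methodology concrete, the sketch below shows one way a companion device could buffer incoming readings and flush them as ~200 ms packets over a WebSocket, using the packet shape sketched earlier. It is written as Node.js/TypeScript with the `ws` package purely for brevity; the real companion app is an Android application, and the endpoint URL, device ID, and buffering policy here are assumptions.

```typescript
// Minimal relay sketch (an assumption, not the authors' code): batch sensor
// samples into ~200 ms packets and forward them over a WebSocket.
// SensorSample and WatchPacket refer to the hypothetical types sketched above.
import WebSocket from "ws";

const socket = new WebSocket("wss://example-server.invalid/ingest"); // placeholder endpoint

let buffer: SensorSample[] = [];
let seq = 0;

// Called for every reading received from the watch over Bluetooth.
function onSample(sample: SensorSample): void {
  buffer.push(sample);
}

// Flush the buffer as one packet every 200 ms.
setInterval(() => {
  if (buffer.length === 0 || socket.readyState !== WebSocket.OPEN) return;
  const packet: WatchPacket = { deviceId: "watch-01", seq: seq++, samples: buffer };
  socket.send(JSON.stringify(packet));
  buffer = [];
}, 200);
```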
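
Step 3's merge of instructor annotations with the sensor streams is easy to picture once both sides carry NTP-aligned timestamps: each annotation can be attached to the packet whose time window contains it. The shapes and the 200 ms window below are hypothetical.

```typescript
// Hypothetical server-side merge of instructor annotations into the sensor timeline.
interface Annotation {
  t: number;     // when the instructor tagged the event (NTP-aligned ms)
  label: string; // e.g. "question asked", "feedback given"
}

interface TimelineEntry {
  t: number;
  deviceId: string;
  samples: unknown[];    // raw sensor samples in this packet
  annotations: string[]; // labels whose timestamps fall inside this packet's window
}

function mergeAnnotations(
  packets: { t: number; deviceId: string; samples: unknown[] }[],
  annotations: Annotation[],
  windowMs = 200 // each packet covers ~200 ms
): TimelineEntry[] {
  return packets.map((p) => ({
    ...p,
    annotations: annotations
      .filter((a) => a.t >= p.t && a.t < p.t + windowMs)
      .map((a) => a.label),
  }));
}
```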
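
For the per-student motion heat-maps mentioned in step 4, one plausible statistic is the mean accelerometer magnitude per fixed time bin. The paper does not specify the exact aggregation ViSeDOPS uses, so the bin size and statistic below are assumptions.

```typescript
// Sketch of a possible heat-map statistic: mean accelerometer magnitude per time bin.
function motionIntensity(
  samples: { t: number; acc: [number, number, number] }[],
  binMs = 5000 // 5-second bins, an arbitrary choice for illustration
): Map<number, number> {
  const sums = new Map<number, { total: number; count: number }>();
  for (const s of samples) {
    const bin = Math.floor(s.t / binMs) * binMs;
    const mag = Math.hypot(...s.acc); // magnitude of the 3-axis reading
    const entry = sums.get(bin) ?? { total: 0, count: 0 };
    entry.total += mag;
    entry.count += 1;
    sums.set(bin, entry);
  }
  const intensity = new Map<number, number>();
  for (const [bin, { total, count }] of sums) intensity.set(bin, total / count);
  return intensity;
}
```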

Results & Findings

  • Low latency & high fidelity – Average end‑to‑end delay was ≈ 350 ms, with < 2 % packet loss even when 16 watches streamed concurrently.
  • Battery endurance – A single charge (nominally 48 hours) sustained continuous data capture for a full 2‑hour class, confirming feasibility for typical lecture lengths.
  • Insight generation – Visual analyses revealed clear physiological signatures (e.g., heart‑rate spikes, increased motion) aligned with moments of audience questioning and presenter stress.
  • Scalability – The server handled up to 200 Hz aggregate sampling rates without degradation, suggesting the pipeline can support larger cohorts or richer sensor suites.
  • User acceptance – 87 % of students reported the smartwatch was “non‑intrusive,” and the instructor highlighted the dashboard’s ability to pinpoint engagement bottlenecks in real time.

Practical Implications

  • Real‑time feedback loops – Educators can monitor class‑wide affective states live, enabling adaptive interventions (e.g., pausing for clarification when stress spikes).
  • Automated analytics pipelines – Developers can plug the open‑source Watch‑DMLT into existing Learning Management Systems (LMS) to enrich analytics dashboards with physiological context.
  • Beyond education – The same architecture applies to corporate training, remote workshops, or any scenario where multimodal human‑computer interaction needs to be quantified at scale.
  • Rapid prototyping – Because the system relies on commodity smartwatches and standard web technologies, teams can prototype new MLA studies without custom hardware or costly data‑center setups.
  • Privacy‑by‑design – Data is anonymized on‑device before transmission (a hashing sketch follows this list), and the open schema makes it straightforward to integrate consent management tools.
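
The paper states that data is anonymized on-device before transmission but the mechanism is not detailed in this summary; one common approach, shown purely as an assumption-laden sketch, is to replace raw participant identifiers with salted hashes before any packet leaves the device.

```typescript
// Hypothetical anonymization step: salted SHA-256 of the participant ID.
// The specific scheme is an assumption, not the method described in the paper.
import { createHash } from "crypto";

function anonymizeId(participantId: string, studySalt: string): string {
  return createHash("sha256")
    .update(studySalt + participantId)
    .digest("hex")
    .slice(0, 16); // shortened pseudonym used as the packet's deviceId
}

// Example: "student-042" never leaves the device; only the pseudonym does.
const pseudonym = anonymizeId("student-042", "per-study-random-salt");
```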

Limitations & Future Work

  • Sensor diversity – The current implementation is tied to Fitbit Sense 2; extending to other wearables (Apple Watch, Garmin) will require additional SDK wrappers.
  • Gaze capture – Accurate eye‑tracking was only demonstrated with an external device; integrating on‑watch cameras or low‑cost eye‑trackers remains an open challenge.
  • Long‑term studies – The paper reports a single‑semester deployment; longitudinal validation (multiple courses, varied demographics) is needed to assess generalizability.
  • Automated inference – Future work could embed machine‑learning models on the server to automatically detect engagement patterns, reducing reliance on manual annotation.
  • Edge processing – Offloading some preprocessing to the smartwatch (e.g., feature extraction) could further cut bandwidth and improve privacy.

Authors

  • Alvaro Becerra
  • Pablo Villegas
  • Ruth Cobos

Paper Information

  • arXiv ID: 2512.02651v1
  • Categories: cs.HC, cs.CV, cs.SE
  • Published: December 2, 2025