[Paper] Real-Time Multimodal Data Collection Using Smartwatches and Its Visualization in Education
Source: arXiv - 2512.02651v1
Overview
The paper introduces Watch‑DMLT, a real‑time data‑capture app for Fitbit Sense 2 smartwatches, and ViSeDOPS, a web‑based dashboard that visualizes the synchronized multimodal streams collected during classroom activities. By combining physiological, motion, gaze, video, and annotation data from up to 16 devices simultaneously, the authors demonstrate a scalable pipeline for Multimodal Learning Analytics (MLA) that can be deployed in everyday educational settings.
Key Contributions
- Watch‑DMLT: Open‑source smartwatch app (paired with an Android companion) that streams heart‑rate, accelerometer, gyroscope, and optional gaze data from multiple watches to a central server with sub‑second latency.
- ViSeDOPS: Interactive visualization suite (timeline, heat‑maps, video overlay) that aligns multimodal streams with teacher‑provided annotations, enabling rapid exploratory analysis.
- End‑to‑end deployment: Field trial with 65 university students (up to 16 concurrent watches) during oral presentations, proving the system’s robustness in a real classroom.
- Data schema & synchronization protocol: A lightweight JSON‑based format and NTP‑based clock alignment that keep all streams temporally coherent without heavy infrastructure (a sketch of the packet format and clock‑offset formula follows this list).
- Open resources: Code, documentation, and a sample dataset released under an MIT‑compatible license to encourage replication and extension.
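The paper's schema and synchronization protocol are summarised here only at a high level, so the following TypeScript sketch is an illustration rather than the authors' actual format: a plausible shape for one 200 ms sensor packet, plus the standard NTP offset formula that clock alignment of this kind rests on. All field names (deviceId, seq, tDevice, and so on) are assumptions.

```typescript
// Hypothetical shape of one 200 ms sensor packet; field names are
// illustrative assumptions, not the authors' published schema.
interface SensorPacket {
  deviceId: string;                   // pseudonymous watch identifier
  seq: number;                        // per-device sequence number (loss detection)
  tDevice: number;                    // watch clock at batch close, ms since epoch
  heartRate: number;                  // latest PPG-derived BPM
  accel: [number, number, number][];  // 3-axis samples in this batch
  gyro: [number, number, number][];
}

// Standard NTP offset estimate from one request/response exchange:
// offset = ((t1 - t0) + (t2 - t3)) / 2, where t0/t3 are client send and
// receive times and t1/t2 are server receive and send times.
function clockOffset(t0: number, t1: number, t2: number, t3: number): number {
  return ((t1 - t0) + (t2 - t3)) / 2;
}

// Map a device timestamp onto the server timeline so all streams align.
function toServerTime(tDevice: number, offset: number): number {
  return tDevice + offset;
}
```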
Methodology
- Hardware & Sensors – Participants wore Fitbit Sense 2 devices that expose heart‑rate (PPG), 3‑axis accelerometer, gyroscope, and, when paired with an external eye‑tracker, gaze coordinates.
- Software Stack –
- Watch‑DMLT runs on the watch, batches sensor readings into 200 ms packets, and pushes them over Bluetooth to a companion Android phone.
- The phone forwards packets via WebSocket to a cloud‑hosted Node.js server that timestamps each packet using NTP‑synchronised clocks (a sketch of this relay follows the list).
- Annotation Layer – Instructors use a simple web form to tag events (e.g., “question asked”, “feedback given”). These timestamps are merged with the sensor streams on the server.
- Visualization – ViSeDOPS consumes the unified JSON log, rendering synchronized timelines, per‑student heat‑maps of motion intensity, and a split‑screen view that overlays video of the presentation with live physiological traces.
- Evaluation – The system’s latency, packet loss, and battery impact were measured across multiple sessions; qualitative feedback on usability and the usefulness of the visualizations was gathered from students and the instructor.
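To make the watch‑to‑phone‑to‑server hand‑off concrete, here is a minimal TypeScript sketch of both ends of the relay. It only illustrates the flow described above (the real watch app targets Fitbit's SDK, and the authors' server code is not reproduced here); the 200 ms window comes from the text, while the ws package, class, and field names are assumptions.

```typescript
import { WebSocketServer } from "ws"; // npm install ws

// --- Sender side (illustrative): close and ship a batch every 200 ms ---
class PacketBatcher {
  private samples: [number, number, number][] = [];
  private seq = 0;
  constructor(private deviceId: string,
              private send: (json: string) => void) {
    setInterval(() => this.flush(), 200); // 200 ms batching window
  }
  addAccel(x: number, y: number, z: number) { this.samples.push([x, y, z]); }
  private flush() {
    if (this.samples.length === 0) return;
    this.send(JSON.stringify({
      deviceId: this.deviceId, seq: this.seq++,
      tDevice: Date.now(), accel: this.samples,
    }));
    this.samples = [];
  }
}

// --- Server side: stamp each packet with NTP-synchronised server time ---
const wss = new WebSocketServer({ port: 8080 });
const unifiedLog: object[] = [];
wss.on("connection", (socket) => {
  socket.on("message", (raw) => {
    const packet = JSON.parse(raw.toString());
    unifiedLog.push({ ...packet, tServer: Date.now() });
  });
});
```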
Results & Findings
- Low latency & high fidelity – Average end‑to‑end delay was ≈ 350 ms, with < 2 % packet loss even when 16 watches streamed concurrently (the sketch after this list shows how both figures can be derived from the packet log).
- Battery endurance – A fully charged watch (rated for roughly 48 hours of use) sustained continuous data capture throughout a full 2‑hour class, confirming feasibility for typical lecture lengths.
- Insight generation – Visual analyses revealed clear physiological signatures (e.g., heart‑rate spikes, increased motion) aligned with moments of audience questioning and presenter stress.
- Scalability – The server handled aggregate sampling rates of up to 200 Hz without degradation, suggesting the pipeline can support larger cohorts or richer sensor suites.
- User acceptance – 87 % of students reported the smartwatch was “non‑intrusive,” and the instructor highlighted the dashboard’s ability to pinpoint engagement bottlenecks in real time.
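As a rough illustration of how the two headline figures can be derived from the unified log (the paper's analysis scripts are not reproduced here), the sketch below computes mean end‑to‑end delay and packet loss, reusing the hypothetical seq/tDevice/tServer fields from the earlier schema sketch.

```typescript
interface LoggedPacket { deviceId: string; seq: number; tDevice: number; tServer: number; }

// Mean end-to-end delay: server arrival minus (offset-corrected)
// batch-close time on the watch.
function meanLatencyMs(log: LoggedPacket[], offset = 0): number {
  const delays = log.map((p) => p.tServer - (p.tDevice + offset));
  return delays.reduce((a, b) => a + b, 0) / delays.length;
}

// Packet loss: per device, the expected count is max(seq) - min(seq) + 1;
// any gap in between was dropped on the Bluetooth or WebSocket hop.
function packetLossPct(log: LoggedPacket[]): number {
  const byDevice = new Map<string, number[]>();
  for (const p of log) {
    if (!byDevice.has(p.deviceId)) byDevice.set(p.deviceId, []);
    byDevice.get(p.deviceId)!.push(p.seq);
  }
  let expected = 0, received = 0;
  for (const seqs of byDevice.values()) {
    expected += Math.max(...seqs) - Math.min(...seqs) + 1;
    received += new Set(seqs).size;
  }
  return (100 * (expected - received)) / expected;
}
```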
Practical Implications
- Real‑time feedback loops – Educators can monitor class‑wide affective states live, enabling adaptive interventions (e.g., pausing for clarification when stress spikes).
- Automated analytics pipelines – Developers can plug the open‑source Watch‑DMLT into existing Learning Management Systems (LMS) to enrich analytics dashboards with physiological context.
- Beyond education – The same architecture applies to corporate training, remote workshops, or any scenario where multimodal human‑computer interaction needs to be quantified at scale.
- Rapid prototyping – Because the system relies on commodity smartwatches and standard web technologies, teams can prototype new MLA studies without custom hardware or costly data‑center setups.
- Privacy‑by‑design – Data is anonymized on‑device before transmission, and the open schema makes it straightforward to integrate consent‑management tools (one plausible realisation is sketched below).
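The summary states that anonymization happens on‑device but not how. One plausible realisation, sketched here, is a salted keyed hash of the student identifier so that raw identities never leave the device; the function name, salt handling, and truncation length are all assumptions.

```typescript
import { createHmac } from "node:crypto";

// Replace the student identifier with a salted HMAC before any packet
// leaves the device. The salt stays with the consent records, so only
// authorised staff can re-link pseudonyms to identities.
function pseudonymize(studentId: string, studySalt: string): string {
  return createHmac("sha256", studySalt).update(studentId).digest("hex").slice(0, 16);
}

// e.g. pseudonymize("student-042", salt) becomes the packet's deviceId
```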
Limitations & Future Work
- Sensor diversity – The current implementation is tied to Fitbit Sense 2; extending to other wearables (Apple Watch, Garmin) will require additional SDK wrappers.
- Gaze capture – Accurate eye‑tracking was only demonstrated with an external device; integrating on‑watch cameras or low‑cost eye‑trackers remains an open challenge.
- Long‑term studies – The paper reports a single‑semester deployment; longitudinal validation (multiple courses, varied demographics) is needed to assess generalizability.
- Automated inference – Future work could embed machine‑learning models on the server to automatically detect engagement patterns, reducing reliance on manual annotation.
- Edge processing – Offloading some preprocessing to the smartwatch (e.g., feature extraction) could further cut bandwidth and improve privacy; the sketch below illustrates the idea.
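As a sketch of what such edge processing could look like (the window size and feature set are assumptions, not the authors' design), the watch could reduce each batch of raw accelerometer samples to a few summary statistics before transmission:

```typescript
// Reduce one window of 3-axis accelerometer samples to summary features.
// Three numbers per window instead of 3 x N raw samples: less bandwidth,
// and raw motion traces never leave the device.
function motionFeatures(window: [number, number, number][]) {
  const mags = window.map(([x, y, z]) => Math.hypot(x, y, z));
  const mean = mags.reduce((a, b) => a + b, 0) / mags.length;
  const variance = mags.reduce((a, m) => a + (m - mean) ** 2, 0) / mags.length;
  return { mean, variance, peak: Math.max(...mags) };
}
```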
Authors
- Alvaro Becerra
- Pablo Villegas
- Ruth Cobos
Paper Information
- arXiv ID: 2512.02651v1
- Categories: cs.HC, cs.CV, cs.SE
- Published: December 2, 2025