[Paper] Quantifying Emotional Tone in Tolkien's The Hobbit: Dialogue Sentiment Analysis with RegEx, NRC-VAD, and Python

Published: December 11, 2025 at 12:58 PM EST

Source: arXiv - 2512.10865v1

Overview

Lilin Qiu’s paper applies a straightforward Python pipeline to measure the emotional tone of every spoken line in The Hobbit. By extracting dialogue with regular expressions and scoring it with the NRC‑VAD (Valence‑Arousal‑Dominance) lexicon, the study maps how the mood of Tolkien’s dialogue shifts over the course of the novel. The findings show a predominantly upbeat, calm story arc that gradually gains a sense of agency, which illustrates how lightweight NLP tools can reveal literary rhythm.

Key Contributions

  • Automated dialogue extraction from a classic novel using only regular‑expression patterns.
  • Sentiment scoring with NRC‑VAD, providing three interpretable dimensions (valence, arousal, dominance) rather than a single polarity label.
  • Temporal emotional trajectory visualizations (line‑by‑line graphs, moving averages, and word clouds) that make the novel’s mood dynamics easy to explore.
  • Demonstration of a reproducible, low‑dependency workflow (Python 3, re, pandas, matplotlib, seaborn) that can be adapted to any prose work with quoted speech.

Methodology

  1. Data acquisition – The full text of The Hobbit (public‑domain version) was loaded into Python.
  2. Dialogue detection – A non‑greedy regular‑expression pattern, ".*?", captured the text between double quotation marks, with handling for edge cases such as nested quotes and line breaks (in Python, "." matches a newline only when the re.DOTALL flag is set).
  3. Pre‑processing – Tokens were lower‑cased, punctuation stripped, and stop‑words removed to align with the NRC‑VAD lexicon format.
  4. Lexicon lookup – Each token was matched to its VAD scores (0‑1 scale). Tokens absent from the lexicon were ignored.
  5. Aggregation – For each dialogue line, the mean valence, arousal, and dominance scores were computed.
  6. Temporal smoothing – A rolling window (size = 10 lines) produced smooth curves that highlight longer‑range emotional trends.
  7. Visualization – Line graphs, heatmaps, and word clouds (weighted by VAD values) were generated with matplotlib/seaborn.
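
Steps 2 through 5 can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the exact pattern, the use of `re.DOTALL`, the tiny `TOY_VAD` lexicon with its made-up scores, the stop-word list, and all function names are assumptions for the example. A real run would load the NRC‑VAD lexicon from its distributed file, and this sketch does not attempt to handle nested quotations.

```python
import re
import string

# Hypothetical mini-lexicon standing in for NRC-VAD:
# word -> (valence, arousal, dominance). Scores are invented for illustration.
TOY_VAD = {
    "good": (0.9, 0.4, 0.6),
    "morning": (0.8, 0.3, 0.5),
    "fire": (0.3, 0.9, 0.5),
}

# Assumed, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "an", "is", "it", "to", "and"}

# Non-greedy match between straight or curly double quotes.
# re.DOTALL lets a quotation span line breaks.
QUOTE_RE = re.compile(r'["“](.*?)["”]', re.DOTALL)


def extract_dialogue(text):
    """Return the quoted spans in text, with internal whitespace collapsed."""
    return [" ".join(m.split()) for m in QUOTE_RE.findall(text)]


def tokenize(line):
    """Lower-case, strip punctuation, and drop stop-words."""
    table = str.maketrans("", "", string.punctuation + "“”’")
    return [t for t in line.lower().translate(table).split() if t not in STOP_WORDS]


def score_line(line, lexicon=TOY_VAD):
    """Mean (V, A, D) over tokens found in the lexicon; None if none match."""
    hits = [lexicon[t] for t in tokenize(line) if t in lexicon]
    if not hits:
        return None
    return tuple(sum(dim) / len(hits) for dim in zip(*hits))


sample = 'He said, "Good\nmorning!" Then someone shouted, "Fire!"'
lines = extract_dialogue(sample)
scores = [score_line(line) for line in lines]
```

Returning `None` for lines with no lexicon hits keeps them distinguishable from lines that genuinely score near zero. Whether the paper drops or keeps such lines is not stated here.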

The entire pipeline runs in under a minute on a standard laptop, requiring no heavy machine‑learning models.
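
The smoothing step can be illustrated with a plain-Python trailing moving average. The window size of 10 comes from the description above; the function name, the choice of a trailing rather than centered window, and the treatment of the first few lines (averaging over whatever has been seen so far) are assumptions. In pandas, `Series.rolling(window=10, min_periods=1).mean()` would be a comparable built-in.

```python
from collections import deque


def rolling_mean(values, window=10):
    """Trailing moving average.

    Positions earlier than `window` average over the values seen so far,
    so the output has the same length as the input.
    """
    buf = deque(maxlen=window)  # oldest value is dropped automatically
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out


valence = [0.5, 0.7, 0.9, 0.1]  # made-up per-line valence scores
smoothed = rolling_mean(valence, window=2)
```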

Results & Findings

| Dimension | Overall Trend | Interpretation |
| --- | --- | --- |
| Valence (positivity) | High average (~0.68) with slight upward drift | The dialogue stays upbeat; moments of danger are quickly offset by humor or camaraderie. |
| Arousal (excitement) | Low average (~0.34) and relatively flat | Tolkien’s prose maintains a calm narrative pace; spikes correspond to battle scenes but are brief. |
| Dominance (sense of control) | Gradual increase from ~0.45 to ~0.58 | As Bilbo’s journey progresses, characters (especially Bilbo) exhibit growing agency and confidence. |

Visualizations show periodic “valence dips” aligned with encounters with trolls, goblins, or Smaug, followed by rapid recoveries. Word clouds reveal clusters of high‑valence words (“cheer”, “laugh”, “friend”) and low‑arousal terms (“quiet”, “still”).

Practical Implications

  • Rapid sentiment profiling for any narrative – Developers building reading‑assist tools, game dialogue systems, or interactive fiction can adopt this lightweight pipeline to gauge emotional pacing without training custom models.
  • Content moderation & age‑appropriate filtering – By quantifying valence and arousal, platforms can flag sections that deviate sharply from a baseline (e.g., sudden spikes of aggression).
  • Narrative design analytics – Game writers and scriptwriters can visualize how dialogue emotional arcs evolve, helping them balance tension and relief in story beats.
  • Educational tech – Language‑learning apps could surface “emotionally rich” sentences for learners, encouraging deeper engagement with literary texts.
  • Open‑source reproducibility – The code relies only on Python’s built‑in re module and widely used open‑source libraries (pandas, matplotlib, seaborn); it can be packaged as a CLI tool or Jupyter notebook for quick integration into existing pipelines.
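
One way to read the baseline-deviation idea in the moderation bullet is a simple z-score check over per-line scores. This is a hypothetical sketch: the threshold of 2 standard deviations, the function name, and the sample values are assumptions, and the paper does not specify any such rule.

```python
from statistics import mean, pstdev


def flag_outliers(scores, threshold=2.0):
    """Return indices of scores lying more than `threshold` population
    standard deviations away from the mean of all scores."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:  # all scores identical: nothing deviates
        return []
    return [i for i, s in enumerate(scores) if abs(s - mu) / sigma > threshold]


# Made-up arousal scores with one sharp spike at index 7.
arousal = [0.30, 0.32, 0.35, 0.31, 0.33, 0.34, 0.30, 0.95, 0.32, 0.33]
flagged = flag_outliers(arousal)
```

A global mean is the simplest possible baseline. A production filter would more likely compare each line against a local window so that slow drifts are not flagged.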

Limitations & Future Work

  • Lexicon coverage – NRC‑VAD lacks many archaic or fantasy‑specific terms (e.g., “gollum”, “smaug”), which forces the analysis to ignore those tokens, potentially biasing scores.
  • Contextual nuance – VAD scores are static; sarcasm, irony, or multi‑word expressions (e.g., “not at all”) are not captured.
  • Dialogue‑only focus – Narrative prose and internal monologue are excluded, so the emotional picture is incomplete.
  • Future directions – Incorporating contextual embeddings (e.g., BERT‑based sentiment models) to handle out‑of‑vocab words, expanding the lexicon with domain‑specific entries, and extending the pipeline to multi‑character interaction graphs (who talks to whom) would deepen the analysis.
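
The coverage limitation can be quantified before trusting the scores. The sketch below computes the share of tokens found in a lexicon and counts the misses; the lexicon contents, function name, and sample tokens are illustrative assumptions rather than figures from the paper.

```python
from collections import Counter


def lexicon_coverage(tokens, lexicon):
    """Return (fraction of tokens present in lexicon, Counter of missing tokens)."""
    if not tokens:
        return 0.0, Counter()
    missing = Counter(t for t in tokens if t not in lexicon)
    covered = len(tokens) - sum(missing.values())
    return covered / len(tokens), missing


toy_lexicon = {"dragon", "gold", "friend"}  # stand-in for the NRC-VAD vocabulary
tokens = ["smaug", "dragon", "gold", "smaug"]
coverage, missing = lexicon_coverage(tokens, toy_lexicon)
```

Inspecting `missing.most_common()` on a real text would show which out-of-lexicon words occur often enough to justify adding domain-specific entries.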

Lilin Qiu’s work shows that even a few lines of Python can turn a beloved classic into a data‑driven case study, opening the door for developers to bring the same level of emotional insight to their own text‑heavy products.

Authors

  • Lilin Qiu

Paper Information

  • arXiv ID: 2512.10865v1
  • Categories: cs.CL
  • Published: December 11, 2025