[Paper] Measuring Human Value Expression in Social Media Texts: Calibrated LLM Annotation and Encoder Transfer

Published: June 9, 2026 at 11:55 AM EDT
Source: arXiv

Overview

Measuring subjective constructs in naturally occurring social media text requires annotation procedures that are theoretically grounded, empirically validated, and transferable to an encoder model for scalable prediction. Using non-English social media posts annotated according to Schwartz’s theory of basic human values, we investigate how different LLMs, prompts, and instruction languages operationalize the expression of values in text. We argue that although texts may permit multiple plausible interpretations, theory-based value definitions can constrain interpretations and reduce spurious value attributions. Beyond precision, recall, and F1, we evaluate structural alignment between values, error structure, confidence-ambiguity relations, and annotation stability. We show that different LLMs produce different value interpretations. Iterative prompt calibration through error analysis reduces misattributions and improves alignment with expert annotations. We also derive targeted expert verification rules from recurrent error structures and use them during corpus annotation. Finally, we show that LLM annotations can be transferred to an encoder model through soft-label training, retaining theory-based value interpretations and information about uncertainty in value expression.
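To make the soft-label idea concrete, here is a minimal sketch of how soft targets could be built and scored. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the soft labels are derived or which loss is used. The sketch assumes soft labels are the fraction of repeated annotation runs that assigned each value, and that the encoder is trained with binary cross-entropy against those fractions. The names `VALUES`, `soft_labels_from_runs`, and `soft_bce` are hypothetical.

```python
import math

# Illustrative subset of Schwartz's basic values (chosen only for this example).
VALUES = ["security", "power", "benevolence"]

def soft_labels_from_runs(runs):
    """Assumption: a soft label is the share of annotation runs that assigned the value."""
    return [sum(v in run for run in runs) / len(runs) for v in VALUES]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_bce(logits, targets, eps=1e-12):
    """Mean binary cross-entropy between sigmoid(logits) and soft targets in [0, 1]."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(logits)

# Four hypothetical annotation runs for one post.
runs = [{"security"}, {"security", "power"}, {"security"}, set()]
targets = soft_labels_from_runs(runs)

# Loss for an uninformed model (all logits 0) versus one whose
# predicted probabilities are close to the soft targets.
uninformed_loss = soft_bce([0.0, 0.0, 0.0], targets)
matched_loss = soft_bce([math.log(3), -math.log(3), -10.0], targets)
```

Because the targets are fractions rather than hard 0/1 labels, the loss is minimized when the predicted probability matches the annotation frequency, which is one way a trained encoder can carry over uncertainty about whether a value is expressed. In a real training loop one would typically use a framework loss that accepts float targets, such as PyTorch's `torch.nn.BCEWithLogitsLoss`.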

Key Contributions

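Based on the abstract, the paper (arXiv category cs.CL) contributes the following:

  • A comparison of how different LLMs, prompts, and instruction languages operationalize value expression in non-English social media posts, showing that different LLMs produce different value interpretations.
  • An evaluation that goes beyond precision, recall, and F1 to cover structural alignment between values, error structure, confidence-ambiguity relations, and annotation stability.
  • Iterative prompt calibration through error analysis, which reduces misattributions and improves alignment with expert annotations.
  • Targeted expert verification rules derived from recurrent error structures and applied during corpus annotation.
  • Transfer of LLM annotations to an encoder model through soft-label training, retaining theory-based value interpretations and information about uncertainty.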

Methodology

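The abstract describes the approach only at a high level: LLMs annotate non-English social media posts according to Schwartz’s theory of basic human values, prompts are calibrated iteratively through error analysis against expert annotations, and the resulting annotations are used as soft labels to train an encoder model. Please refer to the full paper for the detailed methodology.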
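As a reference point for the standard metrics named in the abstract (precision, recall, and F1), the following is a minimal sketch of per-value scoring of LLM annotations against expert annotations. The data and the function name `per_value_prf1` are hypothetical, and treating each post as a set of expressed values is an assumption made for this example. The paper's additional analyses (structural alignment, error structure, stability) are not covered here.

```python
def per_value_prf1(gold, pred, value):
    """Precision, recall, and F1 for one value over parallel lists of label sets."""
    tp = sum(1 for g, p in zip(gold, pred) if value in g and value in p)
    fp = sum(1 for g, p in zip(gold, pred) if value not in g and value in p)
    fn = sum(1 for g, p in zip(gold, pred) if value in g and value not in p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical expert (gold) and LLM (pred) annotations for four posts.
gold = [{"power"}, {"security"}, {"power", "security"}, set()]
pred = [{"power"}, {"power"}, {"security"}, set()]

power_scores = per_value_prf1(gold, pred, "power")
security_scores = per_value_prf1(gold, pred, "security")
```

A false positive here (the second post, where the LLM assigns "power" but the expert does not) corresponds to the kind of spurious value attribution the abstract says theory-based definitions are meant to reduce.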

Practical Implications

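According to the abstract, the approach supports scalable measurement of value expression in social media text: LLM annotations calibrated against expert judgments, and checked with targeted expert verification rules, can be transferred to an encoder model that retains information about uncertainty in value expression.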

Authors

  • Maria Milkova
  • Maksim Rudnev

Paper Information

  • arXiv ID: 2606.11018v1
  • Categories: cs.CL
  • Published: June 9, 2026