[Paper] Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation
Source: arXiv - 2511.21517v1
Overview
The paper Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation investigates why speech‑translation (ST) systems sometimes mis‑gender speakers when translating English (where words such as "doctor" carry no grammatical gender) into Spanish, French, or Italian, which mark gender morphologically. By probing three ST models, the authors trace how training data, internal language‑model biases, and acoustic cues interact to shape gender decisions—offering the first detailed, model‑level explanation of this phenomenon.
Key Contributions
- Empirical analysis of gender assignment across three language pairs (en‑es, en‑fr, en‑it) in state‑of‑the‑art ST systems.
- Disentangling bias sources: shows that models inherit a masculine prevalence from data, but also rely on acoustic signals to override internal language‑model (ILM) preferences.
- Contrastive feature attribution on spectrograms that pinpoints which frequency bands the model uses to infer speaker gender.
- Discovery of a novel mechanism: the model links first‑person pronouns (“I”, “me”) to gendered nouns, leveraging distributed spectral information rather than just pitch.
- Open‑source tooling for reproducing the analysis (datasets, attribution scripts, and visualizations).
Methodology
- Data & Models – The authors train end‑to‑end ST models on publicly available corpora for English→Spanish, English→French, and English→Italian. Each model consists of an acoustic encoder, a decoder, and an internal language model (ILM) component that predicts target text without audio.
- Bias Probing – They create controlled test sets in which the source utterance contains first‑person references and gender‑ambiguous nouns (e.g., “I am a doctor”). By swapping speaker gender in the audio (male vs. female voice), they observe how the translation changes (see the first sketch after this list).
- Ablation of ILM – The ILM is run in isolation (no audio) to measure its raw gender bias, then combined with the acoustic encoder to see how much the audio can overturn it (second sketch below).
- Contrastive Feature Attribution – Using a gradient‑based attribution method on the spectrogram, they generate heatmaps highlighting which time‑frequency regions most influence the gender decision (third sketch below).
- Statistical Analysis – Gender accuracy is computed per language pair, and correlation analyses link attribution patterns to performance.
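To make the probing protocol concrete, here is a minimal sketch of the bookkeeping it implies. `translate` (ST inference) and `gender_of` (a morphological gender tagger for the target noun) are hypothetical helpers, not the authors' released tooling; accuracy and flip rate follow the definitions used in the results below.

```python
# Minimal probing loop: each item pairs a male- and a female-voiced rendition
# of the same gender-ambiguous sentence. "Correct" means the translated noun's
# gender matches the speaker's voice; a "flip" means the two voices yielded
# different genders, i.e., the audio overrode the model's default.
def probe(pairs):
    correct = flips = 0
    for male_audio, female_audio, noun in pairs:
        g_m = gender_of(translate(male_audio), noun)  # hypothetical helpers
        g_f = gender_of(translate(female_audio), noun)
        correct += (g_m == "masc") + (g_f == "fem")
        flips += g_m != g_f
    n = 2 * len(pairs)
    return correct / n, flips / len(pairs)  # gender accuracy, flip rate
```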
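The ILM ablation can be approximated by scoring gendered target variants with the decoder while the acoustic evidence is suppressed. The sketch below assumes a transformers‑style encoder–decoder interface (`model` and `tokenizer` are placeholders) and zeroes the encoder states as a stand‑in for “no audio”; the paper's exact ILM‑extraction procedure may differ.

```python
import torch

@torch.no_grad()
def ilm_logprob(model, tokenizer, target_text):
    """Log-probability of target_text under the decoder alone (no audio)."""
    ids = tokenizer(target_text, return_tensors="pt").input_ids
    # Zeroed encoder states stand in for the absence of acoustic evidence.
    enc = torch.zeros(1, 1, model.config.d_model)
    out = model(encoder_outputs=(enc,), labels=ids)
    return -out.loss.item() * ids.size(1)  # out.loss is the mean NLL per token

# Compare masculine vs. feminine Spanish renderings of "I am a doctor".
masc = ilm_logprob(model, tokenizer, "Soy doctor.")
fem = ilm_logprob(model, tokenizer, "Soy doctora.")
print("ILM default:", "masculine" if masc > fem else "feminine")
```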
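Contrastive feature attribution, in its simplest gradient form, backpropagates the difference between the logits of the masculine and feminine continuations onto the input spectrogram. The sketch below illustrates that idea; `model`, the feature layout, and the token IDs are assumptions, and the paper's attribution method may aggregate or smooth these maps differently.

```python
import torch

def gender_saliency(model, spectrogram, prefix_ids, masc_id, fem_id):
    """Time-frequency saliency map for the masculine-vs-feminine decision."""
    spec = spectrogram.clone().requires_grad_(True)      # (1, T, F) features
    logits = model(input_features=spec,
                   decoder_input_ids=prefix_ids).logits  # (1, L, vocab)
    # Contrastive target: evidence for the masculine over the feminine form
    # at the step where the gendered word is about to be generated.
    contrast = logits[0, -1, masc_id] - logits[0, -1, fem_id]
    contrast.backward()
    return spec.grad.abs().squeeze(0)  # heatmap over time x frequency
```

Aggregated over many utterances, maps like this reveal where in the spectrum the model looks, as summarized in the results table below.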
Results & Findings
| Aspect | What the authors found |
|---|---|
| Training data bias | Models learn a global masculine prevalence (≈ 60‑70 % masculine gender assignments) rather than memorizing per‑noun gender frequencies. |
| ILM bias | When run without audio, the ILM defaults to masculine forms for > 80 % of gender‑ambiguous nouns. |
| Acoustic override | Providing a female voice flips the gender in 45‑55 % of cases, showing the acoustic encoder can counteract ILM bias but not always fully. |
| Feature attribution | High‑accuracy models focus on a broad frequency band (≈ 300‑800 Hz) and on the timing of first‑person pronouns, indicating they use prosodic patterns linked to speaker identity rather than just pitch. |
| Cross‑language consistency | The same mechanisms appear across Spanish, French, and Italian, suggesting a language‑agnostic bias pattern in current ST architectures. |
Practical Implications
- Fairer user experiences – Understanding how gender is inferred lets product teams audit and mitigate mis‑gendering in voice assistants, real‑time captioning, and multilingual meeting transcription services.
- Model design – Developers can consider decoupling the ILM from the acoustic encoder (e.g., via bias‑regularization or gender‑balanced fine‑tuning) to reduce the default masculine bias; a toy sketch follows this list.
- Data collection – The study highlights the need for gender‑balanced speech corpora; simply adding more female speakers can help the acoustic encoder learn stronger gender cues.
- Explainability tools – The contrastive spectrogram attribution technique can be integrated into debugging pipelines to visualize why a particular translation chose a specific gendered form.
- Regulatory compliance – As privacy and anti‑bias regulations tighten, having a clear, reproducible analysis of gender decisions supports compliance audits for AI‑driven translation services.
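As a purely illustrative sketch of the bias‑regularization idea, one could add a penalty during fine‑tuning that discourages systematic confidence gaps between male‑ and female‑voiced versions of the same sentence. The loss below is an assumption for illustration, not a method from the paper, and the batch fields are hypothetical.

```python
import torch

def debias_loss(model, batch, lam=0.1):
    """Task loss plus a symmetry penalty across gender-swapped audio."""
    out_m = model(input_features=batch["male_audio"], labels=batch["labels_m"])
    out_f = model(input_features=batch["female_audio"], labels=batch["labels_f"])
    task_loss = out_m.loss + out_f.loss
    # A persistent gap in how confidently the two voices are translated
    # suggests the ILM prior, not the audio, is deciding the gender.
    gap = (out_m.loss - out_f.loss).abs()
    return task_loss + lam * gap
```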
Limitations & Future Work
- Scope of languages – The analysis is limited to three Romance languages; languages with richer gender systems (e.g., Slavic) may exhibit different dynamics.
- Binary gender focus – The study only distinguishes male vs. female voices, ignoring non‑binary or gender‑nonconforming speakers.
- Model family – Experiments use a single end‑to‑end transformer architecture; other ST paradigms (e.g., cascade pipelines) might behave differently.
- Real‑world noise – Test utterances are clean; background noise or overlapping speech could affect the acoustic cues the model relies on.
Future research directions include extending the attribution framework to multilingual, many‑to‑many ST models, exploring bias mitigation strategies (e.g., adversarial training), and broadening the gender spectrum considered during evaluation.
Authors
- Lina Conti
- Dennis Fucci
- Marco Gaido
- Matteo Negri
- Guillaume Wisniewski
- Luisa Bentivogli
Paper Information
- arXiv ID: 2511.21517v1
- Categories: cs.CL, cs.AI
- Published: November 26, 2025
- PDF: https://arxiv.org/pdf/2511.21517v1