[Paper] When Surface Form Changes Moderation Decisions: A Paired Study of Code-Mixed Workflow Instability

Published: 6 days ago (June 3, 2026 at 11:34 PM EDT)

2 min read

Source: arXiv

Source: arXiv - 2606.05654v1

Overview

Hate moderation is often evaluated as classification on clean English inputs, but deployed systems must route content to actions such as ALLOW, FLAG, or REVIEW. We study how this workflow changes under code-mixed inputs using a paired evaluation setting where the same underlying content is expressed as clean English and Tamil-English code-mix. Under thresholds tuned on clean English development data, code-mixed inputs produce substantial action instability, with a paired clean- to-code-mix decision flip rate of 0.265. The main workflow effects are increased review burden and increased false-flagging of non-hateful content: review rate rises from 0.138 to 0.297 and non-hate false-flag rate rises from 0.069 to 0.104. Tamil-only inputs show stronger degradation overall, suggesting a broader language-coverage limitation rather than the same code-mixed instability pattern. A simple disagreement-based deferral rule reduces automatic errors on stressed inputs, but only by increasing review load. These results show that workflow-level evaluation reveals moderation failures that standard classification summaries can miss.

Key Contributions

This paper presents research in the following areas:

cs.SE
cs.AI
cs.LG

Methodology

Please refer to the full paper for detailed methodology.

Practical Implications

This research contributes to the advancement of cs.SE.

Authors

Suraj Babu Thimma Krishnaram

Paper Information

arXiv ID: 2606.05654v1
Categories: cs.SE, cs.AI, cs.LG
Published: June 4, 2026
PDF: Download PDF

[Paper] When Surface Form Changes Moderation Decisions: A Paired Study of Code-Mixed Workflow Instability

Overview

Key Contributions

Methodology

Practical Implications

Authors

Paper Information

Related posts

[Paper] How reliable are LLMs when it comes to playing dice?

[Paper] MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism

[Paper] Sparse Subspace-to-Expert Sharing for Task-Agnostic Continual Learning

[Paper] Accelerated Decentralized Stochastic Gradient Descent for Strongly Convex Optimization