[Paper] A Data Annotation Requirements Representation and Specification (DARS)

Published: December 15, 2025 at 10:41 AM EST
4 min read
Source: arXiv - 2512.13444v1

Overview

The paper introduces DARS (Data Annotation Requirements Representation and Specification), a lightweight framework that brings the rigor of requirements engineering to the often‑neglected step of data annotation in AI‑enabled cyber‑physical systems. By giving developers a concrete way to capture, negotiate, and verify annotation needs, DARS aims to cut down the costly errors that currently plague safety‑critical AI pipelines (e.g., autonomous driving perception stacks).

Key Contributions

  • Annotation Negotiation Card – a structured checklist that helps cross‑functional teams (data scientists, domain experts, safety engineers, product owners) surface and align on annotation objectives, constraints, and acceptance criteria early in the project.
  • Scenario‑Based Annotation Specification – a concise, scenario‑driven language for expressing atomic and verifiable annotation requirements (e.g., “All pedestrians within 30 m must be labeled with occlusion flags”); a rough sketch of how such a requirement could be captured follows this list.
  • Empirical Evaluation – applied DARS to a real‑world automotive perception use case and mapped it against 18 documented annotation error types, showing a measurable reduction in completeness, accuracy, and consistency errors.
  • Integration Blueprint – guidelines for embedding DARS into existing RE processes and tooling (e.g., linking to requirement management systems, test case generation pipelines).
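
The paper’s concrete specification syntax is not reproduced in this summary. As a rough, hypothetical sketch, a scenario‑based requirement like the pedestrian example above could be captured as structured, version‑controlled data; the field names below (req_id, scenario, object_class, max_distance_m, required_attributes) are illustrative assumptions, not DARS syntax.

```python
# Hypothetical encoding of a scenario-based annotation requirement.
# Field names are illustrative assumptions, not the DARS syntax from the paper.
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationRequirement:
    req_id: str                 # traceable ID, e.g. linked to an RE management tool
    scenario: str               # operational scenario the rule applies to
    object_class: str           # label class the rule constrains
    max_distance_m: float       # spatial scope of the rule
    required_attributes: tuple  # attributes every matching label must carry


# "All pedestrians within 30 m must be labeled with occlusion flags."
PED_OCCLUSION = AnnotationRequirement(
    req_id="DARS-PED-001",
    scenario="urban_daytime",
    object_class="pedestrian",
    max_distance_m=30.0,
    required_attributes=("occlusion_flag",),
)
```

Keeping each atomic requirement as a small, typed record makes it easy to trace it back to the Negotiation Card entry it came from and to generate automated checks from it later.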

Methodology

  1. Problem Scoping – Conducted semi‑structured interviews with industry practitioners to surface pain points unique to data annotation (e.g., ambiguous labeling guidelines, evolving sensor suites).
  2. Design of DARS – Built on two pillars:
    • Negotiation (the Card) to capture stakeholder intent and constraints in a human‑readable format.
    • Specification (scenario templates) that translate the negotiated intent into machine‑checkable rules (a sketch of one such rule follows this list).
  3. Case Study Execution – Integrated DARS into an ongoing automotive perception project (object detection for ADAS). The team used the Card to align on labeling policies and authored scenario specifications for each sensor modality.
  4. Error‑Type Mapping – Compared the annotated dataset before and after DARS adoption against a taxonomy of 18 real‑world annotation errors (e.g., missing labels, inconsistent class hierarchies).
  5. Analysis – Measured error frequency, traced root causes, and assessed the effort overhead of using DARS.
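
To illustrate the “machine‑checkable” pillar, a requirement like the pedestrian example above could be compiled into a small check that scans per‑frame labels and reports violations. The label schema assumed here (dicts with "class", "distance_m", and "attributes" keys) is a hypothetical stand‑in; the paper does not prescribe a label format.

```python
# Minimal sketch of a machine-checkable annotation rule.
# The label schema is an illustrative assumption, not prescribed by the paper.


def check_occlusion_flags(labels, object_class="pedestrian",
                          max_distance_m=30.0,
                          required_attributes=("occlusion_flag",)):
    """Return violation messages for one annotated frame."""
    violations = []
    for label in labels:
        if label["class"] != object_class or label["distance_m"] > max_distance_m:
            continue  # the rule does not apply to this label
        missing = [a for a in required_attributes
                   if a not in label.get("attributes", {})]
        if missing:
            violations.append(f"label {label['id']}: missing {', '.join(missing)}")
    return violations


frame = [
    {"id": "obj_1", "class": "pedestrian", "distance_m": 12.4, "attributes": {}},
    {"id": "obj_2", "class": "pedestrian", "distance_m": 45.0, "attributes": {}},
]
print(check_occlusion_flags(frame))
# -> ['label obj_1: missing occlusion_flag']  (obj_2 is beyond 30 m, so out of scope)
```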

Results & Findings

  • Error Reduction: Completeness errors dropped by ~42 %, accuracy errors by ~35 %, and consistency errors by ~38 % compared with the baseline process.
  • Root‑Cause Mitigation: The majority of eliminated errors traced back to ambiguous stakeholder expectations, which the Negotiation Card had clarified upfront.
  • Effort Trade‑off: Initial setup of the Card and scenario specs added ~1.5 person‑days per annotation sprint, but subsequent sprints saw a 25 % reduction in re‑work and QA time.
  • Stakeholder Alignment: Surveyed participants reported higher confidence in the labeling guidelines (average Likert score 4.6/5) and better visibility into “why” a label was required.

Practical Implications

  • Safer AI Products: For domains like autonomous driving, medical imaging, or industrial robotics, tighter annotation requirements directly translate to more reliable perception models and easier safety certification.
  • Toolchain Integration: DARS specifications can be exported to validation scripts (e.g., Python‑based data checks) or linked to issue‑tracking systems, enabling automated compliance checks before model training (see the sketch after this list).
  • Reduced Cycle Time: By catching ambiguous or missing labeling rules early, teams avoid costly downstream fixes, shortening the data‑to‑model pipeline.
  • Scalable Governance: The Negotiation Card serves as a lightweight governance artifact that scales across multiple data‑annotation teams and projects, supporting consistent standards across an organization.
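
One way to realise this toolchain integration, under the same hypothetical label schema as the Methodology sketch, is to run every requirement’s check over the whole dataset as a gate before training; the report format and exit‑code convention below are illustrative assumptions, not part of DARS.

```python
# Hypothetical pre-training compliance gate: run all requirement checks over a
# dataset and block the pipeline if any violations remain. The check interface
# and CI convention are illustrative assumptions, not part of DARS itself.
import sys


def run_compliance_gate(dataset, checks):
    """dataset: iterable of (frame_id, labels); checks: callables returning violation lists."""
    report = {}
    for frame_id, labels in dataset:
        for check in checks:
            violations = check(labels)
            if violations:
                report.setdefault(frame_id, []).extend(violations)
    return report


if __name__ == "__main__":
    dataset = [
        ("frame_000001",
         [{"id": "obj_1", "class": "pedestrian", "distance_m": 12.4, "attributes": {}}]),
    ]
    checks = [
        lambda labels: [
            f"label {l['id']}: missing occlusion_flag"
            for l in labels
            if l["class"] == "pedestrian"
            and l["distance_m"] <= 30.0
            and "occlusion_flag" not in l["attributes"]
        ],
    ]
    report = run_compliance_gate(dataset, checks)
    for frame_id, violations in report.items():
        print(frame_id, violations)
    sys.exit(1 if report else 0)  # a non-zero exit can block model training in CI
```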

Limitations & Future Work

  • Domain Specificity: The case study focused on automotive perception; additional validation is needed for other AI domains (e.g., NLP, speech).
  • Tool Support: Currently DARS relies on manual creation of cards and scenario specs; future work will explore dedicated editors or plugins for popular annotation platforms.
  • Dynamic Data: The framework assumes relatively static sensor setups; extending DARS to handle rapidly evolving data sources (e.g., over‑the‑air updates) remains an open challenge.
  • Quantitative ROI: While error reductions were measured, a full cost‑benefit analysis (including long‑term maintenance savings) is left for subsequent studies.

Bottom line: DARS offers a pragmatic bridge between requirements engineering and data annotation, giving developers a concrete way to lock down labeling expectations, catch errors early, and ultimately ship safer, more trustworthy AI‑driven systems.

Authors

  • Yi Peng
  • Hina Saeeda
  • Hans-Martin Heyn
  • Jennifer Horkoff
  • Eric Knauss
  • Fredrik Warg

Paper Information

  • arXiv ID: 2512.13444v1
  • Categories: cs.SE
  • Published: December 15, 2025