[Paper] Reliable Abstention under Adversarial Injections: Tight Lower Bounds and New Upper Bounds

Published: February 23, 2026 at 01:30 PM EST
Source: arXiv (2602.20111v1)

Overview

This paper tackles a realistic online‑learning scenario where most data points come from an unknown distribution, but an adversary can slip in maliciously crafted examples. The learner must decide when to predict and when to abstain, and it is penalized both for wrong predictions and for abstaining on clean (i.i.d.) rounds. The authors close a long‑standing gap between “distribution‑aware” and “distribution‑agnostic” algorithms, showing that the latter cannot beat a √T error term even for the simplest hypothesis classes.

Key Contributions

  • Tight lower bound: Prove an Ω(√T) lower bound on combined error for any distribution‑agnostic algorithm, already for hypothesis classes of VC dimension 1, establishing that the √T barrier is inherent for distribution‑agnostic learners.
  • Robust‑witness framework: Introduce a potential‑based analysis that relies on robust witnesses—small subsets of labeled examples that certify a prediction despite adversarial injections.
  • Two new combinatorial dimensions:
    1. Inference dimension – yields a generic upper bound of Õ(T^{1−1/k}) for classes whose inference dimension is k.
    2. Certificate dimension – a novel relaxation that leads to tighter bounds for specific hypothesis families.
  • Concrete algorithmic result: Show that 2‑dimensional half‑spaces have certificate dimension 3, giving the first distribution‑agnostic guarantee of Õ(T^{2/3}) combined error for this class.
  • Separation of information regimes: Demonstrate a sharp contrast between algorithms that can query the underlying distribution (achieving O(d² log T) error) and those that cannot (stuck at √T).
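To make the witness idea concrete, here is a minimal illustrative sketch (our construction, not the paper's algorithm) for the simplest VC‑dimension‑1 class mentioned above: 1‑D thresholds h_t(x) = 1 iff x ≥ t. A past example acts as a witness: any positive example at or below x forces label 1, any negative example at or above x forces label 0; otherwise the learner abstains.

```python
# Illustrative sketch (not the paper's algorithm): selective prediction for
# 1-D threshold classifiers h_t(x) = 1 iff x >= t, a VC-dimension-1 class.

def predict_or_abstain(history, x):
    """history: list of (point, label) pairs consistent with some threshold."""
    positives = [p for p, y in history if y == 1]
    negatives = [p for p, y in history if y == 0]
    if positives and x >= min(positives):
        return 1          # witnessed by the smallest positive example
    if negatives and x <= max(negatives):
        return 0          # witnessed by the largest negative example
    return None           # label not forced by any witness: abstain

history = [(0.2, 0), (0.9, 1)]
print(predict_or_abstain(history, 1.5))  # 1: forced by the positive at 0.9
print(predict_or_abstain(history, 0.1))  # 0: forced by the negative at 0.2
print(predict_or_abstain(history, 0.5))  # None: inside the uncertainty gap
```

Because labels are clean, a witnessed prediction can never be wrong; the cost of this safety is abstaining in the gap between the largest negative and smallest positive example.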

Methodology

  1. Adversarial injection model – The data stream is a mix of clean i.i.d. examples from an unknown distribution D and adversarially inserted points. Labels are always consistent with a hidden target concept (clean‑label setting).
  2. Abstention mechanism – The learner may output “I don’t know”. Errors are counted as:
    • Mistake: predicting the wrong label on a clean round.
    • Incorrect abstention: abstaining on a clean round (adversarial rounds are ignored for the abstention penalty).
  3. Robust witnesses – For any prediction the algorithm maintains a small set of past examples that certify the label. The set is constructed so that even if the adversary injects up to a certain fraction of malicious points, the witness still forces the same prediction.
  4. Potential‑based analysis – A global potential function tracks the “budget” of remaining witnesses. Each mistake or abstention reduces the potential by a bounded amount, leading to the derived error bounds.
  5. Combinatorial dimensions
    • Inference dimension measures how many examples are needed to infer the label of a new point via witnesses.
    • Certificate dimension is a relaxed version that allows a slightly larger witness set but yields better bounds for certain geometric classes (e.g., 2‑D half‑spaces).

The authors instantiate the framework with these dimensions, deriving concrete error rates for various hypothesis families.
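The accounting in steps 1–2 can be sketched in a toy harness (all names are ours; the adversary strategy and the threshold learner below are illustrative stand‑ins, not the paper's construction). Mistakes and abstentions are charged only on clean rounds; adversarial rounds carry no abstention penalty.

```python
import random

def threshold_learner(history, x):
    # Abstain unless a past example forces the label under h_t(x) = 1[x >= t].
    pos = [p for p, y in history if y == 1]
    neg = [p for p, y in history if y == 0]
    if pos and x >= min(pos):
        return 1
    if neg and x <= max(neg):
        return 0
    return None

def combined_error(learner, target, T, inject_prob=0.2, seed=0):
    rng = random.Random(seed)
    history, mistakes, abstentions = [], 0, 0
    for _ in range(T):
        clean = rng.random() >= inject_prob
        # Toy adversary: concentrates injected points near the boundary.
        x = rng.random() if clean else 0.5 + 0.001 * rng.random()
        y = target(x)                  # clean-label setting: labels match target
        pred = learner(history, x)
        if clean and pred is None:
            abstentions += 1           # abstention penalty on clean rounds only
        elif clean and pred != y:
            mistakes += 1              # mistake penalty on clean rounds only
        history.append((x, y))
    return mistakes + abstentions

print(combined_error(threshold_learner, lambda x: int(x >= 0.5), T=1000))
```

Since witnessed predictions are always correct under clean labels, the combined error here comes entirely from early abstentions before the uncertainty gap shrinks.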

Results & Findings

| Setting | Hypothesis class | Combinatorial dimension | Bound on combined error |
| --- | --- | --- | --- |
| General VC‑d (with distribution oracle) | VC dimension d | d | O(d² log T) |
| Distribution‑agnostic | VC dimension 1 | 1 | Ω(√T) lower bound (tight) |
| Inference dimension k | Any class with inference dimension k | k | Õ(T^{1−1/k}) |
| Certificate dimension c | Classes with certificate dimension c (e.g., 2‑D half‑spaces, c = 3) | c | Õ(T^{1−1/c}), i.e. Õ(T^{2/3}) for 2‑D half‑spaces |

Key takeaways

  • No distribution‑agnostic algorithm can beat the √T barrier for even the simplest non‑trivial hypothesis class.
  • By exploiting structural properties captured by the inference or certificate dimension, one can obtain nontrivial sublinear (o(T)) combined‑error rates for richer classes.
  • The new certificate dimension bridges a gap left by earlier work that showed half‑spaces are not robustly learnable without abstention.
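A quick numeric check of the rates in the table (constants and log factors dropped) shows how they compare across horizons: all are sublinear in T, and the certificate‑dimension rate T^{2/3} for 2‑D half‑spaces sits above the √T floor but well below linear growth.

```python
import math

# Compare the growth of the bounds from the table, ignoring constants
# and log factors: the distribution-aware rate (log T), the agnostic
# floor (sqrt(T)), and the certificate-dimension rate for c = 3 (T^(2/3)).
for T in (10**4, 10**6, 10**8):
    oracle = math.log(T)     # distribution-aware: O(d^2 log T), d fixed
    floor = T ** 0.5         # distribution-agnostic floor: Omega(sqrt(T))
    cert = T ** (2 / 3)      # certificate dimension c = 3: T^(1 - 1/3)
    print(f"T={T:>9}  log T={oracle:6.1f}  sqrt(T)={floor:10.0f}  T^(2/3)={cert:12.0f}")
```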

Practical Implications

  • Robust online services: Systems that must make real‑time decisions (spam filters, intrusion detection, recommendation engines) can now incorporate a principled abstention strategy that guarantees bounded error even when an attacker injects poisoned data.
  • Model‑agnostic safety nets: The framework does not rely on knowing the data distribution, making it suitable for edge devices or federated learning scenarios where distributional assumptions are unrealistic.
  • Algorithmic templates: The robust‑witness potential method can be wrapped around existing online learners (e.g., perceptron, decision trees) to add an abstention layer with provable guarantees.
  • Geometric learning: For applications involving low‑dimensional linear classifiers (computer vision bounding‑box filters, 2‑D sensor fusion), the certificate‑dimension result gives a concrete performance target—developers can aim for Õ(T^{2/3}) combined error for a class that previously had no distribution‑agnostic guarantee.
  • Security‑by‑design: The lower bound tells security engineers that, without distribution knowledge, any system must accept a √T “cost of safety”. This informs threat models and budget allocations for monitoring or human‑in‑the‑loop verification.
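The "abstention layer" idea from the bullets above can be sketched as a toy wrapper around an online perceptron (our construction, not the paper's robust‑witness method): the learner abstains whenever the score |w·x| falls below a confidence margin, trading abstentions for fewer mistakes.

```python
# Toy abstention wrapper (illustrative, not the paper's method): an online
# perceptron that abstains when its score is inside a confidence margin.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

class AbstainingPerceptron:
    def __init__(self, dim, margin=0.5):
        self.w = [0.0] * dim
        self.margin = margin

    def predict(self, x):
        s = dot(self.w, x)
        if abs(s) < self.margin:
            return None                 # low confidence: abstain
        return 1 if s > 0 else -1

    def update(self, x, y):
        if y * dot(self.w, x) <= 0:     # standard perceptron mistake update
            self.w = [wi + y * xi for wi, xi in zip(self.w, x)]

clf = AbstainingPerceptron(dim=2)
print(clf.predict([1.0, 1.0]))          # None: no data seen yet, w = 0
clf.update([1.0, 1.0], 1)
print(clf.predict([2.0, 2.0]))          # 1 once the score clears the margin
```

Unlike the witness‑based guarantees in the paper, a fixed margin gives no provable bound; it only illustrates where an abstention layer would plug into an existing online learner.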

Limitations & Future Work

  • Tightness only for VC = 1: The Ω(√T) lower bound is proved for the simplest case; extending tight lower bounds to higher VC dimensions remains open.
  • Computational overhead: Maintaining robust witnesses and updating the potential can be expensive for large‑scale streams; practical implementations will need efficient data structures or approximations.
  • Certificate dimension characterization: While the paper shows half‑spaces in ℝ² have certificate dimension 3, the dimension for higher‑dimensional or more complex hypothesis classes is not yet known.
  • Adversarial budget: The model assumes the learner does not know which rounds are adversarial but does not bound the total number of injections; exploring trade‑offs when the adversary’s power is limited could yield stronger guarantees.
  • Extension to multi‑class and regression: The current analysis focuses on binary classification with clean labels; adapting the framework to multi‑label or continuous outputs is a natural next step.

Authors

  • Ezra Edelman
  • Surbhi Goel

Paper Information

  • arXiv ID: 2602.20111v1
  • Categories: cs.LG
  • Published: February 23, 2026