[Paper] ContextFocus: Activation Steering for Contextual Faithfulness in Large Language Models

Published: January 7, 2026 at 12:45 PM EST
3 min read
Source: arXiv - 2601.04131v1

Overview

Large Language Models (LLMs) excel at recalling facts they learned during pre‑training, but this strength can become a liability when the model is asked to answer using up‑to‑date external evidence that contradicts its internal knowledge. The paper ContextFocus: Activation Steering for Contextual Faithfulness in Large Language Models proposes a lightweight, inference‑only technique that nudges the model’s hidden activations toward the retrieved context, dramatically reducing “hallucinations” without costly fine‑tuning.

Key Contributions

  • Activation‑steering mechanism (ContextFocus) that selectively amplifies neurons aligned with the provided context while suppressing conflicting internal knowledge.
  • Zero‑fine‑tuning solution: the method works as a plug‑in at inference time, requiring only a few extra forward passes and no changes to the model weights.
  • Comprehensive benchmark evaluation on ConFiQA, showing consistent gains over strong baselines such as ContextDPO, COIECD, and prompting‑only approaches.
  • Demonstrated composability: ContextFocus can be stacked with prompt engineering techniques (e.g., chain‑of‑thought, retrieval‑augmented prompting) for additive improvements.
  • Scalability evidence: experiments on models ranging from 7B to 70B parameters confirm that the approach remains effective as model size grows.

Methodology

  1. Context Retrieval – An external knowledge source (e.g., a search engine or vector DB) returns a short passage relevant to the user query.
  2. Activation Mask Generation – The retrieved passage is tokenized and fed through the frozen LLM once to collect hidden‑state activations. A lightweight classifier (trained on a small held‑out set) predicts which neurons are “context‑relevant” (see the sketch after this list).
  3. Steering at Inference – When the actual query is processed, the model’s hidden states are element‑wise multiplied by the previously computed mask, boosting context‑aligned neurons and damping those that would otherwise surface outdated memorized facts.
  4. Decoding – Standard decoding (e.g., nucleus sampling) proceeds on the steered activations, producing the final answer.
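
To make the pipeline concrete, here is a minimal PyTorch sketch of steps 1–2. It is an illustration under stated assumptions, not the paper’s implementation: the retriever is abstracted away, random tensors stand in for the frozen LLM’s activations, and `MaskPredictor` is a hypothetical tiny linear head (the summary does not describe the classifier’s actual architecture).

```python
# Sketch of steps 1-2. Assumptions: activations are stand-in random tensors,
# and the mask predictor is a hypothetical linear head over pooled activations.
import torch
import torch.nn as nn

HIDDEN_DIM = 4096  # e.g., the LLaMA-7B hidden size


class MaskPredictor(nn.Module):
    """Tiny learned component: maps pooled passage activations to per-unit gains."""

    def __init__(self, hidden_dim: int) -> None:
        super().__init__()
        self.head = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, passage_hidden: torch.Tensor) -> torch.Tensor:
        # passage_hidden: (seq_len, hidden_dim) hidden states collected from
        # one forward pass of the frozen LLM over the retrieved passage.
        pooled = passage_hidden.mean(dim=0)
        # Gains in (0, 2): values > 1 boost "context-relevant" units,
        # values < 1 damp units tied to memorized knowledge.
        return 2.0 * torch.sigmoid(self.head(pooled))


# Stand-in for the single frozen forward pass over the retrieved passage.
passage_hidden = torch.randn(128, HIDDEN_DIM)
context_mask = MaskPredictor(HIDDEN_DIM)(passage_hidden)  # shape: (HIDDEN_DIM,)
```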

All steps are performed on the fly; the only learned component is the tiny mask predictor, which can be trained once and reused across tasks. A sketch of the steering and decoding steps follows.
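
Continuing the sketch, steps 3–4 can be illustrated with a PyTorch forward hook that rescales a layer’s hidden states by the precomputed mask before standard decoding runs unchanged. Which layers are steered and the interpolation strength `alpha` are assumptions, not details from the paper.

```python
# Sketch of steps 3-4: a forward hook element-wise rescales hidden states by
# the precomputed mask; nucleus sampling then proceeds on the steered states.
import torch


def make_steering_hook(context_mask: torch.Tensor, alpha: float = 1.0):
    """Build a hook that rescales a layer's hidden states by the mask."""

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        # Interpolate toward the masked activations; alpha = 0 disables steering.
        steered = hidden * (1.0 + alpha * (context_mask - 1.0))
        if isinstance(output, tuple):
            return (steered,) + output[1:]
        return steered

    return hook


# Hypothetical usage with a HuggingFace-style decoder (layer choice assumed):
# handles = [layer.register_forward_hook(make_steering_hook(context_mask))
#            for layer in model.model.layers]
# output_ids = model.generate(input_ids, do_sample=True, top_p=0.9)  # nucleus sampling
# for handle in handles:
#     handle.remove()
```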

Results & Findings

Contextual faithfulness on ConFiQA (higher is better):

| Model | Baseline (no steering) | ContextFocus | Δ Faithfulness ↑ | Δ Fluency ↓ |
| --- | --- | --- | --- | --- |
| LLaMA‑7B | 62.3 % | 78.9 % | +16.6 pp | –0.3 pp |
| LLaMA‑13B | 66.1 % | 81.4 % | +15.3 pp | –0.2 pp |
| LLaMA‑70B | 71.8 % | 86.2 % | +14.4 pp | –0.1 pp |
  • Contextual faithfulness (the proportion of answers that correctly cite the retrieved passage) improves by 14–17 percentage points across model sizes.
  • Fluency (measured by perplexity and human rating) remains essentially unchanged, confirming that steering does not degrade language quality.
  • When combined with a chain‑of‑thought prompt, ContextFocus yields an extra ~3 pp boost, indicating complementary effects (a toy example of this stacking follows the list).
  • Inference overhead is ~5 % extra compute and <10 ms latency on a single A100 GPU, far cheaper than full fine‑tuning (which can add hours of training time).
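
As a toy illustration of this composability (hypothetical strings and calls, not the paper’s setup): steering acts on activations while chain‑of‑thought acts on the prompt text, so the two can be stacked without interference.

```python
# Hypothetical stacking of ContextFocus with a chain-of-thought prompt.
retrieved_passage = "In 2024, the committee moved its headquarters to Oslo."
question = "Where is the committee headquartered?"
cot_prompt = (
    f"Context: {retrieved_passage}\n"
    f"Question: {question}\n"
    "Let's think step by step, answering only from the context."
)
# With the steering hooks from the methodology sketch attached, generation
# proceeds on the steered activations exactly as it would without them:
# inputs = tokenizer(cot_prompt, return_tensors="pt")
# answer_ids = model.generate(**inputs)
```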

Practical Implications

  • Retrieval‑augmented applications (e.g., chat assistants, code‑search bots, fact‑checking tools) can integrate ContextFocus as a drop‑in module to make answers more trustworthy without re‑training the underlying LLM.
  • Rapid product iteration: Teams can experiment with new knowledge bases (news feeds, internal docs) and instantly see reduced hallucinations, accelerating time‑to‑market.
  • Cost‑effective compliance: Industries with strict factual accuracy requirements (finance, healthcare, legal) can meet regulatory standards while keeping inference budgets low.
  • Edge deployment: Because the method only adds a lightweight mask and a single extra forward pass, it fits on inference‑optimized hardware (e.g., NVIDIA Jetson, AWS Inferentia) where full model fine‑tuning is infeasible.

Limitations & Future Work

  • The mask‑predictor is trained on a modest validation set; its generalization to highly domain‑specific vocabularies (e.g., biomedical jargon) may need further fine‑tuning.
  • ContextFocus assumes the retrieved passage is itself accurate; if the external source is noisy, the steering could amplify incorrect information.
  • The current implementation works best with relatively short contexts (≤ 256 tokens); scaling to longer documents may require hierarchical masking strategies.
  • Future research directions include learning dynamic masks conditioned on query difficulty, extending the approach to multimodal LLMs, and jointly training the mask predictor with retrieval models for end‑to‑end optimization.

Authors

  • Nikhil Anand
  • Shwetha Somasundaram
  • Anirudh Phukan
  • Apoorv Saxena
  • Koyel Mukherjee

Paper Information

  • arXiv ID: 2601.04131v1
  • Categories: cs.CL, cs.AI, cs.LG
  • Published: January 7, 2026