[Paper] Mechanism-Guided Selective Unlearning for RLVR-Induced Reasoning

Published: June 18, 2026, 12:59 AM GMT+9


Original: arXiv

Source: arXiv - 2606.19222v1

Overview

We propose MAST (Mechanism-Aligned Selective Targeting), a mechanism-guided method for unlearning RLVR-induced reasoning with substantially lower collateral damage than standard full-parameter updates. In matched SFT/RLVR checkpoints of Qwen2.5-Math-1.5B and Qwen3-1.7B-Base, the SFT-to-RLVR increment differs sharply from the SFT update in token-level delta-log-probability, and full-parameter gradient ascent achieves forgetting only by damaging performance on the MATH retain set and GSM8K. MAST ranks attention-projection tensors by three criteria (off-principal energy, update magnitude, and forget-gradient coupling magnitude) and then updates only the top-ranked subset. On the primary model, MAST induces statistically significant target forgetting (MATH forget: from 45/150 to 37/150; McNemar p=0.0078) while preserving GSM8K (+0.8 pp) and MATH retain (-0.5 pp). The advantage reproduces across seeds, under the NPO and SimNPO objectives, and on Qwen3, where MAST preserves GSM8K while full-parameter unlearning collapses it.
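The reported McNemar p-value can be sanity-checked with a little arithmetic. The sketch below computes a two-sided exact McNemar test from discordant pair counts. The split used here (8 problems flipping from solved to unsolved, 0 flipping back) is an assumption: the summary only gives the net change from 45/150 to 37/150, not the discordant counts, and it is assumed that these figures count correctly solved forget-set problems. The function name `mcnemar_exact` is illustrative, not taken from the paper.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: items correct before and incorrect after.
    c: items incorrect before and correct after.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a change
    k = min(b, c)
    # Lower tail P(X <= k) for X ~ Binomial(n, 0.5), doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# ASSUMED split, chosen only because it matches the net drop of 8 problems.
p_value = mcnemar_exact(b=8, c=0)
print(round(p_value, 4))  # 0.0078
```

Under this assumed split the result is 2 × (1/2)^8 = 0.0078125, which agrees with the p=0.0078 quoted above. Other splits with the same net change (for example 9 and 1) would give a larger p-value.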

Key Contributions

This paper presents research in the following areas:

cs.LG (Machine Learning)
cs.AI (Artificial Intelligence)


Methodology

Please refer to the full paper for detailed methodology.
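As a rough illustration of the selection step described in the Overview (rank attention-projection tensors by three criteria, then update only the top-ranked subset), here is a minimal NumPy sketch. Everything specific in it is an assumption rather than the paper's definition: off-principal energy is taken to be the share of the update's squared norm outside the top-k left singular subspace of the base weight, coupling is taken to be the absolute inner product between the forget gradient and the update, the three criteria are combined by averaging ranks, and the tensor names and toy data are hypothetical.

```python
import numpy as np


def off_principal_energy(w_base, delta, k=4):
    # ASSUMED definition: fraction of the update's squared Frobenius norm
    # that lies outside the top-k left singular subspace of the base weight.
    u, _, _ = np.linalg.svd(w_base, full_matrices=False)
    u_k = u[:, :k]
    residual = delta - u_k @ (u_k.T @ delta)
    total = np.sum(delta ** 2)
    return float(np.sum(residual ** 2) / total) if total > 0 else 0.0


def score_tensor(w_base, w_rlvr, forget_grad, k=4):
    delta = w_rlvr - w_base  # SFT-to-RLVR increment for this tensor
    return (
        off_principal_energy(w_base, delta, k),
        float(np.linalg.norm(delta)),             # update magnitude
        float(abs(np.sum(forget_grad * delta))),  # ASSUMED coupling measure
    )


def select_top(tensors, top_n, k=4):
    names = list(tensors)
    scores = np.array([score_tensor(*tensors[n], k=k) for n in names])
    # Rank each criterion separately (0 = lowest), then average the ranks.
    # Rank averaging is an assumption; the summary does not say how the
    # three criteria are combined.
    ranks = scores.argsort(axis=0).argsort(axis=0)
    order = np.argsort(-ranks.mean(axis=1), kind="stable")
    return [names[i] for i in order[:top_n]]


rng = np.random.default_rng(0)


def toy(scale):
    # Hypothetical (base weight, RLVR weight, forget gradient) triple.
    w = rng.normal(size=(8, 8))
    return w, w + scale * rng.normal(size=(8, 8)), rng.normal(size=(8, 8))


tensors = {
    "layers.0.q_proj": toy(0.0),  # unchanged by RLVR, so every score is zero
    "layers.0.k_proj": toy(0.1),
    "layers.0.v_proj": toy(1.0),
}
selected = select_top(tensors, top_n=2)
print(selected)
```

In a real training loop, the unlearning objective would then be applied only to the selected tensors, for example by freezing every other parameter. The paper should be consulted for the actual scoring definitions and selection threshold.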

Practical Implications

This research contributes to the advancement of machine learning (cs.LG).

Authors

Chenyu Zhou
Qiliang Jiang
Shuning Wu
Xu Zhou

Paper Information

arXiv ID: 2606.19222v1
Categories: cs.LG, cs.AI
Published: June 17, 2026

