[Paper] CNT: Safety-oriented Function Reuse across LLMs via Cross-Model Neuron Transfer
Source: arXiv - 2603.18449v1
Overview
The widespread deployment of large language models (LLMs) calls for post-hoc methods that can flexibly adapt models to evolving safety requirements. Meanwhile, the rapidly expanding open-source LLM ecosystem has produced a diverse collection of models that already exhibit various safety-related functionalities. This motivates a shift from constructing safety functionality from scratch to reusing existing functionality from external models, thereby avoiding costly data collection and training procedures. In this paper, we present Cross-Model Neuron Transfer (CNT), a post-hoc method that reuses safety-oriented functionality by transferring a minimal subset of neurons from an open-source donor LLM to a target LLM. By operating at the neuron level, CNT enables modular function-level adaptation, supporting both function addition and function deletion. We evaluate CNT on seven popular LLMs across three representative applications: safety disalignment, alignment enhancement, and bias removal. Experimental results show that CNT achieves targeted safety-oriented functionality transfer with minimal performance degradation (less than 1% for most models), consistently outperforming five baselines, demonstrating its generality and practical effectiveness.
Key Contributions
- Cross-Model Neuron Transfer (CNT), a post-hoc method that reuses safety-oriented functionality by transferring a minimal subset of neurons from an open-source donor LLM to a target LLM, avoiding costly data collection and training.
- Modular, function-level adaptation at the neuron level, supporting both function addition and function deletion.
- Evaluation on seven popular LLMs across three representative applications (safety disalignment, alignment enhancement, and bias removal), where CNT outperforms five baselines with less than 1% performance degradation for most models.
Methodology
CNT identifies a minimal subset of neurons in an open-source donor LLM that carry a target safety-related functionality and transplants them into a target LLM, with no additional data collection or training. Because the method operates at the neuron level, the same mechanism supports both adding a functionality from the donor and deleting one from the target. Please refer to the full paper for the neuron-identification procedure and other methodological details; a hedged sketch of the transfer step follows.
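The sketch below illustrates the general neuron-transfer idea on a LLaMA-style gated MLP block. It is a minimal illustration under stated assumptions, not the paper's implementation: the module names (gate_proj, up_proj, down_proj), the GatedMLP stand-in class, and the choice of neuron_ids are all assumptions for illustration, and the paper's actual neuron-selection criterion is not reproduced here.

```python
# Minimal sketch of cross-model neuron transfer on one MLP block.
# Assumptions (not from the paper): LLaMA-style gated MLP with
# gate_proj/up_proj/down_proj, matching donor/target shapes, and
# externally supplied neuron indices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMLP(nn.Module):
    """Stand-in for one transformer MLP block (LLaMA-style)."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.gate_proj = nn.Linear(d_model, d_hidden, bias=False)
        self.up_proj = nn.Linear(d_model, d_hidden, bias=False)
        self.down_proj = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))

@torch.no_grad()
def add_function(donor: GatedMLP, target: GatedMLP, neuron_ids: torch.Tensor) -> None:
    """Function addition: copy the selected hidden neurons' incoming rows
    and outgoing columns from the donor layer into the target layer."""
    for name in ("gate_proj", "up_proj"):
        getattr(target, name).weight[neuron_ids] = getattr(donor, name).weight[neuron_ids]
    target.down_proj.weight[:, neuron_ids] = donor.down_proj.weight[:, neuron_ids]

@torch.no_grad()
def delete_function(target: GatedMLP, neuron_ids: torch.Tensor) -> None:
    """Function deletion: zero the selected neurons' output columns so
    they no longer contribute to the layer's residual update."""
    target.down_proj.weight[:, neuron_ids] = 0.0

# Usage: transfer a handful of hypothetical "safety neurons".
donor, target = GatedMLP(16, 64), GatedMLP(16, 64)
safety_neurons = torch.tensor([3, 17, 42])  # placeholder indices
add_function(donor, target, safety_neurons)
```

Note that only a small, targeted slice of the weights changes, which is consistent with the paper's report of minimal performance degradation on unrelated capabilities.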
Practical Implications
CNT lets practitioners adapt deployed LLMs to evolving safety requirements post hoc, reusing functionality already present in the open-source ecosystem instead of collecting data and retraining. The same mechanism also highlights a dual-use risk: the safety disalignment application shows that alignment can be removed by transferring a small set of neurons, which has implications for the security of open-source model releases.
Authors
- Yue Zhao
- Yujia Gong
- Ruigang Liang
- Shenchen Zhu
- Kai Chen
- Xuejing Yuan
- Wangjun Zhang
Paper Information
- arXiv ID: 2603.18449v1
- Categories: cs.CR, cs.SE
- Published: March 19, 2026