[Paper] LoRA-based Parameter-Efficient LLMs for Continuous Learning in Edge-based Malware Detection
Source: arXiv - 2602.11655v1
Overview
This paper tackles a pressing problem: how to keep malware detectors on edge devices (e.g., IoT gateways, smartphones) up‑to‑date without exhausting their limited CPU, memory, and bandwidth budgets. The authors propose a continuous‑learning pipeline that couples compact transformer models with LoRA (Low‑Rank Adaptation) adapters, allowing each device to learn locally from its own traffic while sharing only a small model update (well under 2 MB) with a central coordinator. The result is a system that stays accurate against evolving threats and can transfer knowledge across heterogeneous devices.
Key Contributions
- Edge‑friendly architecture that merges local incremental fine‑tuning with global knowledge aggregation via LoRA adapters.
- Parameter‑efficient updates: LoRA adds < 1 % extra parameters (≈ 0.6–1.8 MB) to models such as DistilBERT, DistilGPT‑2, and TinyT5, making OTA updates feasible on constrained hardware.
- Cross‑domain knowledge sharing without transmitting raw traffic data, preserving privacy and reducing bandwidth.
- Empirical validation on two real‑world IoT security datasets (Edge‑IIoTset, TON‑IoT) showing accuracy gains of 20–25 percentage points over isolated fine‑tuning when facing unseen attacks.
- Stability across learning rounds: loss and F1 scores remain steady despite continuous model drift, demonstrating robustness of the LoRA‑based aggregation.
Methodology
- Base Model Selection – The authors start with lightweight transformer variants (DistilBERT, DistilGPT‑2, TinyT5) that already fit within edge memory limits.
- Local Adaptation – Each edge node receives a stream of network traffic, labels it (e.g., via a lightweight IDS or manual annotation), and fine‑tunes the base model only on the LoRA adapters (low‑rank matrices inserted into attention/feed‑forward layers). The original weights stay frozen, so training is fast and memory‑light.
- Adapter Extraction & Aggregation – After a local training epoch, the device uploads its LoRA parameters (under a megabyte) to a central coordinator. The coordinator averages or otherwise merges these adapters (similar to federated averaging) to produce a global LoRA module.
- Redistribution – The global LoRA is broadcast back to all devices, which simply replace their local adapters with the new one, instantly gaining knowledge learned elsewhere.
- Iterative Rounds – The process repeats over multiple rounds, simulating the arrival of new malware families and traffic patterns. No raw packets or model weights are ever exchanged, keeping privacy intact.
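The local-adapter-plus-aggregation loop above can be sketched numerically. This is a minimal NumPy simulation, not the authors' implementation: the hidden size, rank, and "training" (random perturbations standing in for gradient steps) are illustrative assumptions, and averaging A and B element-wise is a FedAvg-style heuristic rather than something the paper specifies in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 768, 8                      # hidden size, LoRA rank (illustrative values)
W = rng.normal(size=(d, d))        # frozen base weight: never updated, never transmitted

def init_adapter():
    # Standard LoRA init: A random, B zero, so B @ A starts as a no-op.
    return {"A": rng.normal(scale=0.01, size=(r, d)), "B": np.zeros((d, r))}

def effective_weight(W, adapter, alpha=16):
    # Forward pass uses W + (alpha / r) * B @ A; only A and B are trainable.
    return W + (alpha / r) * adapter["B"] @ adapter["A"]

def aggregate(adapters):
    # Coordinator merges per-device adapters by element-wise averaging
    # (note: the mean of A and B separately is an approximation, not the
    # mean of the products B @ A).
    return {k: np.mean([a[k] for a in adapters], axis=0) for k in ("A", "B")}

# Three devices fine-tune locally (simulated by small random perturbations)...
local = []
for _ in range(3):
    a = init_adapter()
    a["A"] += rng.normal(scale=0.01, size=(r, d))   # stand-in for gradient steps
    a["B"] += rng.normal(scale=0.01, size=(d, r))
    local.append(a)

# ...then upload only A and B, never W or raw traffic.
global_adapter = aggregate(local)
print(global_adapter["A"].shape, global_adapter["B"].shape)  # (8, 768) (768, 8)
```

In the redistribution step, each device would simply overwrite its local `A` and `B` with `global_adapter` and continue training from there.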
Results & Findings
| Metric | Isolated Fine‑Tuning | LoRA‑Shared (multi‑round) |
|---|---|---|
| Accuracy (unseen attacks) | ~68 % | 88–93 % (+20–25 pp) |
| F1‑Score (overall) | 0.71 | 0.89 |
| Model size increase | – | < 1 % (0.6–1.8 MB) |
| Communication per round | N/A (no sharing) | ~0.8 MB per device |
- Cross‑domain boost: When a device encounters a malware family that only appeared on another device’s dataset, the LoRA‑shared model correctly classifies it far more often than a locally‑only model.
- Stable training dynamics: Across 5–6 continuous learning rounds, loss curves do not diverge, indicating that the aggregated adapters do not introduce catastrophic forgetting.
- Resource feasibility: On a Raspberry Pi 4 (2 GB RAM), inference latency stays under 150 ms per packet batch, and a full LoRA update download completes in < 2 seconds over a typical 1 Mbps link.
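The "< 1 % / 0.6–1.8 MB" figures in the table can be sanity-checked with back-of-envelope arithmetic. The configuration below is an assumption, not from the paper: a DistilBERT-like base (6 layers, hidden size 768, ~66 M parameters), LoRA rank 16 on the four attention projections (q/k/v/o), fp16 storage.

```python
# Back-of-envelope check of the reported <1 % / 0.6-1.8 MB adapter footprint.
layers, d, r, targets = 6, 768, 16, 4          # assumed config (see lead-in)
base_params = 66_000_000                       # approx. DistilBERT size

lora_params = layers * targets * r * (d + d)   # each target adds A (r x d) + B (d x r)
size_mb = lora_params * 2 / 1e6                # 2 bytes per fp16 parameter

print(lora_params)                             # 589824
print(round(size_mb, 2))                       # 1.18 -> inside the 0.6-1.8 MB range
print(lora_params / base_params)               # ~0.009 -> under 1 % overhead
```

Lower ranks or fewer target modules land at the bottom of the reported range; fp32 storage or added feed-forward targets push toward the top.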
Practical Implications
- Deployable IDS on constrained hardware – Developers can embed a tiny transformer + LoRA stack into existing edge agents (e.g., OpenWrt, Azure IoT Edge) and keep it current without full model re‑downloads.
- Privacy‑preserving threat intelligence – Organizations can benefit from collective learning across devices (similar to federated learning) while never exposing raw network logs, easing compliance with GDPR or HIPAA.
- Rapid response to zero‑day malware – As soon as one node flags a novel pattern, its LoRA update propagates, giving the whole fleet an immediate defensive boost.
- Cost‑effective OTA updates – Since well under a megabyte is transferred per device per round, OTA pipelines already used for firmware can handle security updates with negligible extra bandwidth.
- Framework‑agnostic integration – LoRA adapters are compatible with Hugging Face Transformers, making it straightforward to plug into existing Python‑based security pipelines or convert to ONNX/TFLite for C/C++ edge runtimes.
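For C/C++ edge runtimes that have no notion of adapter modules, one common option (not spelled out in the paper, so treat this as an assumption) is to fold the trained adapter into the base weights before ONNX/TFLite conversion: W′ = W + (α/r)·B·A is an ordinary dense matrix, so the exported graph needs no LoRA-aware operators. A NumPy sketch with toy values:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, alpha = 768, 8, 16

W = rng.normal(size=(d, d))           # frozen base projection
A = rng.normal(size=(r, d)) * 0.01    # trained LoRA factors (toy values)
B = rng.normal(size=(d, r)) * 0.01

def merge_lora(W, A, B, alpha, r):
    # Fold the low-rank update into the dense weight. The merged matrix is
    # mathematically identical to running base + adapter side-by-side.
    return W + (alpha / r) * B @ A

W_merged = merge_lora(W, A, B, alpha, r)

x = rng.normal(size=(d,))
# Same output either way, up to float rounding:
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

Hugging Face PEFT exposes the same operation via `merge_and_unload()`. The trade-off: once merged, the adapter can no longer be hot-swapped, so devices that keep learning should retain an unmerged copy.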
Limitations & Future Work
- Dataset scope – Experiments are limited to two IoT datasets; real‑world deployments may encounter more diverse protocols and higher‑dimensional feature spaces.
- Security of the aggregation channel – The paper assumes a trusted coordinator; future work should explore authenticated, encrypted aggregation and robustness against poisoned LoRA updates.
- Model heterogeneity – All devices share the same base transformer; extending the approach to heterogeneous model families (e.g., CNN‑based IDS) remains an open challenge.
- Long‑term drift – While short‑term stability is shown, the impact of months‑long continuous learning on model bias and false‑positive rates needs further study.
Bottom line: By marrying tiny transformers with LoRA's parameter‑efficient adapters, the authors deliver a practical, privacy‑aware, and bandwidth‑light solution for continuous malware detection on the edge, an approach that developers can start experimenting with today.
Authors
- Christian Rondanini
- Barbara Carminati
- Elena Ferrari
- Niccolò Lardo
- Ashish Kundu
Paper Information
- arXiv ID: 2602.11655v1
- Categories: cs.CR, cs.AI, cs.DC
- Published: February 12, 2026