[Paper] LoRA-based Parameter-Efficient LLMs for Continuous Learning in Edge-based Malware Detection
Source: arXiv - 2602.11655v1
Overview
This paper tackles a pressing problem: how to keep malware detectors on edge devices (e.g., IoT gateways, smartphones) up‑to‑date without exhausting their limited CPU, memory, and bandwidth budgets. The authors propose a continuous‑learning pipeline that couples compact transformer models with LoRA (Low‑Rank Adaptation) adapters, allowing each device to learn locally from its own traffic while sharing only a small model update (well under 2 MB) with a central coordinator. The result is a system that stays accurate against evolving threats and can transfer knowledge across heterogeneous devices.
Key Contributions
- Edge‑friendly architecture that merges local incremental fine‑tuning with global knowledge aggregation via LoRA adapters.
- Parameter‑efficient updates: LoRA adds < 1 % extra parameters (≈ 0.6–1.8 MB) to models such as DistilBERT, DistilGPT‑2, and TinyT5, making OTA updates feasible on constrained hardware.
- Cross‑domain knowledge sharing without transmitting raw traffic data, preserving privacy and reducing bandwidth.
- Empirical validation on two real‑world IoT security datasets (Edge‑IIoTset, TON‑IoT) showing accuracy gains of 20–25 percentage points over isolated fine‑tuning when facing unseen attacks.
- Stability across learning rounds: loss and F1 scores remain steady despite continuous model drift, demonstrating robustness of the LoRA‑based aggregation.
Methodology
- Base Model Selection – The authors start with lightweight transformer variants (DistilBERT, DistilGPT‑2, TinyT5) that already fit within edge memory limits.
- Local Adaptation – Each edge node receives a stream of network traffic, labels it (e.g., via a lightweight IDS or manual annotation), and fine‑tunes the base model only on the LoRA adapters (low‑rank matrices inserted into attention/feed‑forward layers). The original weights stay frozen, so training is fast and memory‑light.
- Adapter Extraction & Aggregation – After a local training epoch, the device uploads its LoRA parameters (under a megabyte) to a central coordinator. The coordinator averages or otherwise merges these adapters (similar to federated averaging) to produce a global LoRA module.
- Redistribution – The global LoRA is broadcast back to all devices, which simply replace their local adapters with the new one, instantly gaining knowledge learned elsewhere.
- Iterative Rounds – The process repeats over multiple rounds, simulating the arrival of new malware families and traffic patterns. No raw packets or model weights are ever exchanged, keeping privacy intact.
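The local-adapter-plus-aggregation loop above can be sketched numerically. This is a minimal NumPy simulation, not the authors' implementation: the hidden size, rank, and "training" (random perturbations standing in for gradient steps) are illustrative assumptions, and averaging A and B element-wise is a FedAvg-style heuristic rather than something the paper specifies in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 768, 8                      # hidden size, LoRA rank (illustrative values)
W = rng.normal(size=(d, d))        # frozen base weight: never updated, never transmitted

def init_adapter():
    # Standard LoRA init: A random, B zero, so B @ A starts as a no-op.
    return {"A": rng.normal(scale=0.01, size=(r, d)), "B": np.zeros((d, r))}

def effective_weight(W, adapter, alpha=16):
    # Forward pass uses W + (alpha / r) * B @ A; only A and B are trainable.
    return W + (alpha / r) * adapter["B"] @ adapter["A"]

def aggregate(adapters):
    # Coordinator merges per-device adapters by element-wise averaging
    # (note: the mean of A and B separately is an approximation, not the
    # mean of the products B @ A).
    return {k: np.mean([a[k] for a in adapters], axis=0) for k in ("A", "B")}

# Three devices fine-tune locally (simulated by small random perturbations)...
local = []
for _ in range(3):
    a = init_adapter()
    a["A"] += rng.normal(scale=0.01, size=(r, d))   # stand-in for gradient steps
    a["B"] += rng.normal(scale=0.01, size=(d, r))
    local.append(a)

# ...then upload only A and B, never W or raw traffic.
global_adapter = aggregate(local)
print(global_adapter["A"].shape, global_adapter["B"].shape)  # (8, 768) (768, 8)
```

In the redistribution step, each device would simply overwrite its local `A` and `B` with `global_adapter` and continue training from there.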
Results & Findings
| Metric | Isolated Fine‑Tuning | LoRA‑Shared (multi‑round) |
|---|---|---|
| Accuracy (unseen attacks) | ~68 % | 88–93 % (+20–25 pp) |
| F1‑Score (overall) | 0.71 | 0.89 |
| Model size increase | – | < 1 % (0.6–1.8 MB) |
| Communication per round | N/A (no sharing) | ~0.8 MB per device |
- Cross‑domain boost: When a device encounters a malware family that only appeared on another device’s dataset, the LoRA‑shared model correctly classifies it far more often than a locally‑only model.
- Stable training dynamics: Across 5–6 continuous learning rounds, loss curves do not diverge, indicating that the aggregated adapters do not introduce catastrophic forgetting.
- Resource feasibility: On a Raspberry Pi 4 (2 GB RAM), inference latency stays under 150 ms per packet batch, and a full LoRA update download completes in < 2 seconds over a typical 1 Mbps link.
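The "< 1 % / 0.6–1.8 MB" figures in the table can be sanity-checked with back-of-envelope arithmetic. The configuration below is an assumption, not from the paper: a DistilBERT-like base (6 layers, hidden size 768, ~66 M parameters), LoRA rank 16 on the four attention projections (q/k/v/o), fp16 storage.

```python
# Back-of-envelope check of the reported <1 % / 0.6-1.8 MB adapter footprint.
layers, d, r, targets = 6, 768, 16, 4          # assumed config (see lead-in)
base_params = 66_000_000                       # approx. DistilBERT size

lora_params = layers * targets * r * (d + d)   # each target adds A (r x d) + B (d x r)
size_mb = lora_params * 2 / 1e6                # 2 bytes per fp16 parameter

print(lora_params)                             # 589824
print(round(size_mb, 2))                       # 1.18 -> inside the 0.6-1.8 MB range
print(lora_params / base_params)               # ~0.009 -> under 1 % overhead
```

Lower ranks or fewer target modules land at the bottom of the reported range; fp32 storage or added feed-forward targets push toward the top.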
Practical Implications
- Deployable IDS on constrained hardware – Developers can embed a tiny transformer + LoRA stack into existing edge agents (e.g., OpenWrt, Azure IoT Edge) and keep it current without full model re‑downloads.
- Privacy‑preserving threat intelligence – Organizations can benefit from collective learning across devices (similar to federated learning) while never exposing raw network logs, easing compliance with GDPR or HIPAA.
- Rapid response to zero‑day malware – As soon as one node flags a novel pattern, its LoRA update propagates, giving the whole fleet an immediate defensive boost.
- Cost‑effective OTA updates – Since well under a megabyte is transferred per device per round, OTA pipelines already used for firmware can handle security updates with negligible extra bandwidth.
- Framework‑agnostic integration – LoRA adapters are compatible with Hugging Face Transformers, making it straightforward to plug into existing Python‑based security pipelines or convert to ONNX/TFLite for C/C++ edge runtimes.
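For C/C++ edge runtimes that have no notion of adapter modules, one common option (not spelled out in the paper, so treat this as an assumption) is to fold the trained adapter into the base weights before ONNX/TFLite conversion: W′ = W + (α/r)·B·A is an ordinary dense matrix, so the exported graph needs no LoRA-aware operators. A NumPy sketch with toy values:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, alpha = 768, 8, 16

W = rng.normal(size=(d, d))           # frozen base projection
A = rng.normal(size=(r, d)) * 0.01    # trained LoRA factors (toy values)
B = rng.normal(size=(d, r)) * 0.01

def merge_lora(W, A, B, alpha, r):
    # Fold the low-rank update into the dense weight. The merged matrix is
    # mathematically identical to running base + adapter side-by-side.
    return W + (alpha / r) * B @ A

W_merged = merge_lora(W, A, B, alpha, r)

x = rng.normal(size=(d,))
# Same output either way, up to float rounding:
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

Hugging Face PEFT exposes the same operation via `merge_and_unload()`. The trade-off: once merged, the adapter can no longer be hot-swapped, so devices that keep learning should retain an unmerged copy.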
Limitations & Future Work
- Dataset scope – Experiments are limited to two IoT datasets; real‑world deployments may encounter more diverse protocols and higher‑dimensional feature spaces.
- Security of the aggregation channel – The paper assumes a trusted coordinator; future work should explore authenticated, encrypted aggregation and robustness against poisoned LoRA updates.
- Model heterogeneity – All devices share the same base transformer; extending the approach to heterogeneous model families (e.g., CNN‑based IDS) remains an open challenge.
- Long‑term drift – While short‑term stability is shown, the impact of months‑long continuous learning on model bias and false‑positive rates needs further study.
Bottom line: By marrying tiny transformers with LoRA's parameter‑efficient adapters, the authors deliver a practical, privacy‑aware, and bandwidth‑light solution for continuous malware detection on the edge, an approach that developers can start experimenting with today.
Authors
- Christian Rondanini
- Barbara Carminati
- Elena Ferrari
- Niccolò Lardo
- Ashish Kundu
Paper Information
- arXiv ID: 2602.11655v1
- Categories: cs.CR, cs.AI, cs.DC
- Published: February 12, 2026