[Paper] Ranking-Enhanced Anomaly Detection Using Active Learning-Assisted Attention Adversarial Dual AutoEncoders
Source: arXiv - 2511.20480v1
Overview
This paper tackles one of the toughest problems in cybersecurity: spotting Advanced Persistent Threats (APTs) that hide in massive streams of system‑level logs. Because labeled attack data are extremely scarce, the authors combine unsupervised auto‑encoders with an active‑learning loop that asks a human (or oracle) to label only the most ambiguous samples. The result is a “ranking‑enhanced” detector that quickly learns to flag the rare APT events while keeping labeling effort to a minimum.
Key Contributions
- Dual AutoEncoder architecture with attention and adversarial training that learns richer representations of provenance traces.
- Active‑learning‑assisted ranking: the model scores unlabeled samples by uncertainty, queries the oracle for the top‑N most ambiguous ones, and re‑trains iteratively.
- Comprehensive evaluation on DARPA Transparent Computing provenance datasets covering Android, Linux, BSD, and Windows, where APT‑like attacks make up only 0.004 % of the data.
- Empirical evidence of superior detection rates compared with state‑of‑the‑art unsupervised and semi‑supervised anomaly detectors.
- A practical workflow that can be plugged into existing security operation centers (SOCs) to reduce manual labeling overhead.
Methodology
- Data Representation – Raw system calls and file‑access events are transformed into provenance graphs (nodes = processes/files, edges = interactions). These graphs are flattened into sequences and fed to the auto‑encoders (a minimal construction sketch appears after this list).
- Dual AutoEncoder – Two parallel auto‑encoders (one for reconstruction, one for adversarial generation) share an attention module that highlights the most informative parts of the input sequence. The reconstruction error serves as an initial anomaly score (see the PyTorch sketch after this list).
- Active Learning Loop (a query‑selection sketch follows this list):
  - Uncertainty Ranking: for each unlabeled trace, the model computes a confidence margin (the gap between the top two predicted class probabilities) and a reconstruction‑error rank.
  - Query Selection: the top‑N most uncertain traces are sent to a human analyst (the “oracle”) for labeling.
  - Model Update: labeled samples are added to the training set; the dual auto‑encoders are fine‑tuned, and the attention weights are re‑calibrated.
  - The cycle repeats until a stopping criterion is met (e.g., the labeling budget is exhausted or performance plateaus).
- Evaluation Metrics – Precision, recall, F1‑score, and the area under the precision–recall curve (AUPR) are reported, focusing on the minority APT class.
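To make the data‑representation step concrete, the following is a minimal sketch of building a provenance graph from raw events and flattening it into a token sequence. The event schema, field names, and the timestamp‑ordered flattening are illustrative assumptions; the paper does not prescribe this exact encoding.

```python
# Minimal sketch (assumed event schema): build a provenance graph from raw
# events, then flatten it into a timestamp-ordered token sequence for the
# auto-encoders. Nodes are processes/files; edges are interactions.
import networkx as nx

events = [  # (timestamp, source, operation, target) -- hypothetical fields
    (1, "bash", "exec", "curl"),
    (2, "curl", "write", "/tmp/payload"),
    (3, "bash", "exec", "/tmp/payload"),
]

g = nx.MultiDiGraph()
for ts, src, op, dst in events:
    g.add_edge(src, dst, op=op, ts=ts)

# One common flattening choice: emit edges in timestamp order as tokens.
sequence = [f"{u}:{d['op']}:{v}"
            for u, v, d in sorted(g.edges(data=True), key=lambda e: e[2]["ts"])]
print(sequence)
# ['bash:exec:curl', 'curl:write:/tmp/payload', 'bash:exec:/tmp/payload']
```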
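The dual‑auto‑encoder bullet above is high level; here is one plausible reading as a minimal PyTorch sketch: two auto‑encoder branches sharing a single attention module, with mean squared reconstruction error as the per‑trace anomaly score. The layer sizes, the single attention head, and the omission of the adversarial discriminator are assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): two auto-encoder branches share an
# attention module over flattened provenance sequences; reconstruction error
# of the primary branch is the initial anomaly score.
import torch
import torch.nn as nn

class SharedAttention(nn.Module):
    """Single-head self-attention shared by both branches (assumption)."""
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, x):                    # x: (batch, seq_len, dim)
        out, weights = self.attn(x, x, x)    # weights support heat-map views
        return out, weights

class DualAutoEncoder(nn.Module):
    def __init__(self, dim=64, hidden=32):
        super().__init__()
        self.attention = SharedAttention(dim)
        self.enc_r = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())  # reconstruction branch
        self.dec_r = nn.Linear(hidden, dim)
        self.enc_g = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())  # adversarial branch
        self.dec_g = nn.Linear(hidden, dim)  # its discriminator is omitted here

    def forward(self, x):
        h, attn_weights = self.attention(x)
        return self.dec_r(self.enc_r(h)), self.dec_g(self.enc_g(h)), attn_weights

def anomaly_score(model, x):
    """Mean squared reconstruction error per trace (the initial score)."""
    with torch.no_grad():
        recon, _, _ = model(x)
    return ((x - recon) ** 2).mean(dim=(1, 2))  # one score per sequence

# Usage: score a batch of 8 flattened traces (length 20, 64-dim embeddings).
scores = anomaly_score(DualAutoEncoder(), torch.randn(8, 20, 64))
```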
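The ranking‑and‑query cycle itself is model‑agnostic. Below is a hedged NumPy sketch of one iteration; combining the two uncertainty signals by rank averaging and cutting off at the top‑N are simplifying assumptions, not the paper's exact rule.

```python
# Minimal sketch (assumed combination rule): rank unlabeled traces by
# averaging two ranks -- large reconstruction error and small confidence
# margin -- then query the oracle for the top-N most ambiguous traces.
import numpy as np

def uncertainty_ranking(recon_errors, class_probs):
    """recon_errors: (n,) scores; class_probs: (n, k) predicted probabilities."""
    top2 = np.sort(class_probs, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]                   # small margin = ambiguous
    err_rank = np.argsort(np.argsort(-recon_errors))   # rank 0 = largest error
    margin_rank = np.argsort(np.argsort(margin))       # rank 0 = smallest margin
    return (err_rank + margin_rank) / 2.0              # lower = more uncertain

def select_queries(recon_errors, class_probs, n_queries=10):
    """Indices of the top-N most uncertain traces to send to the oracle."""
    return np.argsort(uncertainty_ranking(recon_errors, class_probs))[:n_queries]

# One iteration of the loop on synthetic scores:
rng = np.random.default_rng(0)
errors = rng.random(1000)                     # reconstruction errors
probs = rng.dirichlet(np.ones(2), size=1000)  # stand-in class probabilities
to_label = select_queries(errors, probs, n_queries=10)
# labels = oracle(to_label)   # a human analyst labels these traces, then the
# dual auto-encoders are fine-tuned on the enlarged labeled set and the cycle
# repeats until the labeling budget runs out or performance plateaus.
```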
Results & Findings
| Dataset (OS) | Baseline (plain AE) | Proposed Dual AE + AL | Relative Gain |
|---|---|---|---|
| Android | Recall 0.31, AUPR 0.12 | Recall 0.58, AUPR 0.27 | +87 % recall |
| Linux | Recall 0.28, AUPR 0.10 | Recall 0.55, AUPR 0.24 | +96 % recall |
| BSD | Recall 0.33, AUPR 0.13 | Recall 0.60, AUPR 0.29 | +82 % recall |
| Windows | Recall 0.30, AUPR 0.11 | Recall 0.57, AUPR 0.26 | +90 % recall |
- Active learning reduces labeling cost: only ~1 % of the total traces needed to be manually labeled to achieve >50 % recall.
- Attention improves interpretability: heat‑maps over the provenance graph highlight the exact system calls that contributed most to the anomaly score, aiding analyst triage.
- Robustness across OSes: The same hyper‑parameters worked for all four operating systems, demonstrating the method’s generality.
Practical Implications
- SOC Integration – The framework can sit on top of existing log‑ingestion pipelines (e.g., Elastic Stack, Splunk) and continuously propose “high‑uncertainty” alerts for analyst review, sharply cutting the time analysts spend triaging false positives.
- Label‑Efficient Threat Hunting – Teams can bootstrap an APT detection model with just a handful of verified incidents, then let the active‑learning loop expand coverage automatically.
- Cross‑Platform Security – Because the model operates on provenance graphs rather than OS‑specific signatures, it can be deployed in heterogeneous environments (cloud VMs, containers, mobile devices) without retraining from scratch.
- Explainable AI for Audits – The attention heat‑maps provide a visual audit trail that helps satisfy compliance requirements (e.g., GDPR, NIST guidelines) when justifying why a particular activity was flagged.
Limitations & Future Work
- Oracle Dependency – The approach assumes a reliable human analyst to provide correct labels; noisy or delayed feedback could degrade performance.
- Scalability of Graph Construction – Building provenance graphs for high‑throughput environments may become a bottleneck; the authors suggest incremental graph updates as a next step.
- Adversarial Robustness – While an adversarial auto‑encoder is used for representation learning, the paper does not evaluate resistance against deliberately crafted evasion attacks.
- Future Directions – Extending the method to streaming data (online learning), incorporating threat‑intel feeds for richer context, and exploring self‑supervised pre‑training on massive unlabeled logs.
Authors
- Sidahmed Benabderrahmane
- James Cheney
- Talal Rahwan
Paper Information
- arXiv ID: 2511.20480v1
- Categories: cs.LG, cs.AI, cs.CR, cs.NE
- Published: November 25, 2025