[Paper] MicLog: Towards Accurate and Efficient LLM-based Log Parsing via Progressive Meta In-Context Learning
Source: arXiv - 2601.07005v1
Overview
Log parsing—turning raw, semi‑structured log lines into clean, structured templates—is a prerequisite for any downstream log analytics, anomaly detection, or observability pipeline. The new MicLog framework shows how to harness small, open‑source large language models (LLMs) with a progressive meta‑in‑context learning strategy to dramatically boost parsing accuracy while slashing the time and cost of LLM queries.
Key Contributions
- ProgMeta‑ICL paradigm: Introduces a zero‑shot‑to‑k‑shot progressive meta‑in‑context learning loop that teaches a tiny LLM (Qwen‑2.5‑3B) to improve its own few‑shot performance over time.
- Smart demonstration selection: Combines weighted DBSCAN clustering for candidate sampling with an enhanced BM25 ranking to pick the most informative examples for each new log line.
- Multi‑level pre‑query cache: Stores recently parsed templates and re‑uses them across logs, cutting redundant LLM calls and reducing latency.
- Open‑source LLM focus: Demonstrates that a 3‑billion‑parameter model can outperform larger proprietary LLMs when equipped with the right meta‑learning and caching tricks.
- Empirical gains: On the Loghub‑2.0 benchmark, MicLog lifts parsing accuracy by 10.3 % over the previous best method and speeds up processing by 42.4 %.
Methodology
Progressive Meta‑Learning Loop
- Start with zero‑shot prompts (no examples).
- After the model parses a batch of logs, the system extracts the successful templates and treats them as new “demonstrations.”
- In the next iteration, the model receives k‑shot prompts that include these freshly mined examples, gradually enriching its context.
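A minimal sketch of this loop is below, assuming a simple prompt format and a bare "non-empty output" success check; the paper's actual prompt wording, validation criteria, and batch handling are not specified in this summary. The `select_demos` hook is sketched in the BM25 step further down.

```python
from typing import Callable, List, Tuple

Demo = Tuple[str, str]  # (raw log line, parsed template)

def build_prompt(log_line: str, demos: List[Demo]) -> str:
    """Zero-shot when `demos` is empty; k-shot otherwise."""
    parts = ["Abstract the variable parts of the log line into <*> "
             "and return the resulting template."]
    for demo_log, demo_tpl in demos:
        parts.append(f"Log: {demo_log}\nTemplate: {demo_tpl}")
    parts.append(f"Log: {log_line}\nTemplate:")
    return "\n".join(parts)

def progressive_parse(batches: List[List[str]],
                      llm: Callable[[str], str],
                      select_demos: Callable[[str, List[Demo], int], List[Demo]],
                      k: int = 3) -> List[Demo]:
    """Zero-shot on the first batch, then k-shot with mined demonstrations."""
    demo_pool: List[Demo] = []
    for batch in batches:
        for line in batch:
            demos = select_demos(line, demo_pool, k)    # empty pool -> zero-shot
            template = llm(build_prompt(line, demos)).strip()
            if template:                                # crude success check
                demo_pool.append((line, template))      # mine as a new demo
    return demo_pool
```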
Weighted DBSCAN Candidate Sampling
- Log lines are embedded (e.g., using a lightweight sentence encoder).
- DBSCAN clusters similar lines; a weighting scheme favors dense, high‑confidence clusters when picking candidates for the prompt.
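The clustering step might look like the following sketch using sentence-transformers and scikit-learn. The encoder choice, the `eps` value, and the size-proportional weighting are illustrative assumptions; the paper's exact weighting scheme is not detailed in this summary.

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import DBSCAN

def sample_candidates(lines: list[str], n_candidates: int = 32,
                      eps: float = 0.3, seed: int = 0) -> list[str]:
    """Embed, cluster, then sample with a bias toward dense clusters."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed lightweight encoder
    embeddings = encoder.encode(lines, normalize_embeddings=True)
    labels = DBSCAN(eps=eps, min_samples=2, metric="cosine").fit_predict(embeddings)

    # Weight each line by its cluster size; DBSCAN noise points (label -1)
    # get a baseline weight of 1 so they can still be drawn occasionally.
    sizes = {label: int((labels == label).sum()) for label in set(labels)}
    weights = np.array([sizes[l] if l != -1 else 1 for l in labels], dtype=float)
    weights /= weights.sum()

    rng = np.random.default_rng(seed)
    picks = rng.choice(len(lines), size=min(n_candidates, len(lines)),
                       replace=False, p=weights)
    return [lines[i] for i in picks]
```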
Enhanced BM25 Demonstration Selection
- Within each cluster, a BM25‑style relevance score ranks candidate examples against the target log line, ensuring the most semantically aligned demonstrations are used.
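A plain BM25 ranker (here via the rank_bm25 package) conveys the idea; the paper's "enhanced" variant presumably modifies scoring or tokenization in ways this summary does not specify, and the whitespace tokenizer below is a naive stand-in.

```python
from rank_bm25 import BM25Okapi

def select_demos(target_line: str, demo_pool: list[tuple[str, str]],
                 k: int = 3) -> list[tuple[str, str]]:
    """Return the k pooled demonstrations most relevant to the target line."""
    if not demo_pool:
        return []                                   # zero-shot fallback
    corpus = [log.split() for log, _ in demo_pool]  # naive whitespace tokens
    scores = BM25Okapi(corpus).get_scores(target_line.split())
    ranked = sorted(range(len(demo_pool)), key=scores.__getitem__, reverse=True)
    return [demo_pool[i] for i in ranked[:k]]
```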
Multi‑Level Pre‑Query Cache
- Level 1: Exact‑match cache for previously seen log lines.
- Level 2: Template‑match cache that stores parsed templates; a new line that matches an existing template bypasses the LLM entirely.
- Level 3: Fallback to the full ProgMeta‑ICL pipeline when no cache hit occurs.
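A minimal two-level cache with an LLM fallback could be structured as below. The `<*>` placeholder convention and regex-based template matching are common in log parsing tools but are assumptions here, not confirmed details of MicLog's implementation.

```python
import re
from typing import Optional

class PreQueryCache:
    """Levels 1-2 of the pre-query cache; Level 3 is the caller's LLM path."""

    def __init__(self) -> None:
        self.exact: dict[str, str] = {}                   # Level 1: line -> template
        self.patterns: list[tuple[re.Pattern, str]] = []  # Level 2: template regexes

    def lookup(self, line: str) -> Optional[str]:
        if line in self.exact:                    # Level 1: exact line match
            return self.exact[line]
        for pattern, template in self.patterns:   # Level 2: template match
            if pattern.fullmatch(line):
                return template
        return None  # Level 3: caller falls back to the ProgMeta-ICL pipeline

    def insert(self, line: str, template: str) -> None:
        self.exact[line] = template
        # Compile "<*>" placeholders into non-greedy wildcards (assumed
        # placeholder convention, common in log parsing tools).
        regex = re.escape(template).replace(re.escape("<*>"), ".+?")
        self.patterns.append((re.compile(regex), template))
```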
LLM Backend
- The entire pipeline runs on Qwen‑2.5‑3B, a publicly available 3‑billion‑parameter model, keeping inference costs low while still benefiting from the meta‑learning enhancements.
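Loading the backbone with Hugging Face transformers might look like the sketch below; the exact checkpoint (base vs. instruct) and decoding settings are assumptions, since this summary does not specify them. The resulting `llm` callable plugs directly into the `progressive_parse` sketch above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint assumed from the paper's description; the instruct variant
# and greedy decoding are illustrative choices, not confirmed settings.
MODEL_ID = "Qwen/Qwen2.5-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto")

def llm(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy decoding keeps template extraction deterministic."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            do_sample=False)
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```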
Results & Findings
| Metric | MicLog | Prior SOTA (LLM‑based) |
|---|---|---|
| Parsing Accuracy (Loghub‑2.0) | 91.2 % | 81.0 % |
| Avg. Parsing Time per 1 k logs | 0.68 s | 1.18 s |
| LLM API Calls (per 1 k logs) | ≈ 120 | ≈ 210 |
- Accuracy boost stems from the model’s ability to adapt its prompt with freshly mined, domain‑specific examples, effectively “learning on the fly.”
- The speedup is largely due to the cache layers; over 60 % of log lines hit the Level 1 or Level 2 cache, avoiding LLM inference entirely.
- Even with a modest 3 B‑parameter model, MicLog outperforms larger proprietary LLMs that rely on static few‑shot prompts, highlighting the power of progressive meta‑learning.
Practical Implications
- Cost‑Effective Observability: Teams can deploy high‑accuracy log parsers on commodity hardware without paying for expensive API calls to GPT‑4‑style services.
- Rapid Adaptation to Log Drift: As services evolve and log formats change, MicLog automatically incorporates new patterns into its demonstration pool, reducing the need for manual parser updates.
- Plug‑and‑Play Integration: The cache‑first design fits naturally into existing log pipelines (e.g., Fluent Bit → MicLog → Elasticsearch) with minimal latency overhead.
- Open‑Source Friendly: Since the backbone is an open‑source LLM, organizations can audit, fine‑tune, or extend the model to meet compliance or security requirements.
- Generalizable Framework: The ProgMeta‑ICL recipe can be repurposed for other semi‑structured data extraction tasks such as config file parsing, network packet classification, or even code comment generation.
Limitations & Future Work
- Domain Coverage: While MicLog adapts quickly, its initial zero‑shot performance still depends on the base LLM’s pre‑training data; extremely niche log vocabularies may need a brief warm‑up phase.
- Cache Management Overhead: The multi‑level cache introduces stateful components that must be persisted and evicted intelligently in long‑running services.
- Scalability to Massive Log Volumes: Experiments were run on Loghub‑2.0 (≈ 10 M lines). Scaling to petabyte‑scale streams may require distributed caching and sharding strategies.
- Meta‑Learning Extensions: Future work could explore reinforcement‑learning‑based reward signals for demonstration selection or incorporate lightweight fine‑tuning on the fly to further close the gap with larger LLMs.
MicLog demonstrates that with clever prompting, meta‑learning, and caching, even modest LLMs can become production‑grade log parsers—opening the door for more affordable, adaptable observability stacks.
Authors
- Jianbo Yu
- Yixuan Li
- Hai Xu
- Kang Xu
- Junjielong Xu
- Zhijing Li
- Pinjia He
- Wanyuan Wang
Paper Information
- arXiv ID: 2601.07005v1
- Categories: cs.SE, cs.AI
- Published: January 11, 2026