[Paper] SMART SLM: Structured Memory and Reasoning Transformer, A Small Language Model for Accurate Document Assistance

Published: December 24, 2025 at 11:59 AM EST
3 min read

Source: arXiv - 2512.21280v1

Overview

The SMART SLM (Structured Memory and Reasoning Transformer) tackles a common pain point for engineers: extracting accurate, numeric information from massive, densely formatted engineering manuals. By turning the raw text into a hierarchy of structured facts and pairing it with a lightweight memory‑augmented transformer, SMART delivers higher accuracy than larger models like GPT‑2 while using far fewer parameters.

Key Contributions

  • Hierarchical fact extraction via a syntax‑aware Tree‑LSTM (“Grammarian”) that converts sentences into subject‑relation‑object triples.
  • Compact indexed memory (a 384‑dimensional vector store) that links each fact to its source location, enabling fast look‑ups (see the data‑structure sketch after this list).
  • Six‑layer transformer decoder that fuses retrieved facts to generate context‑aware answers.
  • Dual‑mode inference:
    1. Fast‑path for known, pre‑indexed manuals (sub‑second latency).
    2. Dynamic‑path for newly uploaded documents using a RAG‑style FAISS top‑20 retrieval with a 64‑slot memory buffer.
  • Parameter efficiency: 45.5 M parameters (≈ 64 % fewer than GPT‑2) with a 21.3 % boost in accuracy on engineering‑manual QA tasks.
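
To make the memory layout concrete, here is a minimal sketch of what a single entry in the fact store could look like. The paper specifies the SRO triple, the 384‑dimensional embedding, and the source link; the class and field names below are illustrative assumptions, not the authors' code.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class IndexedFact:
    """One entry in the structured memory: an SRO triple, its 384-d
    embedding, and a pointer back to the source manual (hypothetical
    layout; the paper specifies the contents, not this exact class)."""
    subject: str            # e.g. "Pump"
    relation: str           # e.g. "operates-at"
    obj: str                # e.g. "150 psi"
    embedding: np.ndarray   # 384-d vector, matching the paper's store
    page: int               # source page, kept for traceability
    section: str            # source section, kept for traceability

    def as_text(self) -> str:
        """Flatten the triple for embedding or display."""
        return f"{self.subject} {self.relation} {self.obj}"


# Toy entry; a real pipeline would fill `embedding` from a 384-d
# sentence encoder (e.g. a MiniLM-style model).
fact = IndexedFact("Pump", "operates-at", "150 psi",
                   np.zeros(384, dtype="float32"), page=42, section="4.2")
print(fact.as_text())  # -> Pump operates-at 150 psi
```

Keeping the page/section reference alongside the vector is what makes the answers auditable later on.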

Methodology

  1. Fact Extraction (Grammarian)

    • Each sentence from an engineering manual is parsed by a Tree‑LSTM that respects the grammatical tree.
    • The model outputs subject‑relation‑object (SRO) triples, e.g., (Pump, operates‑at, 150 psi).
  2. Structured Memory Indexing

    • Every SRO triple is embedded into a 384‑dimensional vector.
    • Vectors are stored in a Memory‑Augmented Neural Network (MANN) that also records the original page/section reference.
  3. Retrieval & Fusion

    • At query time, the user’s question is encoded and used to retrieve the most relevant fact vectors (FAISS nearest‑neighbor search).
    • Retrieved vectors are fed into a 6‑layer transformer that attends over them and the query, producing a concise, fact‑grounded answer.
  4. Inference Paths

    • Fast‑path: For manuals already indexed, the system bypasses the heavy retrieval step and directly fetches the pre‑computed fact vectors.
    • Dynamic‑path: For new documents, a lightweight RAG‑style pipeline builds a temporary index on‑the‑fly (max 64 slots) and then proceeds as above (both extraction and retrieval are sketched in code below).
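
Step 1 is the part SMART delegates to the Tree‑LSTM. Purely to show the shape of its output, here is a naive pattern‑matching stand‑in; the real Grammarian parses the full grammatical tree, which this sketch does not attempt.

```python
import re

# Naive stand-in for the Grammarian: the paper uses a syntax-aware
# Tree-LSTM, while this regex only handles "<subject> <verb> at <value>"
# spec sentences, to illustrate the subject-relation-object output shape.
TRIPLE_RE = re.compile(
    r"^(?P<subj>[A-Z][\w\s]*?)\s+(?P<rel>operates|runs|is rated)\s+at\s+(?P<obj>.+)$"
)


def naive_triple(sentence: str):
    """Return (subject, relation, object) for simple spec sentences."""
    m = TRIPLE_RE.match(sentence.strip().rstrip("."))
    if m is None:
        return None
    return (m["subj"].strip(), f"{m['rel'].replace(' ', '-')}-at", m["obj"].strip())


print(naive_triple("The pump operates at 150 psi."))
# -> ('The pump', 'operates-at', '150 psi')
```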
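
Steps 2–4 amount to a small retrieval routine. Below is a minimal sketch of the two inference paths, assuming FAISS inner‑product search; the function names and the decoder stub are my assumptions, while the 384‑d embeddings, top‑20 retrieval, and 64‑slot dynamic buffer come from the paper.

```python
import faiss
import numpy as np

DIM, TOP_K, DYNAMIC_SLOTS = 384, 20, 64


def build_index(fact_vectors: np.ndarray) -> faiss.IndexFlatIP:
    """Normalize and index fact embeddings (cosine via inner product)."""
    vectors = np.ascontiguousarray(fact_vectors, dtype="float32")
    faiss.normalize_L2(vectors)
    index = faiss.IndexFlatIP(DIM)
    index.add(vectors)
    return index


def retrieve(index: faiss.IndexFlatIP, query_vec: np.ndarray, k: int = TOP_K):
    """Return indices of the k facts most similar to the query."""
    q = np.ascontiguousarray(query_vec.reshape(1, DIM), dtype="float32")
    faiss.normalize_L2(q)
    _, idx = index.search(q, k)
    return idx[0]


def answer(query_vec, fact_vectors, prebuilt_index=None):
    if prebuilt_index is not None:
        # Fast-path: the manual was indexed ahead of time, so we
        # search the persistent store directly (sub-second).
        index = prebuilt_index
    else:
        # Dynamic-path: build a temporary index for a new document,
        # capped at the 64-slot memory buffer.
        index = build_index(fact_vectors[:DYNAMIC_SLOTS])
    hits = retrieve(index, query_vec)
    # In the real system the retrieved fact vectors are fused with the
    # query by the 6-layer transformer decoder; stubbed out here.
    return hits


# Toy usage with random embeddings standing in for real fact vectors.
facts = np.random.rand(200, DIM).astype("float32")
query = np.random.rand(DIM).astype("float32")
print(answer(query, facts))  # indices of top-20 facts from the 64-slot buffer
```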

Results & Findings

Model          Parameters   QA Accuracy (Engineering Manuals)   Avg. Latency
BERT (base)    133 M        68.1 %                              1.8 s
GPT‑2          124 M        71.4 %                              2.1 s
SMART SLM      45.5 M       86.7 %                              0.9 s (fast‑path)
  • Accuracy gain: SMART outperforms GPT‑2 by 21.3 % relative (86.7 % vs. 71.4 %) while using roughly a third of the parameters.
  • Hallucination reduction: Structured fact grounding cuts spurious numeric answers by ~40 % compared to baseline transformers.
  • Scalability: Adding new manuals incurs only a brief indexing cost (≈ 2 seconds) before the fast‑path becomes available.

Practical Implications

  • Engineering support tools: Integrate SMART into maintenance portals, allowing technicians to query manuals instantly for specs, tolerances, or step‑by‑step procedures.
  • Compliance & safety: Because answers are traceable to source sections, auditors can verify that the model’s output matches documented standards.
  • Edge deployment: The modest 45 M‑parameter footprint fits on modern GPUs or even high‑end CPUs, enabling on‑premise installations where data privacy is critical.
  • Reduced development cost: Companies can replace larger, more expensive LLM APIs with a self‑hosted SMART instance, cutting both inference spend and latency.

Limitations & Future Work

  • Domain specificity: SMART is tuned for engineering manuals; performance on other technical domains (e.g., medical guidelines) remains untested.
  • Memory size bound: The dynamic path caps the memory at 64 slots, which may truncate information for extremely large new documents.
  • Fact extraction errors: The Tree‑LSTM parser can mis‑identify relations in poorly formatted PDFs, leading to downstream inaccuracies.
  • Future directions suggested by the authors include: expanding the memory to a hierarchical, multi‑level index, adapting the Grammarian to multimodal inputs (tables, diagrams), and evaluating cross‑domain transfer with minimal re‑training.

Authors

  • Divij Dudeja
  • Mayukha Pal

Paper Information

  • arXiv ID: 2512.21280v1
  • Categories: cs.CL, cs.AI
  • Published: December 24, 2025
