[Paper] FaultXformer: A Transformer-Encoder Based Fault Classification and Location Identification model in PMU-Integrated Active Electrical Distribution System

Published: February 27, 2026 at 01:28 PM EST

Source: arXiv - 2602.24254v1

Overview

The paper introduces FaultXformer, a Transformer‑encoder model that classifies fault types and identifies their locations in active distribution grids equipped with Phasor Measurement Units (PMUs). Using high‑resolution current waveforms, the approach reaches 98.76 % fault‑type accuracy and 98.92 % fault‑location accuracy, even when the network hosts many Distributed Energy Resources (DERs), a scenario that challenges conventional fault‑analysis tools.

Key Contributions

Dual‑stage Transformer pipeline that first learns rich temporal representations from raw PMU current streams, then separately predicts fault type and fault location.
Real‑time fault analysis using only four strategically placed PMUs, reducing sensor deployment costs.
Comprehensive evaluation on a realistic IEEE‑13 node feeder with 20 fault locations and multiple DER penetration levels, employing stratified 10‑fold cross‑validation.
Performance gains over strong baselines (CNN, RNN, LSTM): up to 34.95 percentage points in fault‑type accuracy and 40.89 percentage points in location accuracy, both measured against the RNN baseline.
Open‑source‑ready architecture that can be plugged into existing distribution‑management systems (DMS) with minimal code changes.
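The stratified 10‑fold cross‑validation named in the contributions above can be illustrated with a minimal sketch. It is not the authors' code: the function name `stratified_folds`, the toy labels, and the round‑robin assignment are assumptions chosen only to show how each fold can keep roughly the same class mix.

```python
from collections import defaultdict
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so each fold keeps a similar class mix.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the indices round-robin so every fold gets a near-equal share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy data (assumed): three fault types with 20 simulated events each.
labels = ["LG"] * 20 + ["LL"] * 20 + ["LLL"] * 20
folds = stratified_folds(labels, k=10)

for fold_id, fold in enumerate(folds):
    counts = {name: sum(labels[i] == name for i in fold) for name in ("LG", "LL", "LLL")}
    print(fold_id, counts)
```

In the paper's setting the stratification covers both fault type and fault location; one way to reuse this sketch would be to stratify on a combined label such as a (type, location) pair.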

Methodology

Data acquisition – Simulated the IEEE‑13 node test feeder in DIgSILENT/PSCAD, injecting faults at 20 distinct nodes while varying DER outputs (solar, storage, etc.). Four PMUs recorded three‑phase currents at 60 Hz, producing time‑series windows of 0.5 s per event.
Stage 1 – Temporal feature extraction – A standard Transformer encoder (multi‑head self‑attention + feed‑forward layers) ingests the raw current vectors. Self‑attention lets the model weigh critical instants (e.g., the instant of current surge) across the whole window, capturing both short‑ and long‑range dependencies without recurrent loops.
Stage 2 – Dual heads – The shared latent representation is fed into two lightweight classification heads:
- Fault‑type head (e.g., line‑to‑ground, line‑to‑line, three‑phase)
- Fault‑location head (20 possible node IDs)
  Both heads use a softmax output and are trained jointly with a weighted cross‑entropy loss, encouraging the encoder to learn features useful for both tasks.
Training & validation – Stratified 10‑fold cross‑validation ensures each fold preserves the distribution of fault types and locations. The model was trained with the Adam optimizer and a cosine‑annealing learning‑rate schedule, with early stopping used to limit over‑fitting.
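The joint training of the two heads described above can be sketched as a weighted sum of two cross‑entropy terms. This is one plausible reading of "weighted cross‑entropy loss", not a confirmed detail of the paper: the weights `w_type` and `w_loc`, the function names, and the example logits are all assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target_index])


def joint_loss(type_logits, type_target, loc_logits, loc_target,
               w_type=0.5, w_loc=0.5):
    """Weighted sum of the fault-type and fault-location losses.

    The weights are hypothetical; the summary does not state their values.
    """
    return (w_type * cross_entropy(type_logits, type_target)
            + w_loc * cross_entropy(loc_logits, loc_target))


# Assumed example: 3 fault types and 20 candidate locations for one event.
type_logits = [2.0, 0.5, -1.0]   # head 1 output, true class is index 0
loc_logits = [0.0] * 20          # head 2 output, true node is index 7
loc_logits[7] = 3.0

loss = joint_loss(type_logits, 0, loc_logits, 7)
print(f"joint loss: {loss:.4f}")
```

Because both terms are backpropagated through the same encoder, the shared representation is pushed to carry information useful for both the fault type and the fault location.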

Results & Findings

Metric	FaultXformer	CNN	RNN	LSTM
Fault‑type accuracy	98.76 %	97.06 %	63.81 %	96.72 %
Fault‑location accuracy	98.92 %	88.10 %	58.03 %	92.65 %
FaultXformer gain over baseline, type (percentage points)	n/a	1.70	34.95	2.04
FaultXformer gain over baseline, location (percentage points)	n/a	10.82	40.89	6.27
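The gain rows in the table are differences between FaultXformer's accuracy and each baseline's accuracy, that is, percentage points rather than relative percentages. The short sketch below recomputes them from the accuracy rows and also shows the relative improvement for comparison; the dictionary layout and the function name `gaps` are illustrative assumptions.

```python
# Accuracy values (in %) copied from the table above.
accuracy = {
    "FaultXformer": {"type": 98.76, "location": 98.92},
    "CNN": {"type": 97.06, "location": 88.10},
    "RNN": {"type": 63.81, "location": 58.03},
    "LSTM": {"type": 96.72, "location": 92.65},
}


def gaps(task):
    """Return {baseline: (percentage-point gap, relative improvement in %)}."""
    ours = accuracy["FaultXformer"][task]
    result = {}
    for name, scores in accuracy.items():
        if name == "FaultXformer":
            continue
        base = scores[task]
        points = round(ours - base, 2)                   # absolute difference
        relative = round(100 * (ours - base) / base, 2)  # relative to baseline
        result[name] = (points, relative)
    return result


for task in ("type", "location"):
    for name, (points, relative) in gaps(task).items():
        print(f"{task:8s} vs {name:4s}: {points:5.2f} points, {relative:5.2f} % relative")
```

The percentage-point column reproduces the table's gain rows, while the relative figures are larger because each baseline's accuracy is below 100 %.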

The Transformer’s self‑attention consistently captured the subtle waveform distortions caused by DER‑induced variability, which recurrent models struggled with.
Accuracy remained high (> 97 %) across all DER penetration scenarios, indicating robustness to changing power‑flow conditions.
The model required ≈ 2 M parameters, far fewer than the deep CNN baseline, translating to lower inference latency on edge‑grade CPUs (≈ 12 ms per event).

Practical Implications

Faster, more reliable protection schemes – Utilities can replace or augment traditional over‑current relays with a software‑based fault detector that reacts within milliseconds, reducing outage durations.
Reduced sensor footprint – Accurate location identification with only four PMUs means existing PMU deployments can be leveraged for fault diagnostics without costly additional hardware.
DER‑aware grid operation – As solar and storage proliferate, operators need tools that stay accurate under fluctuating injections; FaultXformer’s performance under high DER scenarios makes it a strong candidate for modern microgrid controllers.
Edge deployment – The modest model size and inference speed allow embedding the model in substation PLCs or edge gateways, enabling on‑site decision making without relying on cloud latency.
Integration with AI‑driven DMS – The dual‑head output can feed directly into outage management, crew dispatch, and automated reconfiguration modules, streamlining the whole fault‑response workflow.

Limitations & Future Work

Simulation‑only validation – Results are based on a synthetic IEEE‑13 feeder; real‑world field trials are needed to confirm robustness against measurement noise, GPS timing errors, and communication delays.
Fixed PMU placement – The study assumes four PMUs at predetermined nodes; exploring optimal sensor placement or adaptive selection could further improve accuracy or reduce hardware cost.
Scalability to larger networks – The evaluation covers a single IEEE‑13 node feeder; testing on meshed, multi‑feeder systems (e.g., 100+ nodes) is an open challenge, particularly because the cost of standard self‑attention grows quadratically with input length.
Explainability – Although attention maps provide some insight, more interpretable diagnostics (e.g., pinpointing the exact waveform segment causing a decision) would aid operator trust.
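To make the idea of an attention map concrete, here is a minimal sketch of scaled dot‑product self‑attention weights on a toy current window. It is a simplification under stated assumptions: queries and keys are the raw inputs (no learned projections), there is a single head, and the window values are invented for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(x):
    """Return the T x T attention matrix for a sequence of T feature vectors.

    Simplification: Q = K = x, so no learned projection matrices are used.
    """
    scale = math.sqrt(len(x[0]))
    weights = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in x]
        weights.append(softmax(scores))
    return weights


# Invented toy window: three-phase current magnitudes at 5 time steps,
# with a surge on phase A at step 3.
window = [
    [1.0, 1.0, 1.0],
    [1.0, 1.1, 0.9],
    [1.1, 1.0, 1.0],
    [6.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]

attn = attention_weights(window)
steps = range(len(window))

# Average attention each time step receives across all queries.
received = [sum(attn[q][t] for q in steps) / len(window) for t in steps]
most_attended = max(steps, key=received.__getitem__)
print("average attention received per step:", [round(r, 3) for r in received])
print("most attended step:", most_attended)
```

In this toy case the surge step receives the largest average weight, which is the kind of signal an operator could inspect. Note that the matrix holds one weight per pair of time steps, so it has T × T entries for a window of T steps.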

Overall, FaultXformer demonstrates that modern Transformer architectures can bring a new level of precision to fault detection and localization in increasingly complex, DER‑rich distribution grids.

Authors

Kriti Thakur
Alivelu Manga Parimi
Mayukha Pal

Paper Information

arXiv ID: 2602.24254v1
Categories: eess.SY, cs.AI, cs.LG
Published: February 27, 2026
