[Paper] FaultXformer: A Transformer-Encoder Based Fault Classification and Location Identification model in PMU-Integrated Active Electrical Distribution System
Source: arXiv - 2602.24254v1
Overview
The paper introduces FaultXformer, a Transformer‑encoder model that automatically classifies fault types and pinpoints their locations in active distribution grids equipped with Phasor Measurement Units (PMUs). By leveraging high‑resolution current waveforms, the approach reaches 98.76 % fault‑type accuracy and 98.92 % fault‑location accuracy in simulation, and stays above 97 % even when the network hosts many Distributed Energy Resources (DERs), a scenario that traditionally challenges conventional fault‑analysis tools.
Key Contributions
- Dual‑stage Transformer pipeline that first learns rich temporal representations from raw PMU current streams, then separately predicts fault type and fault location.
- Real‑time fault analysis using only four strategically placed PMUs, reducing sensor deployment costs.
- Comprehensive evaluation on a simulated IEEE 13‑node test feeder with 20 fault locations and multiple DER penetration levels, using stratified 10‑fold cross‑validation.
- Accuracy gains over the CNN, RNN, and LSTM baselines: up to 34.95 percentage points in fault‑type accuracy and 40.89 percentage points in location accuracy (both measured against the RNN, the weakest baseline).
- Open‑source‑ready architecture that can be plugged into existing distribution‑management systems (DMS) with minimal code changes.
Methodology
- Data acquisition – The IEEE 13‑node test feeder was simulated in DIgSILENT/PSCAD, with faults injected at 20 distinct nodes while DER outputs (solar, storage, etc.) were varied. Four PMUs recorded three‑phase currents at 60 Hz, producing time‑series windows of 0.5 s per event.
- Stage 1 – Temporal feature extraction – A standard Transformer encoder (multi‑head self‑attention + feed‑forward layers) ingests the raw current vectors. Self‑attention lets the model weigh critical instants (e.g., the instant of current surge) across the whole window, capturing both short‑ and long‑range dependencies without recurrent loops.
- Stage 2 – Dual heads – The shared latent representation is fed into two lightweight classification heads:
  - Fault‑type head (e.g., line‑to‑ground, line‑to‑line, three‑phase)
  - Fault‑location head (20 possible node IDs)

  Both heads use a softmax output and are trained jointly with a weighted cross‑entropy loss, encouraging the encoder to learn features useful for both tasks.
- Training & validation – Stratified 10‑fold cross‑validation ensures each fold preserves the distribution of fault types and locations. Training used the Adam optimizer with a cosine‑annealing learning‑rate schedule, and early stopping was applied to limit over‑fitting.
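The two stages above can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: it uses a single attention head with random weights in place of trained parameters, and it omits positional encoding, feed-forward layers, and layer normalization. The input layout (30 time steps from a 0.5 s window at 60 samples per second, 12 channels from 4 PMUs × 3 phases), the mean pooling step, and the names `D_MODEL`, `NUM_TYPES`, and `alpha` are assumptions made for illustration. "Weighted cross-entropy" is read here as a weighted sum of the two task losses, which is one possible interpretation.

```python
import math
import random

# Assumed input layout: 0.5 s window at 60 samples/s, 4 PMUs x 3 phases.
WINDOW_STEPS = int(0.5 * 60)   # 30 time steps per event
CHANNELS = 4 * 3               # 12 current channels
D_MODEL = 8                    # hypothetical latent width
NUM_TYPES = 3                  # hypothetical number of fault types
NUM_LOCATIONS = 20             # 20 candidate fault nodes (from the text)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def self_attention(window, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention across time steps."""
    Q = [matvec(Wq, x) for x in window]
    K = [matvec(Wk, x) for x in window]
    V = [matvec(Wv, x) for x in window]
    scale = math.sqrt(len(Q[0]))
    outputs, weights = [], []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in K]
        w = softmax(scores)  # how strongly this step attends to every step
        weights.append(w)
        outputs.append(
            [sum(wj * v[d] for wj, v in zip(w, V)) for d in range(len(V[0]))]
        )
    return outputs, weights


def mean_pool(seq):
    """Average the per-step vectors into one latent vector (an assumption)."""
    n = len(seq)
    return [sum(step[d] for step in seq) / n for d in range(len(seq[0]))]


def joint_loss(type_probs, loc_probs, type_label, loc_label, alpha=0.5):
    """Weighted sum of the two cross-entropy terms; alpha is hypothetical."""
    ce_type = -math.log(type_probs[type_label])
    ce_loc = -math.log(loc_probs[loc_label])
    return alpha * ce_type + (1.0 - alpha) * ce_loc


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Synthetic stand-in for one PMU current window (not real data).
window = [[rng.gauss(0.0, 1.0) for _ in range(CHANNELS)]
          for _ in range(WINDOW_STEPS)]
Wq, Wk, Wv = (random_matrix(D_MODEL, CHANNELS, rng) for _ in range(3))
W_type = random_matrix(NUM_TYPES, D_MODEL, rng)
W_loc = random_matrix(NUM_LOCATIONS, D_MODEL, rng)

encoded, attn = self_attention(window, Wq, Wk, Wv)   # Stage 1
latent = mean_pool(encoded)                          # shared representation
type_probs = softmax(matvec(W_type, latent))         # Stage 2: fault-type head
loc_probs = softmax(matvec(W_loc, latent))           # Stage 2: location head
loss = joint_loss(type_probs, loc_probs, type_label=0, loc_label=7)
```

Because both heads read the same `latent` vector, the gradient of `loss` would flow into the shared encoder from both tasks, which is the mechanism the joint training relies on.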
Results & Findings
| Metric | FaultXformer | CNN | RNN | LSTM |
|---|---|---|---|---|
| Fault‑type accuracy | 98.76 % | 97.06 % | 63.81 % | 96.72 % |
| Fault‑location accuracy | 98.92 % | 88.10 % | 58.03 % | 92.65 % |
| FaultXformer gain over baseline, type (percentage points) | n/a | +1.70 | +34.95 | +2.04 |
| FaultXformer gain over baseline, location (percentage points) | n/a | +10.82 | +40.89 | +6.27 |
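The improvement rows can be reproduced from the accuracy rows by subtraction, which shows they are differences in percentage points, not relative (ratio-based) gains. The short check below uses only the numbers from the table; the relative figure is computed alongside for comparison and is not reported in the source.

```python
# Accuracy values (in %) copied from the table above.
accuracy = {
    "fault_type": {"FaultXformer": 98.76, "CNN": 97.06, "RNN": 63.81, "LSTM": 96.72},
    "fault_location": {"FaultXformer": 98.92, "CNN": 88.10, "RNN": 58.03, "LSTM": 92.65},
}


def gains(task):
    """Return {baseline: (percentage-point gain, relative gain in %)}."""
    ours = accuracy[task]["FaultXformer"]
    result = {}
    for name, acc in accuracy[task].items():
        if name == "FaultXformer":
            continue
        point_gain = round(ours - acc, 2)                     # absolute difference
        relative_gain = round(100.0 * (ours - acc) / acc, 2)  # ratio-based gain
        result[name] = (point_gain, relative_gain)
    return result


for task in accuracy:
    for baseline, (points, relative) in gains(task).items():
        print(f"{task:15s} vs {baseline:4s}: +{points:.2f} points, +{relative:.2f} % relative")
```

Against the RNN, for example, the 34.95-point gain in fault-type accuracy corresponds to a relative gain of roughly 55 %, so the two measures should not be used interchangeably.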
- The Transformer’s self‑attention consistently captured the subtle waveform distortions caused by DER‑induced variability, which recurrent models struggled with.
- Accuracy remained high (> 97 %) across all DER penetration scenarios, indicating robustness to changing power‑flow conditions.
- The model required ≈ 2 M parameters, far fewer than the deep CNN baseline, translating to lower inference latency on edge‑grade CPUs (≈ 12 ms per event).
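To put the size and latency figures in context, the back-of-the-envelope arithmetic below converts them into a memory footprint, a throughput, and a fraction of a power-system cycle. The parameter count and latency come from the text; the 4-byte (float32) weight size and the 60 Hz system frequency used for the cycle conversion are assumptions.

```python
PARAMS = 2_000_000        # "≈ 2 M parameters" (from the text)
LATENCY_MS = 12.0         # "≈ 12 ms per event" (from the text)
BYTES_PER_PARAM = 4       # assumption: weights stored as float32
SYSTEM_HZ = 60.0          # assumption: 60 Hz power system

weights_mb = PARAMS * BYTES_PER_PARAM / 1e6          # weight storage in MB
events_per_second = 1000.0 / LATENCY_MS              # sequential throughput
latency_cycles = LATENCY_MS / (1000.0 / SYSTEM_HZ)   # latency in power cycles

print(f"weights: {weights_mb:.1f} MB")
print(f"throughput: {events_per_second:.1f} events/s")
print(f"latency: {latency_cycles:.2f} cycles at {SYSTEM_HZ:.0f} Hz")
```

Under these assumptions the weights occupy about 8 MB and one inference takes less than one 60 Hz cycle. Note that this covers inference only: the 0.5 s measurement window must still be collected before classification can start.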
Practical Implications
- Faster, more reliable protection schemes – Utilities can replace or augment traditional over‑current relays with a software‑based fault detector that reacts within milliseconds, reducing outage durations.
- Reduced sensor footprint – Accurate location identification with only four PMUs means existing PMU deployments can be leveraged for fault diagnostics without costly additional hardware.
- DER‑aware grid operation – As solar and storage proliferate, operators need tools that stay accurate under fluctuating injections; FaultXformer’s performance under high DER scenarios makes it a strong candidate for modern microgrid controllers.
- Edge deployment – The modest model size and inference speed allow embedding the model in substation PLCs or edge gateways, enabling on‑site decision making without relying on cloud latency.
- Integration with AI‑driven DMS – The dual‑head output can feed directly into outage management, crew dispatch, and automated reconfiguration modules, streamlining the whole fault‑response workflow.
Limitations & Future Work
- Simulation‑only validation – Results are based on a simulated IEEE 13‑node feeder; real‑world field trials are needed to confirm robustness against measurement noise, GPS timing errors, and communication delays.
- Fixed PMU placement – The study assumes four PMUs at predetermined nodes; exploring optimal sensor placement or adaptive selection could further improve accuracy or reduce hardware cost.
- Scalability to larger networks – The model has only been evaluated on a 13‑node feeder, and the cost of standard self‑attention grows quadratically with input length, so testing on meshed, multi‑feeder systems (e.g., 100+ nodes) remains an open challenge.
- Explainability – Although attention maps provide some insight, more interpretable diagnostics (e.g., pinpointing the exact waveform segment causing a decision) would aid operator trust.
Overall, FaultXformer demonstrates that modern Transformer architectures can bring a new level of precision to fault detection and localization in increasingly complex, DER‑rich distribution grids.
Authors
- Kriti Thakur
- Alivelu Manga Parimi
- Mayukha Pal
Paper Information
- arXiv ID: 2602.24254v1
- Categories: eess.SY, cs.AI, cs.LG
- Published: February 27, 2026