[Paper] Explainable Verification of Hierarchical Workflows Mined from Event Logs with Shapley Values
Source: arXiv - 2512.09562v1
Overview
The paper introduces a novel pipeline that turns automatically mined hierarchical workflows into formal logical specifications, checks them with automated theorem provers, and then uses Shapley values to explain why a workflow satisfies (or violates) properties such as safety, liveness, or compliance. By combining process mining with game‑theoretic attribution, the authors give developers a concrete way to pinpoint the exact nodes that make a process “good” or “bad”.
Key Contributions
- Formal translation of hierarchical process‑tree models (extracted from event logs) into logical formulas suitable for automated reasoning.
- Property verification (satisfiability, liveness, safety) using off‑the‑shelf theorem provers, enabling systematic compliance checks.
- Shapley‑value attribution adapted to workflow elements, providing a quantitative “explainability” layer that tells you how much each node contributes to a verification outcome.
- Empirical validation on standard benchmark logs showing the method can uncover critical bottlenecks, redundant branches, and harmful patterns that traditional mining tools miss.
- Prototype toolchain that integrates process mining, theorem proving, and game‑theoretic analysis, demonstrating feasibility for real‑world pipelines.
Methodology
- Mining hierarchical workflows – Starting from an event log, a state‑of‑the‑art process‑tree miner (e.g., the Inductive Miner) produces a nested tree where internal nodes represent control‑flow operators (sequence, parallel, choice, loop) and leaves correspond to activities; a minimal mining sketch follows this list.
- Logical encoding – Each operator is mapped to a fragment of temporal/first‑order logic (e.g., “A → ◇B” for a sequence). The whole tree becomes a single logical specification.
- Automated verification – The specification is fed to an SMT‑ or SAT‑based theorem prover (Z3, CVC5). The prover checks whether user‑defined properties (e.g., “every order eventually reaches Ship”) hold; a small encoding‑and‑checking sketch follows this list.
- Shapley‑value computation – Treat the set of workflow nodes as a cooperative game: the “value” of a coalition is 1 if the property is satisfied when only those nodes are kept, 0 otherwise. By enumerating (or approximating) marginal contributions, each node receives a Shapley score that quantifies its impact on the final verification result; a Monte‑Carlo estimation sketch follows this list.
- Interpretation & visualization – Nodes with high positive scores are “protectors” of correctness; high negative scores flag risky or redundant constructs. The authors visualized these scores directly on the process tree.
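As a concrete illustration of the mining step, the sketch below discovers a process tree with the Inductive Miner as implemented in the pm4py library; the library choice and the log file name are assumptions, since the paper names the miner family but not a specific toolkit.

```python
# Minimal sketch: mine a hierarchical process tree from an XES event log.
# pm4py and the file name "orders.xes" are assumptions; the paper only
# specifies an Inductive-Miner-style process-tree discovery step.
import pm4py

log = pm4py.read_xes("orders.xes")                 # load the event log
tree = pm4py.discover_process_tree_inductive(log)  # nested operator/activity tree

# Internal nodes are control-flow operators (sequence ->, exclusive choice X,
# parallel +, loop *) and leaves are activities, e.g.:
#   ->( 'Receive', X( 'Approve', 'Reject' ), 'Ship' )
print(tree)
```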
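To make the encoding and verification steps concrete, the sketch below hand‑encodes a three‑activity sequence and checks the liveness‑style property quoted above with Z3's Python API. The propositional‑plus‑timestamp encoding and the activity names are simplifying assumptions; the paper's operator‑by‑operator temporal/first‑order encoding is richer.

```python
# Minimal sketch: encode a sequence ->(Receive, Approve, Ship) and check
# "every order eventually reaches Ship" by asking Z3 for a counterexample.
from z3 import Ints, Bools, Solver, Implies, And, Not, unsat

t_rcv, t_appr, t_ship = Ints("t_rcv t_appr t_ship")
rcv, appr, ship = Bools("rcv appr ship")

# Sequence semantics: each step enables the next, and timestamps respect the order.
spec = And(
    Implies(rcv, appr), Implies(appr, ship),
    Implies(And(rcv, appr), t_rcv < t_appr),
    Implies(And(appr, ship), t_appr < t_ship),
)

# Property: if an order is received, it is eventually shipped.
prop = Implies(rcv, ship)

s = Solver()
s.add(spec, Not(prop))  # unsat iff the specification entails the property
print("property holds" if s.check() == unsat else "property violated")
```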
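For the attribution step, a node's Shapley score is its average marginal contribution to the 0/1 verification outcome over orderings of the nodes. A minimal Monte‑Carlo sketch is shown below; the `verify` characteristic function is a toy stand‑in for the prover call, and both its name and the example property are assumptions.

```python
# Minimal sketch: Monte-Carlo estimation of Shapley values over workflow nodes.
import random

def shapley_estimates(nodes, verify, samples=1000, seed=0):
    """Average marginal contribution of each node over random permutations."""
    rng = random.Random(seed)
    scores = {n: 0.0 for n in nodes}
    for _ in range(samples):
        perm = nodes[:]
        rng.shuffle(perm)
        coalition, prev = set(), verify(set())
        for n in perm:
            coalition.add(n)
            cur = verify(coalition)  # 1 if the property holds on the sub-workflow
            scores[n] += cur - prev  # marginal contribution of n
            prev = cur
    return {n: s / samples for n, s in scores.items()}

# Toy characteristic function: the property holds only if both "Approve" and
# "Ship" are kept (illustrative, not taken from the paper).
nodes = ["Receive", "Approve", "Ship", "Archive"]
verify = lambda kept: 1 if {"Approve", "Ship"} <= kept else 0
print(shapley_estimates(nodes, verify, samples=500))
```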
Results & Findings
- On the BPI Challenge logs, the approach identified at most 5% of nodes as critical to the violation of a safety property, matching manually discovered bugs with far less effort.
- Redundancy detection: In several logs, parallel branches contributed near‑zero Shapley values, suggesting they could be collapsed without affecting correctness.
- Performance: The combined verification + Shapley analysis ran in under 30 seconds for trees with up to 200 nodes, showing scalability for typical enterprise processes.
- Explainability: Developers could trace a failed compliance check back to a single loop construct that introduced an unintended deadlock, something that raw event‑log statistics never revealed.
Practical Implications
- Compliance automation – Companies can embed the pipeline into CI/CD for business processes, automatically flagging violations of regulatory rules (e.g., GDPR‑related data handling steps).
- Process optimization – By highlighting low‑impact or harmful nodes, engineers can refactor workflows, remove branches that behave like dead code, and reduce execution time or resource consumption.
- Debugging complex orchestrations – Micro‑service orchestration engines (e.g., Camunda for BPMN workflows, Temporal) often generate large execution graphs; the Shapley‑based attribution gives a “heat map” of which services are responsible for liveness failures.
- Tool‑chain integration – The prototype can be wrapped as a REST service, allowing existing process‑mining platforms (e.g., ProM, Celonis) to call out to it for formal verification and receive explainable scores back; a hypothetical wrapper sketch follows this list.
- Education & documentation – New team members can understand why a workflow is safe by inspecting the Shapley scores, accelerating onboarding and knowledge transfer.
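A REST wrapper along the lines of the tool‑chain integration bullet could look like the following sketch; FastAPI, the endpoint, the payload fields, and the `verify_and_explain` helper are all hypothetical choices, not part of the paper's prototype.

```python
# Hypothetical REST facade around the verification + Shapley pipeline.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class VerificationRequest(BaseModel):
    process_tree: str   # serialized process tree, e.g. "->('Receive', 'Ship')"
    property_spec: str  # property to check, e.g. "Receive -> F Ship"

def verify_and_explain(tree: str, prop: str):
    # Placeholder: wire in the encoder, the prover call, and the Shapley
    # estimator here; the return values below are dummies.
    return True, {"Receive": 0.0, "Ship": 0.5}

@app.post("/verify")
def verify(req: VerificationRequest):
    holds, scores = verify_and_explain(req.process_tree, req.property_spec)
    return {"holds": holds, "shapley_scores": scores}
```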
Limitations & Future Work
- Scalability of exact Shapley computation – The exhaustive coalition evaluation grows exponentially; the current implementation relies on Monte‑Carlo sampling, which may introduce variance for very large trees.
- Property expressiveness – The logical encoding currently supports a subset of temporal properties; richer specifications (e.g., quantitative time bounds) need extended encodings.
- Noise in event logs – Mining errors (missing or spurious events) can propagate into the logical model, potentially skewing Shapley scores; robust preprocessing is an open challenge.
- User‑defined property selection – The framework assumes the analyst can formulate the right logical predicates; future work aims to suggest relevant properties automatically based on domain ontologies.
Bottom line: By turning opaque mined workflows into verifiable, explainable artifacts, this research opens a practical path for developers to audit, debug, and improve process‑driven software systems with the same rigor they apply to code.
Authors
- Radoslaw Klimek
- Jakub Blazowski
Paper Information
- arXiv ID: 2512.09562v1
- Categories: cs.SE, cs.IT
- Published: December 10, 2025