[Paper] Bug Priority Change Prediction: An Exploratory Study on Apache Software

Published: December 9, 2025 at 07:59 PM EST
3 min read
Source: arXiv - 2512.09216v1

Overview

This paper tackles a surprisingly overlooked problem in modern issue‑tracking: predicting when a bug’s priority will change during its lifecycle. By mining data from 32 Apache projects, the authors show that features derived from the bug‑fixing process—combined with smart handling of class imbalance—can forecast priority shifts with promising accuracy, opening the door to more proactive triage and resource allocation.

Key Contributions

  • Two‑phase prediction framework that separates the bug reporting and bug fixing stages, training dedicated models for each (see the sketch after this list).
  • Bug‑fixing evolution features (e.g., comment frequency, developer activity, code change metrics) that capture how a bug’s context evolves over time.
  • Class‑imbalance mitigation strategy (oversampling + cost‑sensitive learning) tailored to the heavily skewed distribution of priority changes.
  • Extensive empirical evaluation on a curated dataset of > 200 k bug reports from 32 Apache projects, reporting F1‑scores up to 0.80.
  • Cross‑project analysis revealing how well models trained on one project transfer to others and how performance varies across priority levels.
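
To make the two‑phase idea concrete, here is a minimal sketch of dedicated per‑phase models with predictions routed by a bug's current lifecycle phase. The class and method names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the two-phase framework: a dedicated classifier per
# lifecycle phase, with predictions routed by the bug's current phase.
# Class and phase names are illustrative, not the authors' code.
from enum import Enum
from sklearn.ensemble import RandomForestClassifier

class Phase(Enum):
    REPORTING = "reporting"  # bug creation -> first status change
    FIXING = "fixing"        # first status change -> resolution

class TwoPhasePredictor:
    def __init__(self):
        # One model per phase, each trained on phase-specific features.
        self.models = {phase: RandomForestClassifier(class_weight="balanced")
                       for phase in Phase}

    def fit(self, phase, X, y):
        self.models[phase].fit(X, y)
        return self

    def predict_change(self, phase, X):
        # Route the bug to the model trained for its current phase.
        return self.models[phase].predict(X)
```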

Methodology

  1. Data Collection & Labeling

    • Extracted bug reports from Apache JIRA, focusing on non‑trivial projects (e.g., Hadoop, Spark).
    • Each bug was labeled “priority changed” or “unchanged” based on the history of its priority field.
  2. Lifecycle Segmentation

    • Reporting Phase: From bug creation until the first status change (e.g., Open → In Progress).
    • Fixing Phase: From the first status change until the bug is resolved/closed.
  3. Feature Engineering

    • Static attributes: initial priority, severity, component, reporter reputation.
    • Dynamic evolution attributes: number of comments, time between comments, number of developers involved, lines of code changed, test coverage impact, etc. (steps 1–3 are sketched in the first example after this list).
  4. Handling Class Imbalance

    • Applied SMOTE (Synthetic Minority Over‑sampling Technique) to generate synthetic “priority‑change” instances.
    • Integrated cost‑sensitive classifiers that penalize misclassifying the minority class more heavily.
  5. Model Training & Evaluation

    • Tested several algorithms (Random Forest, XGBoost, Logistic Regression).
    • Used stratified 10‑fold cross‑validation, reporting F1 (binary), F1‑weighted, and F1‑macro (the imbalance handling and evaluation setup appear in the second sketch after this list).
  6. Cross‑Project & Priority‑Level Experiments

    • Trained on one project, tested on another to gauge generalizability (see the transfer sketch after this list).
    • Analyzed performance per priority tier (e.g., P1‑Critical vs. P4‑Low).
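
To make steps 1–3 concrete, here is a minimal sketch of labeling, lifecycle segmentation, and dynamic feature extraction. The JIRA changelog‑style event dicts (fields such as "priority", "status", "author") and the specific feature set are assumptions for illustration, not the authors' exact pipeline.

```python
# Sketch of labeling, lifecycle segmentation, and dynamic feature
# extraction (steps 1-3). Event dicts mimic JIRA changelog entries;
# the schema is an assumption, not the authors' exact pipeline.

def label_priority_change(events):
    """Binary label: did the 'priority' field ever change? (step 1)"""
    return int(any(e["field"] == "priority" for e in events))

def split_phases(events):
    """Split the event stream at the first status change (step 2)."""
    for i, e in enumerate(events):
        if e["field"] == "status":
            return events[:i], events[i:]   # reporting, fixing
    return events, []                        # bug never left the reporting phase

def fixing_features(fixing_events, comments):
    """Dynamic evolution features for the fixing phase (step 3).

    Assumes each comment dict carries a datetime under 'created'.
    """
    times = sorted(c["created"] for c in comments)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return {
        "n_comments": len(comments),
        "mean_comment_gap_s": sum(gaps) / len(gaps) if gaps else 0.0,
        "n_developers": len({e["author"] for e in fixing_events}),
        "loc_changed": sum(e.get("loc", 0) for e in fixing_events
                           if e["field"] == "code"),
    }
```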
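
Steps 4–5 map onto standard scikit‑learn / imbalanced‑learn tooling. A sketch, assuming those libraries are installed: SMOTE is applied inside each cross‑validation fold (so synthetic samples never leak into the validation split) and paired with a cost‑sensitive random forest, while the synthetic dataset stands in for the real bug‑report features.

```python
# Sketch of imbalance handling plus stratified 10-fold evaluation
# (steps 4-5). Requires scikit-learn and imbalanced-learn; the
# synthetic dataset stands in for the real bug-report features.
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Skewed toy data: ~10% minority ("priority changed") instances.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1],
                           random_state=42)

pipe = Pipeline([
    ("smote", SMOTE(random_state=42)),   # oversample minority class per fold
    ("clf", RandomForestClassifier(
        class_weight="balanced",         # cost-sensitive misclassification penalty
        random_state=42)),
])

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv,
                        scoring=["f1", "f1_weighted", "f1_macro"])
for name in ("test_f1", "test_f1_weighted", "test_f1_macro"):
    print(f"{name}: {scores[name].mean():.3f}")
```

Using the imblearn `Pipeline` matters here: plain SMOTE applied before cross‑validation would oversample the whole dataset and inflate the validation scores.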
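
The cross‑project experiment (step 6) amounts to training on one project's data and evaluating on another's. A sketch, with synthetic per‑project datasets standing in for the real Apache feature matrices:

```python
# Sketch of the cross-project transfer experiment (step 6): train on
# one project, evaluate on another. Synthetic datasets stand in for
# the per-project feature matrices.
from itertools import permutations
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

projects = {
    name: make_classification(n_samples=500, weights=[0.9, 0.1],
                              random_state=seed)
    for seed, name in enumerate(["HADOOP", "SPARK"])
}

for source, target in permutations(projects, 2):
    X_train, y_train = projects[source]
    X_test, y_test = projects[target]
    model = RandomForestClassifier(class_weight="balanced", random_state=42)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(f"{source} -> {target}: weighted F1 = "
          f"{f1_score(y_test, pred, average='weighted'):.3f}")
```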

Results & Findings

Phase     | Metric      | Score
----------|-------------|------
Reporting | F1 (binary) | 0.798
Fixing    | F1‑weighted | 0.712
Fixing    | F1‑macro    | 0.613
  • Bug‑fixing evolution features consistently outperformed baseline models that used only static attributes.
  • The imbalance handling strategy contributed an average lift of ~6 % in F1 across both phases.
  • Cross‑project transfer: While absolute scores dropped when applying a model to a different project, weighted F1 stayed above 0.60 for most pairs, indicating reasonable portability.
  • Priority‑level robustness: Prediction quality remained relatively stable across P1‑P4, suggesting the approach is not biased toward high‑severity bugs.

Practical Implications

  • Automated Triage Assistants: Integrate the model into JIRA or GitHub Issues to flag bugs likely to need a priority bump, prompting early review by project managers (see the sketch after this list).
  • Resource Planning: Teams can anticipate spikes in high‑priority work, adjusting sprint capacity or allocating on‑call engineers proactively.
  • Reduced Human Bias: By providing data‑driven suggestions, the system mitigates subjective over‑ or under‑prioritization that often stems from “triage fatigue.”
  • Cross‑Project Knowledge Sharing: Open‑source foundations can seed new projects with pre‑trained models, accelerating effective bug management without extensive local data collection.
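
As one way such a triage assistant could work, the sketch below scores open issues with a trained model and flags those whose predicted probability of a priority change crosses a threshold. The issue dict schema, `to_features` helper, and the 0.7 threshold are hypothetical placeholders, not part of the paper.

```python
# Hypothetical triage-assistant hook: flag open issues that a trained
# model considers likely to need a priority change. The issue schema,
# `to_features`, and the 0.7 threshold are illustrative placeholders.
FLAG_THRESHOLD = 0.7  # tune against historical triage decisions

def flag_priority_change_candidates(model, issues, to_features):
    flagged = []
    for issue in issues:
        # Probability of the minority "priority will change" class.
        p_change = model.predict_proba([to_features(issue)])[0][1]
        if p_change >= FLAG_THRESHOLD:
            flagged.append((issue["key"], p_change))
    # Surface the riskiest issues first for the project manager.
    return sorted(flagged, key=lambda item: -item[1])
```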

Limitations & Future Work

  • Dataset Scope: The study focuses on Apache projects; results may differ for commercial or smaller‑scale repositories with different workflow conventions.
  • Feature Freshness: Some evolution features (e.g., comment velocity) require real‑time updates, which could be costly to compute at scale.
  • Granular Priority Changes: The binary “changed vs. unchanged” label ignores the direction (e.g., upgrade vs. downgrade) and magnitude of the shift.
  • Future Directions:
    • Extend to multiclass prediction (predict exact new priority).
    • Explore deep‑learning sequence models (e.g., Transformers) to capture richer temporal patterns.
    • Conduct user studies to assess how developers interact with priority‑change recommendations in practice.

Authors

  • Guangzong Cai
  • Zengyang Li
  • Peng Liang
  • Ran Mo
  • Hui Liu
  • Yutao Ma

Paper Information

  • arXiv ID: 2512.09216v1
  • Categories: cs.SE
  • Published: December 10, 2025
  • PDF: https://arxiv.org/pdf/2512.09216v1