
Repository contents:

Priority_RQ1toRQ3/ (folder containing files for RQ1-RQ3)
    .vscode/ (VSCode editor-specific settings, optional)
    extra_code/ (supplementary code and notebooks)
    jira/ (scripts and data for Jira dataset processing/analysis)
    priority/ (scripts and notebooks for priority classification tasks)
    priority_model_deberta/ (scripts/notebooks specific to the DeBERTa model experiments for RQ1-RQ3)
    scripts_shared/ (shared utility scripts)
    .gitignore (Git ignore file)
    github_script.sh (shell script for running the RQ1-RQ3 experiments, e.g., via Papermill)
    README.md (README specific to the RQ1-RQ3 components)
    requirements_local.txt (Python dependencies for local execution)
    requirements_ml_nodes.txt (Python dependencies for the HPC/ML-nodes environment)
    01_train_high_vs_med_low_top50_hp.ipynb (example Jupyter notebook for RQ1-RQ3 training)
High_priority_llm_classification.ipynb (Jupyter notebook for LLM classification, high priority, RQ5)
Low_priority_llm_classification.ipynb (Jupyter notebook for LLM classification, low priority, RQ5)
Medium_priority_llm_classification.ipynb (Jupyter notebook for LLM classification, medium priority, RQ5)
ModernBERT.ipynb (Jupyter notebook for the ModernBERT analysis, RQ4)
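The notebook name 01_train_high_vs_med_low_top50_hp.ipynb suggests a binary setup that pits "high" priority against "medium/low". As a rough illustration only, the label mapping might look like the sketch below; the specific Jira priority names and the grouping are assumptions for illustration, not taken from the notebooks.

```python
# Hypothetical high-vs-(medium/low) binarization, as one might use for RQ1-RQ3
# training. The priority label sets below are assumed, not from the actual code.
HIGH = {"Blocker", "Critical", "Major"}       # assumed "high" labels
MEDIUM_LOW = {"Medium", "Minor", "Trivial"}   # assumed "medium/low" labels

def binarize_priority(priority: str) -> int:
    """Map a raw Jira priority name to 1 (high) or 0 (medium/low)."""
    if priority in HIGH:
        return 1
    if priority in MEDIUM_LOW:
        return 0
    raise ValueError(f"unknown priority label: {priority}")

# Example: turn a small batch of raw labels into binary training targets.
labels = [binarize_priority(p) for p in ["Critical", "Trivial", "Major"]]
```

Any real replication should follow the mapping implemented in the notebooks themselves rather than this sketch.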
This Zenodo record provides the code, scripts, notebooks, and links to the datasets and models used to support the findings for Research Questions 1 through 5 (RQ1-RQ5) in the paper "Enhancing Task Prioritization in Software Development Issues Tracking system."

Modern software development faces a critical bottleneck in manually prioritizing issues. The paper investigates automated issue priority classification using Transformer models. We evaluate models such as BERT, DeBERTa, and a specialized ModernBERT, comparing them against general Large Language Models (LLMs) such as Qwen2.5-3B and Llama-3.2-3B, using curated datasets from Jira and GitHub.

This package contains the components needed to replicate the in-distribution classification (RQ1), out-of-distribution generalization (RQ2), fine-tuning impact assessments (RQ3), the detailed performance analysis of ModernBERT across priority levels (RQ4), and the comparative performance of LLMs against ModernBERT (RQ5). The paper demonstrates that Transformer models, particularly ModernBERT, achieve high classification performance (e.g., accuracy > 81%, AUC > 0.90, MCC > 0.62), significantly outperforming the evaluated general LLMs on this task.
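One of the reported metrics, the Matthews correlation coefficient (MCC), is computed directly from the binary confusion-matrix counts. A minimal stdlib sketch follows; the counts in the example are illustrative, not the paper's results.

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal sum is zero (the conventional fallback).
    """
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Illustrative counts only (not taken from the paper's evaluation).
score = mcc(tp=80, tn=85, fp=15, fn=20)
```

In practice the notebooks would more likely use a library routine such as scikit-learn's `matthews_corrcoef`, which implements the same formula.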
Related Hugging Face Repositories

Model repositories (pre-trained/fine-tuned models used and evaluated in the paper):
    karths/High_priority_roberta
    karths/low_priority_roberta
    karths/medium_priority_roberta
    karths/modernbertbase-binary-10IQR-medium-priority
    karths/modernbertbase-binary-10IQR-low-priority
    karths/modernbertbase-binary-10IQR-high-priority

Dataset repositories (datasets used/generated as per the paper's contributions):
    datasets/karths/binary-10IQR-priority
    datasets/karths/binary-10IQR-high-priority
    datasets/karths/binary-10IQR-medium-priority
    datasets/karths/binary-10IQR-low-priority
    datasets/karths/binary-10IQR-high-priority-mix
