ZENODO
Software . 2023
License: CC BY
Data sources: ZENODO, Datacite

Text-to-text Generation for Issue Report Classification

Authors: Rejithkumar, Gokul; Rose Anish, Preethu; Ghaisas, Smita

Abstract

Submission for the NLBSE Issue Report Tool Competition

This package accompanies the submission titled "Text-to-text Generation for Issue Report Classification" to the NLBSE Issue Report Tool Competition. It provides the resources needed to replicate the experiments and results presented.

Description of ZIP Files:

  • issue_classification_t5: This archive contains the code for replicating the study, including retrieval of the pre-trained model, fine-tuning, and inference.
      • code: Contains all the code files.
          • finetuning.py: Code for fine-tuning the VMware/flan-t5-large-alpaca model on the issue report classification task. Comments embedded in the file explain how to run the fine-tuning process; be sure to read them.
          • inference.py: Code for running inference with the fine-tuned model. As with the fine-tuning script, instructions for running inference are embedded as comments in the file.
          • download_plm.py: Code for downloading VMware/flan-t5-large-alpaca from https://huggingface.co/VMware/flan-t5-large-alpaca .
          • requirements.txt: Lists the Python modules, with their versions, required to run the provided code.
      • data: Contains the NLBSE issue report classification data and the model output produced by running inference.py on issue-report-test.csv.
          • checkpoint-3000-output.csv: Output obtained by fine-tuning the VMware/flan-t5-large-alpaca model on issue-report-train.csv for 2 epochs and then running inference on issue-report-test.csv (F1-score of 0.8297). Column 'label' contains the ground truth labels. Column 'Model generated output' contains the labels predicted by the model.
          • issue-report-train.csv: NLBSE24 issue report tool competition training dataset. (Source: https://github.com/nlbse2024/issue-report-classification)
          • issue-report-test.csv: NLBSE24 issue report tool competition test dataset. (Source: https://github.com/nlbse2024/issue-report-classification)
  • finetuned_model_checkpoint-3000: This zip file contains the VMware/flan-t5-large-alpaca model fine-tuned for 2 epochs.
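The reported score can be sanity-checked directly from checkpoint-3000-output.csv, since it holds both the ground truth ('label') and the prediction ('Model generated output'). The following is a minimal sketch, not the authors' evaluation script: the description does not say which F1 averaging was used, so the sketch computes per-class F1 and a micro-averaged F1 as an assumption, and the lower-casing and whitespace stripping of labels is likewise an assumption. The sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def f1_scores(rows, truth_col="label", pred_col="Model generated output"):
    """Return (per_class_f1, micro_f1) for an iterable of dict rows.

    Column names follow the description of checkpoint-3000-output.csv.
    Label normalisation (strip + lower-case) is an assumption.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for row in rows:
        truth = row[truth_col].strip().lower()
        pred = row[pred_col].strip().lower()
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label wrongly
            fn[truth] += 1  # missed the true label

    # Per-class F1 over the labels that occur in the ground truth.
    per_class = {}
    for label in set(tp) | set(fn):
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class[label] = 2 * tp[label] / denom if denom else 0.0

    # With exactly one prediction per row, micro-averaged F1 equals accuracy.
    correct = sum(tp.values())
    total = correct + sum(fp.values())
    micro = correct / total if total else 0.0
    return per_class, micro


# Invented sample rows, used only to exercise the function.
SAMPLE = (
    "label,Model generated output\n"
    "bug,bug\n"
    "bug,feature\n"
    "feature,feature\n"
    "question,question\n"
)
per_class, micro = f1_scores(csv.DictReader(io.StringIO(SAMPLE)))
print(per_class, micro)
```

To run it on the real file, open checkpoint-3000-output.csv (its path depends on where the archive was unpacked) with `open(path, newline="", encoding="utf-8")` and pass `csv.DictReader(handle)` to `f1_scores`.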

Environment details:

  • Operating System: Ubuntu 22.04
  • NVIDIA Driver Version: 470.141.03
  • NVIDIA CUDA Version: 12.2.1
  • Python version: 3.10
  • GPU Name: Nvidia A100
  • GPU Memory: 20 GiB
  • CPU Memory: 60 GiB

Note: We also attempted fine-tuning using a V100 GPU, and the results showed slight differences, potentially attributed to variations in GPU architecture. However, running inference on any GPU using the provided model finetuned_model_checkpoint-3000 should yield the same results as reported.
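One way to check the note above is to compare the predictions from a local inference run against those shipped in checkpoint-3000-output.csv. This is a minimal sketch under stated assumptions: it assumes the re-run output uses the same 'Model generated output' column and the same row order as the provided file, and the helper names are hypothetical. The in-memory rows are invented for illustration.

```python
import csv
import io


def read_predictions(handle, pred_col="Model generated output"):
    """Read the prediction column from an open CSV file handle."""
    return [row[pred_col].strip() for row in csv.DictReader(handle)]


def diff_predictions(reference, rerun):
    """Return (row_index, reference_pred, rerun_pred) for each mismatch.

    Assumes both lists are in the same row order (an assumption, not
    something the package description guarantees).
    """
    if len(reference) != len(rerun):
        raise ValueError(
            f"row count differs: {len(reference)} vs {len(rerun)}"
        )
    return [
        (i, ref, new)
        for i, (ref, new) in enumerate(zip(reference, rerun))
        if ref != new
    ]


# Invented sample data standing in for the provided and re-run outputs.
REFERENCE_CSV = (
    "label,Model generated output\n"
    "bug,bug\n"
    "feature,feature\n"
    "question,question\n"
)
RERUN_CSV = (
    "label,Model generated output\n"
    "bug,bug\n"
    "feature,bug\n"
    "question,question\n"
)
reference = read_predictions(io.StringIO(REFERENCE_CSV))
rerun = read_predictions(io.StringIO(RERUN_CSV))
mismatches = diff_predictions(reference, rerun)
print(mismatches)
```

An empty list means the re-run reproduced the provided predictions row for row; any entries point at the rows to inspect.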

  • BIP!
    Impact by BIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average