
Supplementary material for 'The MAP metric in Information Retrieval Fault Localization'

Research data: Dataset, 11 Apr 2023. Publisher: Zenodo

Authors: Hirsch, Thomas; Hofer, Birgit;

doi: 10.5281/zenodo.7817015, 10.5281/zenodo.7817016


Abstract

# map_bench4bl

This is the supplementary material, data, and evaluation source code for the paper "The MAP metric in Information Retrieval Fault Localization" by Thomas Hirsch and Birgit Hofer.

## Preliminaries

### Python environment

- Python 3.8
- pandas
- numpy
- matplotlib

## Datasets

The [Bench4BL](https://github.com/exatoa/Bench4BL) dataset has been used in this evaluation, with the addition of intermediate files taken from the [SABL](http://dx.doi.org/10.5281/zenodo.4681242) experiment performed on this Bench4BL dataset.

All data used in our evaluation is included in this repository. However, to re-import the data directly from the Bench4BL benchmark and the SABL dataset, both have to be downloaded first and their local paths have to be set in [paths.py](paths.py).

### Bench4BL

The Bench4BL dataset was published with the paper "Bench4BL: Reproducibility study on the performance of IR-based bug localization" by Lee, J., Kim, D., Bissyandé, T.F., Jung, W. and Le Traon, Y. The dataset can be obtained [here](https://github.com/exatoa/Bench4BL). Follow the steps described in the corresponding [README](https://github.com/exatoa/Bench4BL/blob/master/README.md) to set up the dataset.

The Bench4BL dataset contains the _old subjects_ subdataset, which consists of 558 bugs from AspectJ, JDT, PDE, SWT, and ZXing that have been widely used in older IRFL studies. This _old subjects_ subdataset was used in answering our RQ1, as discussed below. The corresponding scripts use _old subjects_ in their name to highlight this.

#### SABL

The SABL dataset is the online appendix of the paper "An Extensive Study of Smell-Aware Bug Localization" by Takahashi, A., Sae-Lim, N., Hayashi, S. and Saeki, M. The dataset can be downloaded [here](http://dx.doi.org/10.5281/zenodo.4681242). The experiments in this dataset build on top of Bench4BL, and intermediate files are provided in the data package.

#### Rankings

Rankings for BLIA, BRTracer, and BugLocator were produced by running these tools on Bench4BL locally. Rankings for AmaLgam and BLUiR were taken from the SABL experiment dataset.

## Structure

### Folders

Bench4BL ground truths:

- bench4bl_old_subjects_summary
- bench4bl_summary

Localization results of the included tools in Bench4BL:

- bench4bl_localization_results
- bench4bl_localization_results_sabl

Target projects size metrics:

- cloc_results
- cloc_results_old_subjects

Utility functions:

- utils

Output folders containing results, generated figures and tables:

- results
- results_old_subjects

### Scripts

Scripts for re-importing data from Bench4BL and SABL datasets:

- data_preparation_step_1_cloc_bench4bl.py
- data_preparation_step_1_cloc_old_subjects_bench4bl.py
- data_preparation_step_2_import_ground_truth_from_bench4bl.py
- data_preparation_step_2_import_ground_truth_from_old_subjects_bench4bl.py
- data_preparation_step_3_import_bench4bl_ranking_results.py
- data_preparation_step_3_import_sabl_ranking_results.py

Utilities:

- paths.py
- utils/bench4bl_utils.py
- utils/Logger.py

### Evaluation scripts for the corresponding research questions:

**Dataset analysis:**

- rq_0_dataset_analysis_bench4bl_issues.py

**RQ1: How big is the average ground truth in Bench4BL datasets, and what proportion of bugs have a ground truth containing multiple files?**

- rq_1_bench4bl_ground_truth_size.py
- rq_1_old_subjects_bench4bl_ground_truth_size.py

**RQ2: Do the IRFL tools included in Bench4BL truncate their results?**

- rq_2_ranking_lengths.py

**RQ3: How strong is $AP_{asrd}$ overestimating $AP_{mb}$ for truncated BugLocator retrieval results on the Bench4BL dataset?**

**RQ3a: How strong is $AP_{asrd}$ overestimating $AP_{mb}$ for truncated BugLocator retrieval results when considering the bloated ground truth issue found in Bench4BL?**

- rq_3_truncating_BugLocator_rankings_bench4bl.py

**RQ3b: How strong is $AP_{asrd}$ overestimating $AP_{mb}$ for truncated BugLocator retrieval results when undefined $AP$ values are simply ignored?**

- rq_3b_undefined_ap_BugLocator_rankings_bench4bl.py

## Licence

All code and results are licensed under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/), according to the LICENSE file. Other licences may apply for some tools and datasets contained in this repo: [cloc-1.92.pl](https://github.com/AlDanial/cloc) under GPL v2, [Bench4BL](https://github.com/exatoa/Bench4BL) and [SABL](http://dx.doi.org/10.5281/zenodo.4681242) under CC BY 4.0.
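
The two average precision variants compared in RQ3 and RQ3b are not defined in this README. As a rough illustration only, the sketch below contrasts two common ways of normalizing AP on a truncated ranking: dividing by the size of the ground truth, and dividing by the number of relevant files that were actually retrieved. Which of these corresponds to $AP_{asrd}$ and which to $AP_{mb}$ is an assumption left to the paper; the function names and file names are hypothetical and are not taken from the repository's scripts.

```python
from typing import Optional, Sequence, Set, Tuple


def _precision_hits(ranking: Sequence[str], relevant: Set[str]) -> Tuple[float, int]:
    """Return (sum of precision@k over relevant hits, number of relevant hits)."""
    hits = 0
    total = 0.0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k  # precision at rank k
    return total, hits


def ap_by_ground_truth(ranking: Sequence[str], relevant: Set[str]) -> float:
    """AP normalized by the full ground truth size (assumes a non-empty ground truth)."""
    total, _ = _precision_hits(ranking, relevant)
    return total / len(relevant)


def ap_by_retrieved(ranking: Sequence[str], relevant: Set[str]) -> Optional[float]:
    """AP normalized by the number of retrieved relevant files.

    Returns None when no relevant file is retrieved, because the value
    is then 0/0 and therefore undefined.
    """
    total, hits = _precision_hits(ranking, relevant)
    return total / hits if hits else None


# Hypothetical bug with two faulty files and a four-file ranking.
relevant = {"A.java", "D.java"}
full_ranking = ["A.java", "B.java", "C.java", "D.java"]
truncated = full_ranking[:2]  # D.java is cut off by truncation

print(ap_by_ground_truth(full_ranking, relevant), ap_by_retrieved(full_ranking, relevant))
print(ap_by_ground_truth(truncated, relevant), ap_by_retrieved(truncated, relevant))
```

On the untruncated ranking both variants agree. After truncation, the variant normalized by retrieved relevant files is higher, and it becomes undefined when the truncated ranking contains no faulty file at all, which is the situation RQ3b refers to.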

Related Organizations

Graz University of Technology
Austria

Impact by BIP!

- citations: 0
- popularity: Average
- influence: Average
- impulse: Average

Usage by UsageCounts

- views: 20
- downloads: 3
