Combining Query Reduction and Expansion for Text-Retrieval-Based Bug Localization

Publication: Article, Conference object. Date: 01 Mar 2021. Country: Singapore. Publisher: IEEE. Journal: 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)

Authors: Juan Manuel Florez; Oscar Chaparro; Christoph Treude; Andrian Marcus

DOI: 10.1109/saner50967.2021.00024, 10.5281/zenodo.4431017, 10.5281/zenodo.4431018

Abstract

Automated text-retrieval-based bug localization (TRBL) techniques normally use the full text of a bug report to formulate a query and retrieve parts of the code that are buggy. Previous research has shown that reducing the size of the query increases the effectiveness of TRBL. On the other hand, researchers also found improvements when expanding the query (i.e., adding more terms). In this paper, we bring these two views together to reformulate queries for TRBL. Specifically, we improve discourse-based query reduction strategies by adopting a combinatorial approach and using task phrases from bug reports, and combine them with a state-of-the-art query expansion technique, resulting in 970 query reformulation strategies. We investigate the benefits of these strategies for localizing buggy code elements and define a new approach, called QREX, based on the most effective strategy. We evaluated the reformulation strategies, including QREX, on 1,217 queries from different software systems to retrieve buggy code artifacts at three code granularities, using five state-of-the-art automated TRBL approaches. The results indicate that QREX increases TRBL effectiveness by 4% to 12.6% compared to applying query reduction and expansion in isolation, and by 32.1% compared to the no-reformulation baseline.
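The idea of combining reduction and expansion can be illustrated with a minimal Python sketch. It is not the QREX implementation: the bug report parts (`title`, `observed`, `steps`), the term-overlap ranking, and the frequency-based expansion are all hypothetical stand-ins chosen for brevity, and the paper's actual discourse categories, task phrases, and expansion technique are not reproduced here. The sketch enumerates every non-empty combination of report parts as a reduced query, then expands a query with frequent terms from its top-ranked document.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def reduced_queries(parts):
    """Yield (part_names, terms) for every non-empty combination of parts.

    `parts` maps a hypothetical part name (e.g. "title") to its text.
    """
    names = sorted(parts)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            terms = []
            for name in combo:
                terms.extend(tokenize(parts[name]))
            yield combo, terms


def rank(query_terms, corpus):
    """Rank document ids by summed query-term frequency (toy scoring)."""
    scores = {}
    for doc_id, text in corpus.items():
        counts = Counter(tokenize(text))
        scores[doc_id] = sum(counts[t] for t in query_terms)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(corpus, key=lambda d: (-scores[d], d))


def expand(query_terms, corpus, top_docs=1, extra_terms=2):
    """Append the most frequent new terms found in the top-ranked documents.

    This is a simple pseudo-relevance-feedback stand-in, assumed for
    illustration only.
    """
    counts = Counter()
    for doc_id in rank(query_terms, corpus)[:top_docs]:
        counts.update(
            t for t in tokenize(corpus[doc_id]) if t not in query_terms
        )
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return list(query_terms) + ordered[:extra_terms]
```

With three report parts, `reduced_queries` yields seven reduced queries (all non-empty subsets); each of them can then be passed through `expand` and `rank`, which is how a combinatorial space of reformulation strategies arises when reduction and expansion variants are crossed.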

Country

Singapore

Related Organizations

University of Adelaide, Australia
The University of Texas at Dallas, United States
William & Mary, United States
Singapore Management University, Singapore

Keywords

bug localization, query expansion, query reduction, query reformulation, software engineering

Impact by BIP!

Selected citations: 9. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.