publication . Article . Other literature type . 2013

Data Quality: Some Comments on the NASA Software Defect Datasets

Shepperd, Martin; Song, Qinbao; Sun, Zhongbin; Mair, Carolyn
Open Access English
  • Published: 01 Sep 2013
  • Publisher: Institute of Electrical and Electronics Engineers
  • Country: United Kingdom
Abstract
Background: Self-evidently empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and using the same rather than similar versions of datasets. In recent years, there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. The publicly available NASA datasets have been extensively used as part of this research. Objective: This short note investigates the extent to which published analyses based on the NASA defect datasets are meaningful and comparable. Method: We analyze the five studies published in the IEEE Transactions on Software Engin...
Subjects
free text keywords: Empirical software engineering, Data quality, Defect prediction, Machine learning, Software, Knowledge representation and reasoning, Computer science, Empirical process (process control model), Software quality, Preprocessor, Software bug, Theoretical computer science, Software modules, Data mining, Information retrieval