Research method in AI - Reproducibility of results

Reproducibility of published computational research has seen increased interest the last twenty years. Regardless of academic field and the impact-factor of journals, studies of reproducibility of computational research have found low rates of reproducibility. Common issues relate to the availability of source code and data, even when original authors attempt to reproduce their own published research. In this thesis, we investigate the state of reproducibility in artificial intelligence research. The objective is not to reproduce experiments, but to investigate and quantify the state of reproducibility in artificial intelligence research. Two hypotheses were investigated: 1) Documentation of AI research is not good enough to reproduce results, and 2) Documentation practices have improved in recent years. 400 research papers from two instalments of two top AI conference series, IJCAI and AAAI, have been surveyed to investigate the hypotheses. The results of our survey support the first hypothesis, but not the second. While common usage of public datasets is widespread, sharing of code is lagging behind. Facilitating sharing of source code, and data without disrupting the peer review process are necessary to improve the situation. The contribution efforts of the research in this thesis are: (i) a survey design for evaluating documentation of published papers, (ii) an evaluation of two leading AI conference series, and (iii) suggested incentives to facilitate the reproducibility of AI research.

Keywords

Datateknologi, Kunstig intelligens

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	0
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Average
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average

Found an issue? Give us feedback

0

Average

Green