Challenge Results are not Reproducible

Publication: Part of book or chapter of book, Article, Preprint
Date: 01 Jan 2023
Embargo end date: 01 Jan 2023
Publisher: Springer Fachmedien Wiesbaden

Authors: Reinke, Annika; Grab, Georg; Maier-Hein, Lena

doi: 10.1007/978-3-658-41657-7_43, 10.48550/arxiv.2307.07226

arXiv: http://arxiv.org/abs/2307.07226

Abstract

While clinical trials are the state-of-the-art method for assessing the effect of new medication in a comparative manner, benchmarking in the field of medical image analysis is performed by so-called challenges. Recently, a comprehensive analysis of multiple biomedical image analysis challenges revealed large discrepancies between the impact of challenges and the quality control of their design and reporting standards. This work follows up on these results and addresses the specific question of the reproducibility of the participants' methods. To determine whether alternative interpretations of the method descriptions may change the challenge ranking, we reproduced the algorithms submitted to the 2019 Robust Medical Image Segmentation Challenge (ROBUST-MIS). The leaderboard differed substantially between the original challenge and the reimplementation, indicating that challenge rankings may not be sufficiently reproducible.

Accepted at BVM 2023

Related Organizations

Helmholtz Association of German Research Centres, Germany
German Cancer Research Center, Germany

Keywords

FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition

Impact by BIP!

- citations: 0. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

SDGs: 3. Good health

Related to Research communities

Knowmad Institut