The Schengen Area is one of the pillars of the European project. But it has been affected by several difficulties: the serious consequences of the global economic and financial crisis (2008-2018), growing concerns over external migratory pressure and the question of enlargement, fears of social dumping and, since March 2020, the COVID-19 crisis. Identifying these obstacles is vital so that pragmatic solutions can be found without jeopardising the founding principle.
Our working hypothesis is that the key factors in COVID-19 imaging are the available imaging data, together with their label noise and confounders, rather than network architectures per se. We therefore applied an existing state-of-the-art convolutional neural network framework based on the U-Net architecture, namely nnU-Net, and focused on leveraging the available training data. We neither applied pre-training nor modified the network architecture. First, we enriched the training information by generating two additional labels, for the lung and for the body area. Lung labels were created with a publicly available lung segmentation network, and weak body labels were generated by thresholding. Subsequently, we trained three different multi-class networks: 2-label (original background and lesion labels), 3-label (additional lung label) and 4-label (additional lung and body labels). The 3-label network obtained the best single-network performance in internal cross-validation (Dice score 0.756) and on the leaderboard (Dice score 0.755, Hausdorff-95 score 57.5). To improve robustness, we created a weighted ensemble of all three models, with weights calibrated to optimise the ranking in Dice score. This ensemble achieved a slight performance gain in internal cross-validation (Dice score 0.760). On the validation set leaderboard, it improved our Dice score to 0.768 and our Hausdorff-95 score to 54.8, and it ranked 3rd in phase I according to mean Dice score. Adding unlabelled data from the public TCIA dataset in a student-teacher manner significantly improved our internal validation score (Dice score 0.770). However, we noticed a partial overlap between this additional training data (although not human-labelled) and the final test data, and therefore submitted the ensemble without the additional data in order to yield a realistic assessment.
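The label enrichment, the Dice score and the weighted ensemble described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the label numbering (0 background, 1 lesion, 2 lung, 3 body), the body threshold of -500 HU, the fusion rule (weighted averaging of per-model lesion probabilities followed by a 0.5 cut-off) and all function names are assumptions made for illustration only.

```python
import numpy as np


def make_multiclass_labels(ct_hu, lesion_mask, lung_mask, body_threshold_hu=-500.0):
    """Merge masks into one label map (assumed coding: 0 background,
    1 lesion, 2 lung, 3 body). The HU threshold is a hypothetical value."""
    labels = np.zeros(ct_hu.shape, dtype=np.uint8)
    labels[ct_hu > body_threshold_hu] = 3  # weak body label from thresholding
    labels[lung_mask > 0] = 2              # lung overrides body
    labels[lesion_mask > 0] = 1            # lesion overrides everything else
    return labels


def dice_score(pred, target):
    """Dice overlap between two binary masks; 1.0 if both are empty."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom


def weighted_ensemble(lesion_probs, weights, threshold=0.5):
    """Fuse per-model lesion probability maps by a weighted average
    (assumed fusion rule) and binarise the result."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()            # normalise weights to sum to 1
    stacked = np.stack(lesion_probs, axis=0)     # shape: (n_models, ...)
    averaged = np.tensordot(weights, stacked, axes=1)
    return averaged >= threshold


if __name__ == "__main__":
    # Toy 2x2 "scan" in Hounsfield units with toy lung and lesion masks.
    ct = np.array([[-1000.0, -800.0], [40.0, 30.0]])
    lung = np.array([[0, 1], [0, 0]])
    lesion = np.array([[0, 0], [0, 1]])
    print(make_multiclass_labels(ct, lesion, lung))

    # Three toy lesion probability maps, e.g. from 2-, 3- and 4-label models.
    probs = [np.array([0.9, 0.2]), np.array([0.4, 0.4]), np.array([0.2, 0.9])]
    fused = weighted_ensemble(probs, weights=[2.0, 1.0, 1.0])
    print(fused, dice_score(fused, np.array([1, 0])))
```

In practice the ensemble weights would be tuned on held-out cross-validation folds, for example by a simple grid search that maximises the mean Dice score; the weights used here are arbitrary.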