Our working hypothesis is that the key factors in COVID-19 imaging are the available imaging data, together with their label noise and confounders, rather than network architectures per se. Thus, we applied an existing state-of-the-art convolutional neural network framework based on the U-Net architecture, namely nnU-Net, and focused on leveraging the available training data. We neither applied pre-training nor modified the network architecture. First, we enriched the training information by generating two additional labels for the lung and body area. Lung labels were created with a publicly available lung segmentation network, and weak body labels were generated by thresholding. Subsequently, we trained three different multi-class networks: 2-label (original background and lesion labels), 3-label (additional lung label) and 4-label (additional lung and body labels). The 3-label network obtained the best single-network performance in internal cross-validation (Dice-Score 0.756) and on the leaderboard (Dice-Score 0.755, Hausdorff 95-Score 57.5). To improve robustness, we created a weighted ensemble of all three models, with weights calibrated to optimise the ranking in Dice-Score. This ensemble achieved a slight performance gain in internal cross-validation (Dice-Score 0.760). On the validation set leaderboard, it improved our Dice-Score to 0.768 and our Hausdorff 95-Score to 54.8, and it ranked 3rd in phase I according to mean Dice-Score. Adding unlabelled data from the public TCIA dataset in a student-teacher manner significantly improved our internal validation score (Dice-Score 0.770). However, we noticed a partial overlap between this additional training data (although not human-labelled) and the final test data, and therefore submitted the ensemble trained without the additional data in order to obtain a realistic assessment.
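
As an illustration only, the label enrichment and the weighted ensembling described above could be sketched as follows. The function names, the label indices (0 background, 1 lesion, 2 lung, 3 body), the -500 HU body threshold and the 0.5 decision threshold are assumptions of this sketch, not the settings used in this work. The calibration of the ensemble weights is not shown.

```python
import numpy as np


def build_label_map(ct_hu, lesion_mask, lung_mask, body_threshold_hu=-500.0):
    """Merge masks into one multi-class label map (hypothetical label indices).

    0 = background, 1 = lesion, 2 = lung, 3 = body.
    The body threshold in Hounsfield units is an assumed value.
    """
    labels = np.zeros(ct_hu.shape, dtype=np.uint8)
    # Weak body label: every voxel denser than the assumed threshold.
    labels[ct_hu > body_threshold_hu] = 3
    # Lung label overrides body; lesion label overrides both.
    labels[lung_mask.astype(bool)] = 2
    labels[lesion_mask.astype(bool)] = 1
    return labels


def weighted_ensemble(lesion_probs, weights, threshold=0.5):
    """Weighted average of per-model lesion probability maps, then thresholded.

    `lesion_probs` is a list of arrays of identical shape, one per model.
    `weights` are non-negative and are normalised to sum to 1 here.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack(lesion_probs, axis=0)
    # Contract the model axis: result has the shape of a single probability map.
    mean_prob = np.tensordot(weights, stacked, axes=1)
    return mean_prob > threshold
```

In this sketch, the 2-label and 3-label training targets would be obtained from the same map by setting the unused classes back to background.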