EchoRisk-MICCAI: AI for Cardiac Function Estimation, Assessment and Early Prediction of Therapy-Induced Cardiotoxicity from Echocardiography

Authors: Marias, Kostas; Tsiknakis, Manolis; Kalliatakis, Grigorios; Manikis, Georgios; Karanasiou, Georgia; Georga, Eleni; Naka, Katerina; +7 Authors

doi: 10.5281/zenodo.19727928, 10.5281/zenodo.19727929

Abstract

This challenge proposal for MICCAI 2026 focuses on the automated analysis of echocardiography videos for the early detection and prediction of cancer therapy–induced cardiotoxicity in breast cancer patients. The proposed challenge builds on the CARDIOCARE project, a multidisciplinary, EU-funded initiative focused on improving cardiovascular outcomes in older and multimorbid women undergoing breast cancer therapy. CARDIOCARE has established a prospective, longitudinal clinical study across 6 European hospital sites in 5 countries, systematically collecting real-world echocardiography imaging, standardized clinical metadata, and biomarkers of cardiac injury. The study is specifically designed to capture the onset and predict the progression of cardiotoxic effects from anthracyclines and HER2-targeted therapies, two treatments known to induce cardiac dysfunction. The resulting dataset provides a unique and clinically rich foundation for developing and benchmarking AI algorithms that aim to identify early subclinical changes in cardiac function, support risk stratification, and guide advanced personalized monitoring strategies in cardio-oncology.
Cardiotoxicity is a critical and growing concern in oncology: it is a major dose-limiting and treatment-interrupting complication of life-saving therapies such as anthracyclines and HER2-targeted agents. Studies have shown that up to 20–30% of patients receiving anthracyclines and 7–10% of those receiving trastuzumab develop some form of cardiac dysfunction. Early detection is essential, as even subclinical dysfunction is associated with increased long-term cardiovascular morbidity and mortality, particularly in older and multimorbid women, a population underrepresented in clinical trials but highly vulnerable in real-world settings. Troponin I and NT-proBNP are early surrogate biomarkers of cardiotoxicity that often rise before imaging changes appear and thereby support early risk stratification.
Echocardiography is the standard of care for cardiac monitoring due to its non-invasive nature, widespread availability, real-time imaging capabilities, and cost-effectiveness. However, in clinical workflows, its full potential is hampered by several limitations. First, key cardiac function parameters, such as left ventricular ejection fraction (LVEF) and global longitudinal strain (GLS), are typically derived through manual tracing or semi-automated tools, which depend heavily on operator skill and training. This leads to substantial inter- and intra-operator variability, especially with challenging patient anatomies or low-quality acquisitions. Second, image quality is affected by body habitus, breathing, and probe angulation, introducing noise and inconsistencies across patients and timepoints. Third, the lack of standardization in acquisition protocols across institutions and manufacturers complicates the comparison and interpretation of results in longitudinal monitoring. This variability reflects real-world clinical practice and is intentionally preserved in the dataset to enable evaluation of algorithm robustness across heterogeneous acquisition environments.
In multicenter studies and routine clinical practice, these issues limit reproducibility and often require repeated scans or additional imaging modalities, which may not be accessible in many settings. Moreover, the manual nature of the workflow makes it difficult to scale high-frequency monitoring in vulnerable populations, such as older patients receiving cardiotoxic chemotherapy. As a result, early signs of cardiac dysfunction are often missed, underestimated, or identified too late to prevent irreversible damage.
The MICCAI community has made major contributions to medical image segmentation, cardiac function estimation, and prognostic modeling. Despite this progress, robust and deployable AI models for automated cardiac function assessment and risk prediction in echocardiography remain scarce, owing to the lack of a standardized, large-scale, and clinically grounded benchmark that specifically addresses the assessment and prediction of therapy-induced cardiotoxicity from echocardiography videos. This challenge would fill that gap by providing the first curated, multicenter dataset tailored for evaluating AI models on tasks directly tied to clinical decision-making in cardio-oncology. The CARDIOCARE echocardiography dataset is the first imaging database to incorporate explicit information on cardiotoxicity, including labels directly related to therapy-induced cardiotoxicity. The challenge will encourage the development of algorithms that are not only accurate but also robust, fair, interpretable, and capable of generalizing across different data sources, consistent with MICCAI’s 2026 priorities. It will also facilitate comparisons among deep learning methods, foundation models, traditional image analysis techniques, and hybrid approaches that leverage both imaging and clinical metadata.
Dataset Description. The challenge will use imaging data from the CARDIOCARE prospective clinical study, which includes echocardiography data from 6 major European cancer and cardiology centers across 5 countries: IEO (Italy), BOCOC (Cyprus), KSBC (Sweden), NKUA (Greece), UOI (Greece), and IOL (Slovenia), collectively contributing over 421 patients.
Data Types. The dataset will consist of: (i) 2D echocardiographic sequences (grayscale DICOM videos capturing at least one full representative cardiac cycle) extracted for two standard views, apical 4-chamber and apical 2-chamber, at multiple timepoints; (ii) imaging collected over time, including baseline, early-treatment, and follow-up scans; (iii) blood biomarkers related to cardiotoxicity (e.g., elevated troponin and/or NT-proBNP); and (iv) global longitudinal strain (GLS).
Dataset Characteristics. (i) Multicenter, multi-vendor acquisition, with substantial variability in ultrasound equipment and operator technique, enabling thorough robustness evaluation. (ii) Real-world clinical quality, featuring artifacts, heterogeneous framing, variable image quality, and occasional missing views, representative of routine cardiology workflows. (iii) GDPR-compliant, fully de-identified imaging, curated according to established DICOM anonymization standards. (iv) Adequate sample size to support AI model development: data from 421 patients across multiple timepoints (3-month follow-ups), for a total of 1528 echocardiography videos.
Proposed Challenge Tasks. To align with MICCAI’s emphasis on clinical translation, fairness, and robustness, we propose three tasks that are intentionally defined using distinct prediction paradigms: Task 1 focuses on regression-based estimation of quantitative cardiac parameters and biomarkers, whereas Tasks 2 and 3 address classification and risk prediction problems that produce calibrated probabilities of clinically defined outcomes.
(i) Task 1: Estimation of cardiac parameters and biomarkers from echocardiograms. Participants will develop AI algorithms using data from 421 patients (1528 echocardiography videos across 5 follow-up timepoints) to estimate key cardiac parameters and biomarkers directly from echocardiography videos. Expected outputs include LV end-diastolic and end-systolic volumes, ejection fraction, and blood biomarkers related to cardiotoxicity, i.e., elevated troponin and elevated NT-proBNP (when available). This task may involve segmentation, tracking, and video-based regression, and can be partially aligned with the EchoNet-Dynamic data (https://echonet.github.io/dynamic/index.html).
(ii) Task 2: Assessment of LV cardiac dysfunction (as defined by clinically accepted EF and GLS thresholds). Participants will develop predictive models that identify patients with cardiac dysfunction during or after therapy, using data from 421 patients (1528 echocardiography videos across 5 follow-up timepoints). Participants are required to output a calibrated probability of LV dysfunction per examination. Ranking will be based on AUC-ROC as the primary metric, with balanced accuracy used as a tie-breaker. Sensitivity at a fixed specificity (90%) will be reported as a secondary, clinically interpretable metric.
(iii) Task 3: Early prediction of therapy-induced cardiotoxicity. Participants will develop models that predict, from baseline echocardiography videos, cardiotoxicity at a future timepoint, using data from 254 patients. Participants must output a baseline cardiotoxicity risk probability. Ranking will be based on AUC-ROC as the primary metric, with balanced accuracy and sensitivity used as tie-breakers.
For contextualization, at least one literature-based reference baseline will be defined per task: an EchoNet-style video model for functional estimation in Task 1, a guideline-aligned LVEF <50% dysfunction baseline in Task 2, and an HFA–ICOS-style clinical risk baseline for cardiotoxicity prediction in Task 3. Each task has its own evaluation protocol, statistical analysis plan, leaderboard, and awards, and is evaluated independently. No aggregated cross-task ranking will be computed. This structure allows focused methodological contributions while ensuring fair, transparent, and task-specific evaluation.
Cancer therapy–related cardiotoxicity remains a major cause of morbidity and treatment interruption in oncology. Although echocardiography is routinely used for monitoring, clinical decisions are constrained by measurement variability and delayed recognition of subclinical dysfunction.
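
The ranking metrics named for Tasks 2 and 3 can be illustrated with a minimal Python sketch. It is not the organizers' evaluation code: the function names are hypothetical, labels are assumed to be binary (1 = dysfunction or cardiotoxicity, 0 = none), the 0.5 decision threshold used for balanced accuracy is an assumption (the proposal does not state one), and the 90% specificity operating point is assumed to be chosen as the threshold with the highest sensitivity among those whose specificity is at least 90%.

```python
from typing import Sequence


def auc_roc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _rates(y_true: Sequence[int], y_prob: Sequence[float], threshold: float):
    """Return (sensitivity, specificity) when predicting positive for p >= threshold."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    return tp / n_pos, tn / n_neg


def balanced_accuracy(y_true, y_prob, threshold: float = 0.5) -> float:
    """Mean of sensitivity and specificity at an assumed decision threshold."""
    sensitivity, specificity = _rates(y_true, y_prob, threshold)
    return 0.5 * (sensitivity + specificity)


def sensitivity_at_specificity(y_true, y_prob, min_specificity: float = 0.90) -> float:
    """Highest sensitivity over all thresholds whose specificity is >= min_specificity."""
    best = 0.0  # a threshold above every score gives specificity 1 and sensitivity 0
    for threshold in sorted(set(y_prob)):
        sensitivity, specificity = _rates(y_true, y_prob, threshold)
        if specificity >= min_specificity:
            best = max(best, sensitivity)
    return best


# Toy example with made-up labels and probabilities (not challenge data).
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.45, 0.6, 0.7, 0.9]
print(auc_roc(y_true, y_prob))
print(balanced_accuracy(y_true, y_prob))
print(sensitivity_at_specificity(y_true, y_prob))
```

In practice an established library implementation would likely be preferred; the pure-Python version only makes explicit how a per-examination probability feeds each metric.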
The EchoRisk-MICCAI challenge aligns algorithm evaluation with clinically actionable thresholds rather than abstract metrics alone, while retaining standard benchmarking metrics for fair comparison. For functional estimation, performance is interpreted within guideline-informed margins: absolute LVEF errors within 5 percentage points are considered clinically acceptable, while deviations beyond 10 percentage points may alter management around key cut-offs. These thresholds are consistent with commonly reported inter-observer variability in routine echocardiography and therefore provide clinically interpretable reference points for algorithm evaluation. A relative GLS reduction greater than 15% is treated as clinically meaningful. For dysfunction detection and early risk prediction, evaluation extends beyond AUC to clinically relevant operating points, prioritizing sensitivity to minimize missed high-risk patients. By embedding decision-oriented thresholds and real-world variability into evaluation, the challenge benchmarks models according to their potential to influence clinical management and prevent irreversible cardiac dysfunction.
The dataset is being collected under approved ethics protocols of the CARDIOCARE project consortium, following (i) fully anonymized processes according to GDPR and institutional guidelines and (ii) adequate processing to remove identifiable metadata. Dataset release to challenge participants will occur only after completion of the required secondary-use approvals from participating institutions. The final dataset release plan will follow MICCAI standards, including training/validation/testing partitions with secure test-set handling. All reference measurements are derived from routine clinical workflows across multiple centers and vendors and therefore reflect real-world acquisition and measurement variability rather than idealized research-grade annotations.
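
The clinically oriented margins described for functional estimation (LVEF errors within 5 or beyond 10 percentage points, and a relative GLS reduction above 15%) could be applied along the lines of the following Python sketch. All function names and band labels are hypothetical; in particular, the label for errors between 5 and 10 points is an assumption, since the proposal does not name that range. GLS is assumed to be reported as a negative percentage, so the reduction is computed on its magnitude. Ejection fraction is derived from the volumes with the standard definition, 100 × (EDV − ESV) / EDV.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """LVEF in percent from end-diastolic (EDV) and end-systolic (ESV) volumes."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


def lvef_error_band(lvef_pred: float, lvef_ref: float) -> str:
    """Classify the absolute LVEF error (percentage points) into the bands described in the text."""
    error = abs(lvef_pred - lvef_ref)
    if error <= 5.0:
        return "acceptable"
    if error > 10.0:
        return "may_alter_management"
    return "intermediate"  # assumed label: the text does not name the 5-10 point range


def gls_relative_reduction(gls_baseline: float, gls_followup: float) -> float:
    """Relative reduction of GLS magnitude from baseline to follow-up, in percent."""
    baseline = abs(gls_baseline)
    if baseline == 0:
        raise ValueError("baseline GLS must be non-zero")
    return 100.0 * (baseline - abs(gls_followup)) / baseline


def gls_reduction_is_meaningful(gls_baseline: float, gls_followup: float,
                                threshold_pct: float = 15.0) -> bool:
    """True when the relative GLS reduction exceeds the threshold (15% in the text)."""
    return gls_relative_reduction(gls_baseline, gls_followup) > threshold_pct


# Toy values for illustration only (not challenge data).
print(ejection_fraction(100.0, 45.0))
print(lvef_error_band(55.0, 58.0))
print(lvef_error_band(44.0, 56.0))
print(gls_reduction_is_meaningful(-20.0, -16.0))
```

Reporting the share of examinations in each error band alongside a standard regression metric is one possible way to connect the two views of performance the text describes.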
For training purposes, we encourage participants to use other public datasets and echocardiography foundation models. Participants who use external public datasets or foundation models for pretraining or model development will be required to disclose them, and external retrieval or internet access during inference will not be permitted.
Organizing Team Qualifications. The organizing team combines expertise in (i) cardiovascular imaging; (ii) oncology and cardio-oncology clinical practice; (iii) medical AI research, including segmentation, video analysis, prognostic modeling, multimodal learning, and foundation models; (iv) responsible AI, robustness evaluation, and fairness frameworks; and (v) large-scale challenge organization and data curation, ensuring that the challenge is scientifically rigorous and clinically aligned.

Keywords

Therapy-induced cardiotoxicity, Real-world clinical data, Cardiac function assessment, Echocardiography video analysis, MICCAI 2026 challenge
