Ultralytics YOLO

🌟 Summary

Ultralytics v8.4.40 introduces per-image precision/recall/F1 tracking during validation (led by PR #24089 from @Laughing-q), making it much easier to see exactly which images your model handles well or poorly. 📈🖼️

📊 Key Changes

- New per-image validation metrics added to results: precision, recall, f1, tp, fp, fn for each image. Exposed via metrics.box.image_metrics (and also for seg and pose where applicable). ✅
- Detection validation pipeline updated to store image name and compute image-level stats consistently with validation matching logic. 🔍
- Distributed (multi-GPU) validation support now gathers and merges image_metrics correctly across ranks, so results remain complete in larger training setups. 🧠⚙️
- Metrics classes extended with:
  - image_metrics storage
  - update helpers
  - clear/reset helpers to prevent stale metrics between runs.
- Docs updated across validation/task guides (detect, segment, pose, OBB, insights, custom trainer) with examples showing how to access per-image metrics. 📚
- Version bump: 8.4.39 ➜ 8.4.40 🚀

🎯 Purpose & Impact

- Faster debugging of weak samples: You can now pinpoint problematic images directly instead of relying only on dataset-wide averages. 🎯
- Better dataset curation: Find images causing high false positives/false negatives and decide whether to relabel, augment, or rebalance. 🧹
- More actionable model evaluation: Teams get practical, image-level insight for error analysis and iterative improvement. 🔁
- Reliable at scale: Works cleanly in multi-GPU validation, so enterprise and research workflows benefit too. 🏗️
- Broad usability: Useful for both beginners and advanced users working with YOLO models, especially YOLO26 validation workflows. 🤝

What's Changed

- ultralytics 8.4.40 Per-image Precision and Recall by @Laughing-q in https://github.com/ultralytics/ultralytics/pull/24089

Full Changelog: https://github.com/ultralytics/ultralytics/compare/v8.4.39...v8.4.40
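As a rough illustration of the error-analysis workflow described in these notes, the sketch below ranks images by their per-image F1 score so the weakest samples surface first. The notes confirm only that per-image precision, recall, f1, tp, fp, and fn are exposed via metrics.box.image_metrics; the exact container layout is not stated, so the code assumes a mapping from image name to a dict with those six keys. The helpers prf_from_counts and worst_images are hypothetical names introduced here and are not part of the Ultralytics API.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: per-image metrics look like
#   {"img_001.jpg": {"precision": ..., "recall": ..., "f1": ..., "tp": ..., "fp": ..., "fn": ...}}
# The release notes name these fields but do not document the container layout.
ImageMetrics = Dict[str, Dict[str, float]]


def prf_from_counts(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard precision/recall/F1 from match counts (hypothetical helper)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def worst_images(
    image_metrics: ImageMetrics, k: int = 5, key: str = "f1"
) -> List[Tuple[str, Dict[str, float]]]:
    """Return the k images with the lowest value of `key` (hypothetical helper)."""
    return sorted(image_metrics.items(), key=lambda item: item[1][key])[:k]


# Stand-in data in the assumed layout. With a real run this would come from
# something like the following (weights path is a placeholder):
#   from ultralytics import YOLO
#   metrics = YOLO("path/to/weights.pt").val(data="coco8.yaml")
#   image_metrics = metrics.box.image_metrics
sample: ImageMetrics = {}
for name, (tp, fp, fn) in {
    "street.jpg": (8, 2, 2),
    "crowd.jpg": (3, 6, 9),
    "empty_lot.jpg": (0, 4, 0),
}.items():
    p, r, f1 = prf_from_counts(tp, fp, fn)
    sample[name] = {"precision": p, "recall": r, "f1": f1, "tp": tp, "fp": fp, "fn": fn}

for name, m in worst_images(sample, k=2):
    print(f"{name}: f1={m['f1']:.2f} tp={m['tp']} fp={m['fp']} fn={m['fn']}")
```

Sorting by "fp" or "fn" instead of "f1" (and reversing the order) is one way to find the images driving false positives or false negatives when deciding what to relabel or augment. If the real image_metrics layout differs from the assumed one, only the lookup inside worst_images needs to change.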
