
📢 This dataset is published in the IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS), 2025.

📄 DOI: 10.1109/JSTARS.2025.3600613

Abstract: Existing remote sensing image change captioning (RSICC) methods often fail under challenges such as illumination differences, viewpoint changes, and blur, which leads to inaccurate captions, especially in no-change regions. Moreover, images acquired at different spatial resolutions or with registration errors tend to degrade the captions. To address these issues, we introduce SECOND-CC, a novel RSICC dataset featuring high-resolution RGB image pairs, semantic segmentation maps, and diverse real-world scenarios. SECOND-CC contains 6,041 pairs of bitemporal remote sensing images and 30,205 sentences describing the differences between the images.

Additionally, we propose MModalCC, a multimodal framework that integrates semantic and visual data using advanced attention mechanisms, including Cross-Modal Cross Attention and Multimodal Gated Cross Attention. We also adapt MModalCC to handle noisy semantic inputs by integrating a Semantic Change Detector, improving its robustness for real-world applications. Detailed ablation studies and attention visualizations further demonstrate its effectiveness and its ability to address the challenges of RSICC.

Comprehensive experiments show that MModalCC outperforms state-of-the-art RSICC methods, including RSICCformer, Chg2Cap, and PSNet, with a +4.6% improvement in BLEU4 score and a +9.6% improvement in CIDEr score on the SECOND-CC dataset. MModalCC was further validated on the LEVIR-MCI benchmark, where it achieved an average S∗m score of 83.51, significantly outperforming previous state-of-the-art methods. We will make our dataset and codebase publicly available to facilitate future research at https://github.com/ChangeCapsInRS/SecondCC.
