ComPtr: Toward Diverse Bi-Source Dense Prediction Tasks via a Simple Yet General Complementary Transformer

Keywords: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition

Youwei Pang; Xiaoqi Zhao; Lihe Zhang; Huchuan Lu

arXiv.org e-Print Archive

Preprint . 2023

Data sources: arXiv.org e-Print Archive

IEEE Transactions on Pattern Analysis and Machine Intelligence

Article . 2025 . Peer-reviewed

License: IEEE Copyright

Data sources: Crossref

https://dx.doi.org/10.48550/ar...

Article . 2023

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

DBLP

Article

Data sources: DBLP

Publication: Article, Preprint . 01 Oct 2025

Embargo end date: 01 Jan 2023

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 47, pages 8613-8629 (ISSN: 0162-8828, eISSN: 1939-3539)

Authors: Youwei Pang; Xiaoqi Zhao; Lihe Zhang; Huchuan Lu

doi: 10.1109/tpami.2025.3578494, 10.48550/arxiv.2307.12349

arXiv: 2307.12349

Abstract

Deep learning (DL) has advanced the field of dense prediction while gradually dissolving the inherent barriers between different tasks. However, most existing works focus on designing architectures and constructing visual cues for a single specific task, ignoring the potential uniformity introduced by the DL paradigm. In this paper, we attempt to construct a novel Complementary transformer, ComPtr, for diverse bi-source dense prediction tasks. Specifically, unlike existing methods that over-specialize in a single task or a subset of tasks, ComPtr starts from the more general concept of bi-source dense prediction. Building on the fundamental reliance of these tasks on information complementarity, we propose consistency enhancement and difference awareness components, with which ComPtr can extract and collect important visual semantic cues from different image sources for diverse tasks. ComPtr treats different inputs equally and builds an efficient dense interaction model in sequence-to-sequence form on top of the transformer. This task-generic design provides a smooth foundation for constructing a unified model that can simultaneously deal with various kinds of bi-source information. In extensive experiments across several representative vision tasks, namely remote sensing change detection, RGB-T crowd counting, RGB-D/T salient object detection, and RGB-D semantic segmentation, the proposed method consistently obtains favorable performance. The code will be available at https://github.com/lartpang/ComPtr.

Related Organizations

Dalian Polytechnic University
China (People's Republic of)

1 Research products, page 1 of 1

ComPtr software on GitHub
IsRelatedTo

Impact by BIP!

- Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
