MIT-10M: A Large Scale Parallel Corpus of Multilingual Image Translation

Name: MIT-10M: A Large Scale Parallel Corpus of Multilingual Image Translation
Keywords: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition

Bo Li 0131; Shaolin Zhu; Lijie Wen 0001

arXiv.org e-Print Archive

Preprint . 2024

Data sources: arXiv.org e-Print Archive

https://dx.doi.org/10.48550/ar...

Article . 2024

License: CC BY NC ND

Data sources: Datacite

DBLP

Conference object

Data sources: DBLP

DBLP

Article

Data sources: DBLP

Publication type: Article, Preprint, Conference object
Date: 01 Jan 2024
Embargo end date: 01 Jan 2024
Publisher: arXiv
Journal: CoRR, volume abs/2412.07147

doi: 10.48550/arxiv.2412.07147

arXiv: 2412.07147

Abstract

Image Translation (IT) holds immense potential across diverse domains, enabling the translation of textual content within images into various languages. However, existing datasets often suffer from limitations in scale, diversity, and quality, hindering the development and evaluation of IT models. To address this issue, we introduce MIT-10M, a large-scale parallel corpus of multilingual image translation with over 10M image-text pairs derived from real-world data, which has undergone extensive data cleaning and multilingual translation validation. It contains 840K images in three sizes, 28 categories, tasks at three levels of difficulty, and image-text pairs in 14 languages, a considerable improvement over existing datasets. We conduct extensive experiments to evaluate and train models on MIT-10M. The experimental results clearly indicate that our dataset is more adaptable for evaluating how models perform on challenging and complex real-world image translation tasks. Moreover, the model fine-tuned on MIT-10M achieves three times the performance of the baseline model, further confirming the dataset's superiority.

Accepted in COLING 2025

Related Organizations

Tsinghua University
China (People's Republic of)
Tianjin University
China (People's Republic of)
Hebei University
China (People's Republic of)

3 Research products

EasyOCR software on GitHub (IsRelatedTo)
langid.py software on GitHub (IsRelatedTo)
langdetect software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 0 (citations derived from selected sources; an alternative to the "Influence" indicator)
popularity: Average (the "current" impact or attention of an article in the research community at large, based on the underlying citation network)
influence: Average (the overall impact of an article in the research community at large, based on the underlying citation network, diachronically)
impulse: Average (the initial momentum of an article directly after its publication, based on the underlying citation network)
