publication . Preprint . 2018

Scene Text Detection and Recognition: The Deep Learning Era

Long, Shangbang; He, Xin; Yao, Cong
Open Access English
  • Published: 10 Nov 2018
Abstract
With the rise and development of deep learning, computer vision has been tremendously transformed and reshaped. As an important research area in computer vision, scene text detection and recognition has inevitably been influenced by this wave of revolution, consequently entering the era of deep learning. In recent years, the community has witnessed substantial advancements in mindset, approach, and performance. This survey aims to summarize and analyze the major changes and significant progress of scene text detection and recognition in the deep learning era. Through this article, we seek to: (1) introduce new insights and ideas; (2) highlight rec...
Subjects
free text keywords: Computer Science - Computer Vision and Pattern Recognition