publication . Preprint . Other literature type . Part of book or chapter of book . Article . 2018

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

Pengyuan Lyu; Minghui Liao; Cong Yao; Wenhao Wu; Xiang Bai
Open Access English
  • Published: 05 Jul 2018
Abstract
Unifying text detection and text recognition through end-to-end training has become a trend in reading text in the wild, since the two tasks are closely related and complementary. In this paper, we investigate the problem of scene text spotting, which aims to detect and recognize text in natural images simultaneously. We present an end-to-end trainable neural network named Mask TextSpotter. Unlike previous text spotters, which follow a pipeline consisting of a proposal generation network and a sequence-to-sequence recognition network, Mask TextSpotter enjoys a simple and smooth end-to-end learning procedure, in which both detection ...
Subjects
Free-text keywords: Computer Science - Computer Vision and Pattern Recognition, Artificial intelligence, Artificial neural network, Spotting, Computer science, Text detection, End-to-end principle, Text recognition, Deep neural networks, Computer vision, Segmentation, Computational Theory and Mathematics, Software, Applied Mathematics, Pattern recognition