publication . Doctoral thesis . 2016

Détection et reconnaissance du texte arabe incrusté dans les vidéos (Detection and recognition of Arabic text embedded in videos)

Yousfi, Sonia
Open Access English
  • Published: 06 Jul 2016
  • Publisher: HAL CCSD
Abstract
This thesis focuses on the detection and recognition of Arabic text embedded in videos. It proposes several approaches that are robust both to the variability of Arabic text (fonts, scales, sizes, etc.) and to challenging environmental and acquisition conditions (contrast, degradation, complex backgrounds, etc.). We introduce several machine-learning-based solutions for robust text detection that do not rely on any pre-processing. The first method is based on Convolutional Neural Networks (ConvNet), while the others use a specific boosting cascade to select relevant hand-crafted text features. For text recognition, our methodology is segmentation-free. Text images are transform...
Subjects
free text keywords: Information Technology, Neural networks, Deep learning, Video contents, Optical character recognition, [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing, Arabic text