Grammatical Templates: Improving Text Difficulty Evaluation for Language Learners

Preprint English OPEN
Wang, Shuhan ; Andersen, Erik (2016)
  • Subject: Computer Science - Computation and Language | Computer Science - Artificial Intelligence

Language students are most engaged when reading texts at an appropriate difficulty level. However, existing methods of evaluating text difficulty focus mainly on vocabulary and do not prioritize grammatical features, so they do not work well for language learners with limited knowledge of grammar. In this paper, we introduce grammatical templates, the expert-identified units of grammar that students learn in class, as an important feature for text difficulty evaluation. Experimental classification results show that grammatical template features significantly improve text difficulty prediction accuracy over baseline readability features, by 7.4%. Moreover, we build a simple and human-understandable text difficulty evaluation approach that reaches 87.7% accuracy using only 5 grammatical template features.
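As a rough illustration of what a grammatical template feature might look like, the sketch below counts how often a few grammar patterns occur per sentence and returns the counts as a feature vector that a classifier could consume. It is a minimal sketch under stated assumptions, not the authors' method: the three English regex patterns, the `TEMPLATES` table, and the `template_features` function are all hypothetical, since the abstract does not list the actual expert-identified templates or the target language.

```python
import re
from typing import Dict, List

# Hypothetical templates for illustration only. The paper's templates are
# expert-identified and are not given in the abstract.
TEMPLATES: Dict[str, str] = {
    "present_perfect": r"\b(?:has|have)\s+\w+(?:ed|en)\b",
    "conditional_if": r"\bif\b",
    "relative_pronoun": r"\b(?:which|who|whom)\b",
}


def template_features(text: str) -> List[float]:
    """Return the per-sentence frequency of each template, in TEMPLATES order."""
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]", text) if s.strip()]
    n_sentences = max(len(sentences), 1)  # avoid division by zero on empty input
    return [
        len(re.findall(pattern, text, flags=re.IGNORECASE)) / n_sentences
        for pattern in TEMPLATES.values()
    ]


if __name__ == "__main__":
    sample = "I have visited Kyoto. If it rains, I stay home."
    print(dict(zip(TEMPLATES, template_features(sample))))
```

Normalizing by sentence count keeps the features comparable across texts of different lengths; a real system would likely rely on a parser or curated pattern set rather than surface regexes.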
