- University of Malta Malta
- European Research Infrastructure for Language Resources and Technology Netherlands
- INESC ID - INSTITUTO DE ENGENHARIA DE SISTEMAS E COMPUTADORES, INVESTIGACAO E DESENVOLVIMENTO EM LISBOA Portugal
- National University of Ireland, Galway Ireland
- UNIVERZA V LJUBLJANI Slovenia
- Eurac Research, Institute for Applied Linguistics Italy
- Common Language Resources and Technology Infrastructure, European Research Infrastructure Consortium, Utrecht University Netherlands
- "Alexandru Ioan Cuza" University of Iași Romania
- Jerusalem College of Technology Israel
- CLARIN ERIC Netherlands
- Open University of Cyprus Cyprus
- Eurac Research Italy
- European Research Infrastructure for Language Resources and Technology Netherlands
- University of Helsinki Finland
- Eurac Research Italy
- University of Ljubljana Slovenia
In this paper, we introduce a generic approach that combines implicit crowdsourcing and language learning to mass-produce language resources (LRs) for any language for which a crowd of language learners can be involved. We present the approach by explaining its core paradigm, which consists of pairing specific types of LRs with specific exercises, by detailing both its strengths and challenges, and by discussing the extent to which these challenges have been addressed so far. Accordingly, we also report on ongoing proof-of-concept efforts to develop the first prototypical implementation of the approach, which aims to correct and extend an LR called ConceptNet using input crowdsourced from language learners. We then present an international network, the European Network for Combining Language Learning with Crowdsourcing Techniques (enetCollect), which provides the context to accelerate the implementation of the generic approach. Finally, we exemplify how the approach can be used in several language learning scenarios to produce a multitude of NLP resources, and how it can therefore alleviate the long-standing NLP issue of the lack of LRs.
peer-reviewed