Vikidia.org is a children's encyclopedia whose content targets 8-13 year olds and is available in several European languages. Our dataset contains 24,660 texts covering 6,165 articles, each at 2 reading levels in both English and French. That is, each article in the corpus has four versions (en, en-simple, fr, and fr-simple), and there are 6,165 slugs in total. What makes the dataset unique is that these four versions are parallel, document-level aligned texts. Although we did not create paragraph- or sentence-level alignments for the corpus, we hope it will be a useful dataset for future English and French research on Automatic Readability Assessment (ARA) and Automatic Text Simplification. This is the first such dataset in ARA, and perhaps the first readily available French readability dataset. The dataset is used in the paper "A neural pairwise ranking model for automatic readability assessment" by Justin Lee and Sowmya Vajjala, to appear in Findings of ACL 2022.
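The document-level alignment described above can be illustrated with a minimal Python sketch. It assumes the texts have already been loaded into a mapping keyed by `(slug, version)`; the actual file layout of the release is not described here, so the loading step is omitted, and the helper names `group_by_slug` and `readability_pairs` as well as the toy records are hypothetical. It also assumes that the `-simple` version is the easier of the two reading levels, as its name suggests.

```python
from typing import Dict, List, Tuple

# The four versions named in the dataset description.
VERSIONS = ("en", "en-simple", "fr", "fr-simple")


def group_by_slug(texts: Dict[Tuple[str, str], str]) -> Dict[str, Dict[str, str]]:
    """Group (slug, version) -> text records by slug.

    Only slugs that have all four versions are kept, mirroring the
    document-level alignment of the corpus.
    """
    grouped: Dict[str, Dict[str, str]] = {}
    for (slug, version), text in texts.items():
        grouped.setdefault(slug, {})[version] = text
    return {
        slug: versions
        for slug, versions in grouped.items()
        if all(v in versions for v in VERSIONS)
    }


def readability_pairs(
    grouped: Dict[str, Dict[str, str]], lang: str
) -> List[Tuple[str, str, str]]:
    """Return (slug, regular_text, simple_text) triples for one language.

    Assumption: the "<lang>-simple" version is the easier reading level.
    """
    return [
        (slug, versions[lang], versions[f"{lang}-simple"])
        for slug, versions in sorted(grouped.items())
    ]


# Toy records for illustration only; these are not taken from the dataset.
toy = {
    ("volcano", "en"): "A volcano is a rupture in the crust of a planet.",
    ("volcano", "en-simple"): "A volcano is a mountain that can erupt.",
    ("volcano", "fr"): "Un volcan est une structure géologique.",
    ("volcano", "fr-simple"): "Un volcan est une montagne qui crache de la lave.",
    ("moon", "en"): "The Moon is Earth's only natural satellite.",  # incomplete slug
}

grouped = group_by_slug(toy)
en_pairs = readability_pairs(grouped, "en")
fr_pairs = readability_pairs(grouped, "fr")
```

With the full corpus, a grouping of this kind would be expected to yield 6,165 slugs with four texts each, which matches the stated total of 6,165 × 4 = 24,660 texts.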
Automatic Readability Assessment, Natural Language Processing, Text Simplification
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 34 |
| downloads | | 20 |

Views provided by UsageCounts
Downloads provided by UsageCounts