Learning to quantify: LeQua 2024 datasets

The aim of LeQua 2024 (the 2nd data challenge on Learning to Quantify) is to allow the comparative evaluation of methods for “learning to quantify” in textual datasets, i.e., methods for training predictors of the relative frequencies of the classes of interest in sets of unlabelled textual documents. These predictors (called “quantifiers”) will be required to issue predictions for several such sets, some of them characterized by class frequencies radically different from the ones of the training set.

LeQua 2024 is supported by the SoBigData++ project, funded by the European Commission (Grant 871042) under the H2020 Programme INFRAIA-2019-1; by the AI4Media project, funded by the European Commission (Grant 951911) under the H2020 Programme ICT-48-2020; by the FAIR and QuaDaSh (Grant P2022TB5JF) projects, funded by the European Union under the NextGenerationEU funding scheme. The organizers’ opinions do not necessarily reflect those of the European Commission.

Challenge website: https://lequa2024.github.io/
Info on data and formats: https://github.com/HLT-ISTI/LeQua2024_scripts

Tasks

Task T1: This task is concerned with evaluating binary quantifiers, i.e., quantifiers that must only predict the relative frequencies of a class and its complement; the data used are affected by prior probability shift (a.k.a. “label shift”). This task is akin to Task T1A of LeQua 2022.

Task T2: This task is concerned with evaluating single-label multi-class quantifiers, i.e., quantifiers that operate on datapoints each belonging to exactly one among a set of n>2 classes; here too, the data used are affected by prior probability shift. This task is akin to Task T1B of LeQua 2022.

Task T3: This task is concerned with evaluating ordinal quantifiers, i.e., quantifiers that operate on a set of n>2 totally ordered classes; here too, the data used are affected by prior probability shift. This task is new to LeQua 2024.

Task T4: Like Task T1, this task is concerned with evaluating binary quantifiers; unlike in Task T1, the data used are affected by covariate shift. This task is new to LeQua 2024.
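
To make the notion of a binary quantifier concrete, here is a minimal sketch of two well-known baselines from the quantification literature: "classify and count" (CC) and "adjusted classify and count" (ACC). This is an illustration only and is not taken from the LeQua 2024 scripts. The function names, the 0/1 prediction lists, and the tpr/fpr values are all assumptions made for the example.

```python
def classify_and_count(predictions):
    """CC: estimate the positive-class prevalence as the fraction of
    items that a classifier labelled positive (1) rather than negative (0)."""
    return sum(predictions) / len(predictions)


def adjusted_classify_and_count(predictions, tpr, fpr):
    """ACC: correct the CC estimate using the classifier's true positive
    rate (tpr) and false positive rate (fpr), assumed here to have been
    estimated on held-out training data."""
    cc = classify_and_count(predictions)
    if tpr == fpr:
        # The correction is undefined in this case; fall back to CC.
        return cc
    acc = (cc - fpr) / (tpr - fpr)
    # Clip to [0, 1] so the result is a valid prevalence.
    return min(1.0, max(0.0, acc))


# Hypothetical test sample of 100 documents whose true positive prevalence
# is 0.5, labelled by a classifier with tpr = 0.8 and fpr = 0.1:
# 50 * 0.8 + 50 * 0.1 = 45 documents are predicted positive.
predictions = [1] * 45 + [0] * 55

cc_estimate = classify_and_count(predictions)
acc_estimate = adjusted_classify_and_count(predictions, tpr=0.8, fpr=0.1)
print(cc_estimate, acc_estimate)
```

In this toy sample CC underestimates the true prevalence, while the ACC correction recovers a value close to 0.5. The correction relies on tpr and fpr staying the same between training and test data, which is the usual assumption under prior probability shift; it is not expected to hold in general under covariate shift.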

Keywords

machine learning, prevalence estimation, quantification
