
AOL4FOLTR is the first learning-to-rank (LTR) dataset designed specifically for evaluating federated online learning-to-rank (FOLTR) algorithms. Because it includes user identifiers and timestamps, the dataset allows real user behavior to be simulated with heterogeneous data and in asynchronous federated learning settings.

The dataset consists of two files: letor.txt.gz (55G uncompressed) and metadata.csv.

letor.txt contains the query-document pairs for all query logs in standard LETOR format. Each query-document pair holds a binary label derived from user clicks and is represented by a 103-dimensional feature vector. We document the features in our code repository.

The query logs are cross-referenced (by qid) in metadata.csv, which provides contextual information for each query: the user, the timestamp, the raw query, the target document ID, and a list of 20 candidate documents.

The document IDs and user IDs map directly to the AOL-IA dataset; the query IDs do not. For access to the raw document contents, please refer to the AOL-IA dataset.
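As a rough illustration of reading the LETOR file, the sketch below parses one line into a label, a query ID, and a dense feature vector, and streams the compressed file line by line rather than loading it whole. It assumes the common LETOR line layout (`label qid:<id> 1:<value> 2:<value> ... # optional comment`) with 1-based feature indices; the function names `parse_letor_line` and `iter_letor` are hypothetical and are not taken from the dataset's code repository.

```python
import gzip
from typing import Iterator, List, Tuple

NUM_FEATURES = 103  # dimensionality stated in the dataset description


def parse_letor_line(
    line: str, num_features: int = NUM_FEATURES
) -> Tuple[int, str, List[float]]:
    """Parse one LETOR line into (label, qid, dense feature vector).

    Assumes 1-based feature indices; features absent from the line stay 0.0.
    """
    # Drop an optional trailing comment introduced by '#'.
    body = line.split("#", 1)[0].strip()
    tokens = body.split()
    if len(tokens) < 2 or not tokens[1].startswith("qid:"):
        raise ValueError(f"not a LETOR line: {line!r}")

    label = int(float(tokens[0]))
    qid = tokens[1][len("qid:"):]  # kept as a string for joining on qid

    features = [0.0] * num_features
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index) - 1] = float(value)
    return label, qid, features


def iter_letor(path: str) -> Iterator[Tuple[int, str, List[float]]]:
    """Stream a gzip-compressed LETOR file without decompressing it to disk."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_letor_line(line)
```

For example, `parse_letor_line("1 qid:42 1:0.5 3:2.0 # doc", num_features=3)` yields a label of 1, the query ID `"42"`, and the vector `[0.5, 0.0, 2.0]`. The query ID is the value that would be looked up in metadata.csv to recover the user and timestamp for a given record.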
Learning to Rank, Information Storage and Retrieval, Federated Learning
