HuggingFace Datasets Integration

This release integrates HuggingFace datasets as the core dataset management interface, removing previous custom downloaders.

What's Changed

Refactor Task downloading to use HuggingFace.datasets by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/300
Add templates and update docs by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/308
Add dataset features to TriviaQA by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/305
Add SWAG by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/306
Fixes for using lm_eval as a library by @dirkgr in https://github.com/EleutherAI/lm-evaluation-harness/pull/309
Researcher2 by @researcher2 in https://github.com/EleutherAI/lm-evaluation-harness/pull/261
Suggested updates for the task guide by @StephenHogg in https://github.com/EleutherAI/lm-evaluation-harness/pull/301
Add pre-commit by @Mistobaan in https://github.com/EleutherAI/lm-evaluation-harness/pull/317
Decontam import fix by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/321
Add bootstrap_iters kwarg by @Muennighoff in https://github.com/EleutherAI/lm-evaluation-harness/pull/322
Update decontamination.md by @researcher2 in https://github.com/EleutherAI/lm-evaluation-harness/pull/331
Fix key access in squad evaluation metrics by @konstantinschulz in https://github.com/EleutherAI/lm-evaluation-harness/pull/333
Fix make_disjoint_window for tail case by @richhankins in https://github.com/EleutherAI/lm-evaluation-harness/pull/336
Manually concat tokenizer revision with subfolder by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/343
[deps] Use minimum versioning for numexpr by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/352
Remove custom datasets that are in HF by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/330
Add TextSynth API by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/299
Add the original LAMBADA dataset by @jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/357

New Contributors

@dirkgr made their first contribution in https://github.com/EleutherAI/lm-evaluation-harness/pull/309
@Mistobaan made their first contribution in https://github.com/EleutherAI/lm-evaluation-harness/pull/317
@konstantinschulz made their first contribution in https://github.com/EleutherAI/lm-evaluation-harness/pull/333
@richhankins made their first contribution in https://github.com/EleutherAI/lm-evaluation-harness/pull/336

Full Changelog: https://github.com/EleutherAI/lm-evaluation-harness/compare/v0.2.0...v0.3.0
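
The move from custom downloaders to HuggingFace datasets can be pictured roughly as follows. This is a minimal sketch, not the harness's actual code: the class `HFBackedTask`, its `loader` parameter, and the attribute names `DATASET_PATH` and `DATASET_NAME` are hypothetical names chosen for illustration, and the `"swag"` / `"regular"` identifiers are assumed dataset identifiers that may differ from what the harness uses. The only real library call is `datasets.load_dataset(path, name)`.

```python
from typing import Any, Callable, Optional


class HFBackedTask:
    """Hypothetical task base class that delegates downloading to a dataset loader."""

    # Hypothetical attributes: dataset identifier and optional configuration name.
    DATASET_PATH: str = ""
    DATASET_NAME: Optional[str] = None

    def __init__(self, loader: Optional[Callable[..., Any]] = None) -> None:
        if loader is None:
            # Imported lazily so the sketch can be exercised without the library installed.
            from datasets import load_dataset

            loader = load_dataset
        # A single call replaces a task-specific download/unpack routine.
        self.dataset = loader(self.DATASET_PATH, self.DATASET_NAME)

    def validation_docs(self) -> Any:
        # Splits are looked up by name on the returned dataset object.
        return self.dataset["validation"]


class Swag(HFBackedTask):
    # Assumed identifiers, for illustration only.
    DATASET_PATH = "swag"
    DATASET_NAME = "regular"
```

Under these assumptions, `Swag().validation_docs()` would fetch and cache the data through the library on first use, while passing a custom `loader` lets the same class be exercised offline.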



