
## Complete Benchmark Dataset

Systematic energy efficiency measurements for quantized language models across 0.5B-14B parameters on NVIDIA Ada Lovelace (RTX 4090D), Blackwell (RTX 5090), and Ampere (A800 80GB) architectures.

**113+ configurations** covering five precision methods: FP16, NF4, INT8 (default), INT8 (pure bnb), and FP8.

### What's Included

- Complete metadata and experimental configurations
- Raw energy measurements (RTX 4090D, RTX 5090, A800 80GB)
- Model coverage: Qwen2, TinyLlama, Mistral, Yi-1.5
- Data quality: CV < 2%, n=2 repeated trials

### Key Findings

- Small-Model Quantization Paradox: +25-56% energy for models <3B
- Break-even threshold: 4.2B (Ada) / 5.2B (Blackwell)
- INT8 default is 4.6x less efficient than NF4 for small models
- FP8 Paradox: up to +701% energy overhead on RTX 5090 due to software immaturity

### Try It Interactively

**EcoCompute ClawHub Skill**: Query these benchmarks conversationally with the EcoLobster AI advisor.

https://clawhub.ai/hongping-zh/ecocompute

### Documentation

See [data/README.md](https://github.com/hongping-zh/ecocompute-ai/tree/main/data) for full documentation, citation format, and quick start guide.
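As a rough aid for reading the percentages quoted under Key Findings and the CV figure under What's Included, the sketch below shows one common way such numbers are computed. It is not taken from the dataset's own tooling: the function names, the choice of baseline, the use of the sample standard deviation, and all numeric inputs are illustrative assumptions.

```python
from statistics import mean, stdev


def energy_overhead_pct(baseline_j: float, candidate_j: float) -> float:
    """Percent change in energy relative to a baseline run.

    Positive values mean the candidate used more energy.
    Assumption: both inputs are energy in joules for the same workload.
    """
    if baseline_j <= 0:
        raise ValueError("baseline energy must be positive")
    return (candidate_j - baseline_j) / baseline_j * 100.0


def coefficient_of_variation_pct(trials: list[float]) -> float:
    """Coefficient of variation (stdev / mean) in percent.

    Assumption: sample standard deviation (n - 1 denominator); the
    dataset may define CV differently.
    """
    if len(trials) < 2:
        raise ValueError("need at least two trials")
    return stdev(trials) / mean(trials) * 100.0


if __name__ == "__main__":
    # Made-up numbers for illustration only, not values from the dataset.
    baseline_j = 100.0
    quantized_j = 125.0
    print(f"overhead: {energy_overhead_pct(baseline_j, quantized_j):+.1f}%")
    print(f"CV: {coefficient_of_variation_pct([100.0, 101.0]):.2f}%")
```

Under these assumptions, a configuration that uses 125 J where the baseline uses 100 J reports a +25% overhead, and two trials of 100 J and 101 J give a CV below 2%.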
### Interactive Dashboard

https://hongping-zh.github.io/ecocompute-dynamic-eval/

---

**License**: CC BY 4.0 | **Citation**: See data/README.md

### Community Adoption

- Referenced in [HuggingFace Optimum official documentation](https://huggingface.co/docs/optimum/concept_guides/quantization) ([PR #2410](https://github.com/huggingface/optimum/pull/2410), merged Mar 2026)
- Dataset mirrored on [HuggingFace Hub](https://huggingface.co/datasets/hongpingzhang/ecocompute-energy-efficiency)
- Available as interactive AI skill on [ClawHub](https://clawhub.ai/hongping-zh/ecocompute)
- FP8 energy anomaly confirmed by [torchao maintainers](https://github.com/pytorch/ao/issues/4094)
- Related contributions: [bitsandbytes PR #1882](https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1882), [Transformers PR #44407](https://github.com/huggingface/transformers/pull/44407)

---
