Powered by OpenAIRE graph
Found an issue? Give us feedback
ZENODO
Software
Data sources: ZENODO

NaviMed-UMB: hardware envelope studies for local AI deployment on consumer RDNA 4 GPUs

Authors: Minarowski, Łukasz

Abstract

An engineering log and benchmark suite documenting the practical envelope of running modern large language models (up to 70B parameters) on a consumer-grade dual AMD Radeon AI PRO R9700 32 GB workstation under ROCm 7.2 and vLLM 0.19. Version 0.1.0 documents the working configurations for Qwen 3.6 27B (released 2026-04-22) on this hardware, including a quantization-performance inversion finding (BF16 outpaces FP8 by approximately 75% under the current software stack) attributable to the absence of R9700-specific FP8 kernel configurations in vLLM. Intended audience includes researchers preparing local AI infrastructure for privacy-sensitive workloads, hardware reviewers seeking reproducible methodology, and software maintainers working on RDNA 4 support in inference frameworks.
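As a rough illustration of the "envelope" arithmetic behind the abstract, the sketch below estimates weight-only memory for the parameter counts mentioned (27B and 70B) against the 2 x 32 GB of VRAM, and shows how a relative speedup such as "approximately 75%" is computed from two throughput figures. It is not taken from the benchmark suite. The function names are hypothetical, KV cache, activations, and runtime overhead are ignored, and 32 GB is treated as 32 x 10^9 bytes for simplicity.

```python
# Hypothetical helper names; a coarse, weights-only estimate (assumption:
# KV cache, activations, and framework overhead are NOT counted here).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0}


def weight_gb(params_billions: float, dtype: str) -> float:
    """Approximate weight footprint in GB (10^9 bytes) for a given dtype."""
    return params_billions * BYTES_PER_PARAM[dtype]


def fits(params_billions: float, dtype: str,
         gpus: int = 2, gb_per_gpu: float = 32.0) -> bool:
    """True if the weights alone fit in the combined VRAM of all GPUs."""
    return weight_gb(params_billions, dtype) <= gpus * gb_per_gpu


def relative_speedup_pct(fast: float, slow: float) -> float:
    """Percentage by which `fast` throughput exceeds `slow` throughput."""
    return (fast / slow - 1.0) * 100.0


if __name__ == "__main__":
    for size in (27.0, 70.0):
        for dtype in ("bf16", "fp8"):
            print(f"{size:.0f}B {dtype}: {weight_gb(size, dtype):.0f} GB "
                  f"weights, fits in 2 x 32 GB: {fits(size, dtype)}")
```

Under these assumptions a 27B model in BF16 needs about 54 GB for weights, which leaves limited headroom within 64 GB, while a 70B model exceeds 64 GB even at one byte per parameter and would need a lower-precision format or offloading.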
