Mojo: MLIR-based Performance-Portable HPC Science Kernels on GPUs for the Python Ecosystem

Keywords: Computational Engineering, Finance, and Science (cs.CE); Emerging Technologies (cs.ET); Distributed, Parallel, and Cluster Computing (cs.DC); Programming Languages (cs.PL); FOS: Computer and information sciences

William Godoy; Tatiana Melnichenko; Pedro Valero-Lara; Wael Elwasif; Philip Fackler; Rafael Ferreira Da Silva; Keita Teranishi; Jeffrey Vetter


arXiv.org e-Print Archive
Preprint, 2025. Data source: arXiv.org e-Print Archive

https://doi.org/10.1145/373159...
Article, 2025, peer-reviewed. Data source: Crossref

https://dx.doi.org/10.48550/ar...
Article, 2025. License: CC BY. Data source: Datacite


Publication: Article, Preprint, 15 Nov 2025
Embargo end date: 01 Jan 2025
Publisher: ACM
Journal: Proceedings of the SC '25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis


doi: 10.1145/3731599.3767573, 10.48550/arxiv.2509.21039

arXiv: 2509.21039


Abstract

We explore the performance and portability of the novel Mojo language for scientific computing workloads on GPUs. As the first language based on LLVM's Multi-Level Intermediate Representation (MLIR) compiler infrastructure, Mojo aims to close performance and productivity gaps by combining Python interoperability with CUDA-like syntax for compile-time portable GPU programming. We target four scientific workloads: a seven-point stencil (memory-bound), BabelStream (memory-bound), miniBUDE (compute-bound), and Hartree-Fock (compute-bound with atomic operations), and compare their performance against vendor baselines on NVIDIA H100 and AMD MI300A GPUs. We show that Mojo's performance is competitive with CUDA and HIP for memory-bound kernels, whereas gaps exist on AMD GPUs for atomic operations and on both AMD and NVIDIA GPUs for fast-math compute-bound kernels. Although the learning curve and programming requirements are still fairly low-level, Mojo can close significant gaps in the fragmented Python ecosystem at the convergence of scientific computing and AI.

Accepted at the WACCPD Workshop of the IEEE/ACM SC25 Conference (The International Conference for High Performance Computing, Networking, Storage, and Analysis), St. Louis, MO, Nov 16-21, 2025. 15 pages, 7 figures. WFG and TM contributed equally.
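
For readers unfamiliar with the first workload named in the abstract, the following is a minimal plain-Python sketch of a seven-point stencil sweep. It is not the paper's Mojo kernel: the function name `seven_point_stencil`, the flattened row-major layout, and the coefficients (-6 for the center, 1 for each neighbor) are assumptions chosen for illustration. It shows why the kernel tends to be memory-bound: each output value reads seven inputs but performs only a handful of arithmetic operations.

```python
def seven_point_stencil(u, nx, ny, nz, c_center=-6.0, c_neighbor=1.0):
    """Apply one seven-point stencil sweep to a flattened 3D grid.

    Illustrative reference only; names, layout, and coefficients are
    assumptions, not taken from the paper.
    """

    def idx(i, j, k):
        # Row-major index into the flattened nx * ny * nz grid.
        return (i * ny + j) * nz + k

    out = list(u)  # boundary cells are copied through unchanged
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                # Sum of the six face neighbors of cell (i, j, k).
                neighbors = (
                    u[idx(i - 1, j, k)] + u[idx(i + 1, j, k)]
                    + u[idx(i, j - 1, k)] + u[idx(i, j + 1, k)]
                    + u[idx(i, j, k - 1)] + u[idx(i, j, k + 1)]
                )
                out[idx(i, j, k)] = c_center * u[idx(i, j, k)] + c_neighbor * neighbors
    return out


# Usage: a 3 x 3 x 3 grid holding u = i * i has a single interior cell.
n = 3
grid = [float(i * i) for i in range(n) for _ in range(n) for _ in range(n)]
result = seven_point_stencil(grid, n, n, n)
```

On a GPU, the three nested loops would typically be replaced by one thread per interior cell, which is the CUDA-like pattern the abstract refers to.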

Related Organizations

University of Tennessee at Knoxville
United States
Oak Ridge National Laboratory
United States


Impact by BIP!

selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
