Algorithm-hardware co-design of a discontinuous Galerkin shallow-water model for a dataflow architecture on FPGA

Publication type: Article, Conference object, Other literature type
Published: 05 Jul 2021
Publisher: ACM
Journal: Proceedings of the Platform for Advanced Scientific Computing Conference
Funded by: DFG | unidentified

Authors: Tobias Kenter; Adesh Shambhu; Sara Faghih-Naini; Vadym Aizinger

doi: 10.1145/3468267.3470617

Abstract

We present the first FPGA implementation of the full simulation pipeline of a shallow water code based on the discontinuous Galerkin method. Using OpenCL and following an algorithm-hardware co-design approach, the software reference is transformed into a dataflow architecture that can process a full mesh element per clock cycle. The novel projection approach on the algorithmic level complements the pipeline and memory optimizations in the hardware design. With this, the FPGA kernels for different polynomial orders outperform the CPU reference by 43x to 144x in a strong scaling benchmark scenario. A performance model can accurately explain the measured FPGA performance of up to 717 GFLOP/s.
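
The abstract states that the design processes one mesh element per clock cycle, which suggests a simple throughput estimate: clock frequency times elements per cycle times floating-point operations per element. The sketch below illustrates that arithmetic only. It is not the authors' performance model, and the function names, parameters, and the numbers in the usage note are hypothetical placeholders rather than values from the paper.

```python
def pipeline_gflops(clock_mhz: float,
                    flops_per_element: float,
                    elements_per_cycle: float = 1.0) -> float:
    """Estimate the peak throughput in GFLOP/s of a fully pipelined design.

    Assumes the pipeline never stalls, so every clock cycle completes
    `elements_per_cycle` mesh elements (an idealized assumption).
    """
    flops_per_second = clock_mhz * 1e6 * elements_per_cycle * flops_per_element
    return flops_per_second / 1e9


def runtime_seconds(num_elements: int,
                    num_steps: int,
                    clock_mhz: float,
                    elements_per_cycle: float = 1.0,
                    latency_cycles: int = 0) -> float:
    """Estimate the wall time for `num_steps` time steps over a mesh.

    Each step streams all elements through the pipeline once and pays a
    fixed fill/drain latency (hypothetical parameter, default 0).
    """
    cycles_per_step = num_elements / elements_per_cycle + latency_cycles
    return num_steps * cycles_per_step / (clock_mhz * 1e6)
```

For instance, with a hypothetical 200 MHz clock and 1000 floating-point operations per element, `pipeline_gflops(200.0, 1000)` gives the idealized peak for that configuration; a measured figure would fall below it whenever the pipeline stalls or the mesh is too small to amortize the fill latency.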

Related Organizations

University of Paderborn
Germany
University of Bayreuth
Germany

Related research (9 research products)

- Algorithm-Hardware Codesign of a Fast Parallel Routing Architecture for Clos Networks (2010)
- An Algorithm-Hardware Co-design Framework to Overcome Imperfections of Mixed-signal DNN Accelerators (2022)
- Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework (2018)
- FPGA Implementation of Real-time Star Centroid Extraction Algorithm (2019)
- An Algorithm–Hardware Co-Optimized Framework for Accelerating N:M Sparse Transformers (2022)
- Algorithm-Hardware Co-Design of Adaptive Floating-Point Encodings for Resilient Deep Learning Inference (2020)
- Enabling Energy-Efficient and Robust Machine Intelligence with Algorithm-Hardware Co-Design (2020)
- Synetgy (2019)
- Adaptive Precision CNN Accelerator Using Radix-X Parallel Connected Memristor Crossbars (2019)

Impact by BIP!

- Selected citations: 14. These citations are derived from selected sources. This is an alternative to the "Influence" indicator.
- Popularity: Top 10%. Reflects the "current" impact or attention of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. Reflects the overall impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.