Downloads provided by UsageCounts
The advent of AI/ML in B5G and Multi-Access Edge Computing will rely on the acceleration of neural networks. The current work focuses on the acceleration of Long Short-Term Memory (LSTM) kernels playing a key role in numerous applications. We assume various LSTM sizes while targeting FPGA and GPU hardware for both embedded and server MEC purposes. Systematically, we perform a design space exploration to determine the most efficient acceleration approach and most suitable configuration for each device. We use High-Level-Synthesis to implement our proposed circuit architectures on Xilinx FPGAs, while we use high level tools for NVIDIA GPUs such as PyTorch’s JIT compiler or ONNX runtime. Our exploration shows that the full parallelization of an LSTM array multiplication quickly overutilizes the FPGA, while on GPUs LSTM models can be deployed more easily. Instead, the best approach for FPGAs is to find a balance between parallelizing LSTM gates and vector multiplications. Our comparative study shows that FPGAs prevail in light LSTM models, whereas GPUs prevail in larger model topologies. Moreover, we show that far- and near-edge FPGAs achieve similar latency, however, near-edge GPUs can achieve one order of magnitude faster execution than far-edge GPUs. The best results range in 0.3-5msec latency per execution with acceleration factors in 12×−174×.
GPU, Anomaly detection, LSTM, 5G, FPGA, Forecasting
GPU, Anomaly detection, LSTM, 5G, FPGA, Forecasting
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | 12 | |
| downloads | 28 |

Views provided by UsageCounts
Downloads provided by UsageCounts