publication . Other literature type . Conference object . 2017

Co-designing MPI Runtimes and Caffe for Scalable Deep Learning on Modern GPU Clusters

Khaled Hamidouche; Ammar Ahmad Awan; Dhabaleswar K. Panda; Jahanzeb Maqbool Hashmi
Restricted
  • Published: 26 Jan 2017
  • Publisher: Association for Computing Machinery (ACM)
Abstract
Availability of large data sets like ImageNet and massively parallel computation support in modern HPC devices like NVIDIA GPUs have fueled a renewed interest in Deep Learning (DL) algorithms. This has triggered the development of DL frameworks like Caffe, Torch, TensorFlow, and CNTK. However, most DL frameworks have been limited to a single node. To scale out DL frameworks and bring HPC capabilities to the DL arena, we propose S-Caffe, a scalable and distributed Caffe adaptation for modern multi-GPU clusters. With an in-depth analysis of new requirements brought forward by the DL frameworks and limitations of current communication runtimes, we present...
Subjects
free text keywords: Computer science, Deep learning, Cluster (physics), Caffe, Single node, Computation, Parallel computing, Speedup, Workflow, Scalability, Artificial intelligence