Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ http://arxiv.org/pdf...arrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
https://doi.org/10.1109/cluste...
Article . 2018 . Peer-reviewed
Data sources: Crossref
https://dx.doi.org/10.48550/ar...
Article . 2017
License: arXiv Non-Exclusive Distribution
Data sources: Datacite
DBLP
Article . 2017
Data sources: DBLP
DBLP
Conference object
Data sources: DBLP
versions View all 5 versions
addClaim

Efficient Training of Convolutional Neural Nets on Large Distributed Systems

Authors: Dheeraj Sreedhar; Vaibhav Saxena; Yogish Sabharwal; Ashish Verma 0001; Sameer Kumar 0001;

Efficient Training of Convolutional Neural Nets on Large Distributed Systems

Abstract

Deep Neural Networks (DNNs) have achieved im- pressive accuracy in many application domains including im- age classification. Training of DNNs is an extremely compute- intensive process and is solved using variants of the stochastic gradient descent (SGD) algorithm. A lot of recent research has focussed on improving the performance of DNN training. In this paper, we present optimization techniques to improve the performance of the data parallel synchronous SGD algorithm using the Torch framework: (i) we maintain data in-memory to avoid file I/O overheads, (ii) we present a multi-color based MPI Allreduce algorithm to minimize communication overheads, and (iii) we propose optimizations to the Torch data parallel table framework that handles multi-threading. We evaluate the performance of our optimizations on a Power 8 Minsky cluster with 32 nodes and 128 NVidia Pascal P100 GPUs. With our optimizations, we are able to train 90 epochs of the ResNet-50 model on the Imagenet-1k dataset using 256 GPUs in just 48 minutes. This significantly improves on the previously best known performance of training 90 epochs of the ResNet-50 model on the same dataset using 256 GPUs in 65 minutes. To the best of our knowledge, this is the best known training performance demonstrated for the Imagenet- 1k dataset.

Keywords

FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Distributed, Parallel, and Cluster Computing (cs.DC)

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    2
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
2
Average
Average
Average
Green