One weird trick for parallelizing convolutional neural networks
Computer Science - Distributed, Parallel, and Cluster Computing | Computer Science - Neural and Evolutionary Computing | Computer Science - Learning
arxiv: Quantitative Biology::Neurons and Cognition | Computer Science::Neural and Evolutionary Computation
I present a new way to parallelize the training of convolutional neural networks across multiple GPUs. The method scales significantly better than all alternatives when applied to modern convolutional neural networks.