SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks

Preprint English OPEN
Wang, Linnan; Ye, Jinmian; Zhao, Yiyang; Wu, Wei; Li, Ang; Song, Shuaiwen Leon; Xu, Zenglin; Kraska, Tim
  • Related identifiers: doi: 10.1145/3178487.3178491
  • Subject: Computer Science - Distributed, Parallel, and Cluster Computing | Computer Science - Learning

Going deeper and wider in neural architectures improves accuracy, while the limited GPU DRAM places an undesired restriction on the network design domain. Deep Learning (DL) practitioners either need to change to less desired network architectures, or nontrivially diss...