Powered by OpenAIRE graph
https://doi.org/10.1...
DBLP
Conference object . 2025
Data sources: DBLP

OpenCL caffe

Accelerating and enabling a cross platform machine learning framework
Authors: Junli Gu; Yibing Liu; Yuan Gao; Maohua Zhu

Abstract

Deep neural networks (DNNs) achieved a significant breakthrough in visual recognition in 2012 and quickly became the leading machine learning algorithm in Big Data based large-scale object recognition applications. The successful deployment of DNN-based applications poses challenges for a cross-platform software framework that must support multiple user scenarios, including offline model training on HPC clusters and online recognition in embedded environments. Existing DNN frameworks mostly focus on closed-format CUDA implementations, which limits the breadth of hardware systems on which DNNs can be deployed. This paper presents OpenCL™ caffe, which aims to transform the popular CUDA-based framework caffe [1] to use an open-standard OpenCL backend. The goal is to enable a DNN framework that is compatible with heterogeneous platforms and achieves competitive performance based on the OpenCL tool chain. Because of the high complexity of DNN models, we use a two-phase strategy. First, we introduce OpenCL porting strategies that guarantee algorithm convergence; then we analyze OpenCL's performance bottlenecks in the DNN domain and propose several optimization techniques, including a batched data layout and multiple command queues, to better map the problem size onto the existing BLAS library, improve hardware resource utilization, and boost OpenCL runtime efficiency. We verify OpenCL caffe's successful offline training and online recognition on both server-end and consumer-end GPUs. Experimental results show that the phase-two optimized OpenCL caffe achieves a 4.5x speedup without modifying the BLAS library. Users can directly run mainstream DNN models and achieve the best performance for a specific processor by choosing the optimal batch number according to hardware properties and input data size.
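The abstract does not spell out the batched data layout, so the following is only a minimal sketch of one plausible reading: instead of issuing one small matrix multiplication (GEMM) per image, the per-image input matrices are laid side by side so that a single larger GEMM covers a whole group of images. All names here (`matmul`, `per_image_gemm`, `batched_gemm`, `gemm_calls`) are hypothetical and are not taken from OpenCL caffe; plain Python stands in for the BLAS library.

```python
# Illustrative sketch only; this is NOT the paper's implementation.
# Assumption: every image contributes a (k x n) input matrix of the same shape,
# and `weights` is an (m x k) matrix shared by all images.

def matmul(a, b):
    """Plain-Python product of a (m x k) and b (k x n); stands in for a BLAS GEMM."""
    k = len(b)
    n = len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)] for row in a]


def per_image_gemm(weights, columns):
    """Baseline: one small GEMM per image."""
    return [matmul(weights, col) for col in columns]


def batched_gemm(weights, columns, batch):
    """Lay `batch` images side by side, run one wider GEMM per group,
    then slice the result back into per-image outputs."""
    outputs = []
    for start in range(0, len(columns), batch):
        group = columns[start:start + batch]
        k = len(group[0])
        n = len(group[0][0])
        # Row r of `wide` is row r of every image in the group, end to end.
        wide = [[v for col in group for v in col[r]] for r in range(k)]
        product = matmul(weights, wide)  # shape: m x (len(group) * n)
        for i in range(len(group)):
            outputs.append([row[i * n:(i + 1) * n] for row in product])
    return outputs


def gemm_calls(num_images, batch):
    """Number of GEMM calls needed for num_images at a given batch size."""
    return -(-num_images // batch)  # ceiling division
```

Both paths compute the same outputs; the batched path simply makes fewer, larger GEMM calls, which is the sense in which the problem size can be reshaped to suit an unmodified BLAS library. The batch size that works best would depend on the hardware and input size, as the abstract notes.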

  • Impact by BIP!
    selected citations: 23
    These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Top 10%
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Top 10%
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Top 10%
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.