IEEE Transactions on Nuclear Science
Article . 2022 . Peer-reviewed
License: IEEE Copyright
Data sources: Crossref
https://dx.doi.org/10.60692/yz...
Other literature type . 2022
Data sources: Datacite
https://dx.doi.org/10.48550/ar...
Article . 2021
License: arXiv Non-Exclusive Distribution
Data sources: Datacite
https://dx.doi.org/10.60692/6e...
Other literature type . 2022
Data sources: Datacite

Experimental Findings on the Sources of Detected Unrecoverable Errors in GPUs

Authors: Fernando Fernandes dos Santos; Sujit Malde; Carlo Cazzaniga; Chris Frost; Luigi Carro; Paolo Rech;

Abstract

We investigate the sources of detected unrecoverable errors (DUEs) in graphics processing units (GPUs) exposed to a neutron beam. Illegal memory accesses and interface errors are among the more likely sources of DUEs. Enabling error-correcting code (ECC) increases the number of launch failure events. Our test procedure shows that ECC can reduce the DUEs caused by Illegal Address accesses by up to 92% on Kepler and by up to 98% on Volta. In addition, we analyze whether compiler optimizations affect the distribution of DUE sources for matrix multiplication. We found that the machine code generated at the different optimization levels changes the DUE source distribution by no more than 24% on average.
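
As a rough illustration of the two quantities the abstract reports (a distribution of DUE sources and a percentage reduction with ECC enabled), the following minimal sketch computes both from labeled event logs. The labels, the event counts, and the function names `source_distribution` and `relative_reduction` are hypothetical and are not taken from the paper; the sketch also assumes both runs received the same neutron fluence, so raw counts can be compared directly.

```python
from collections import Counter


def source_distribution(events):
    """Return the share of each DUE source label in a list of events."""
    counts = Counter(events)
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()}


def relative_reduction(count_ecc_off, count_ecc_on):
    """Fractional drop in an event count when ECC is enabled."""
    if count_ecc_off <= 0:
        raise ValueError("baseline count must be positive")
    return (count_ecc_off - count_ecc_on) / count_ecc_off


# Hypothetical event logs for illustration only, NOT data from the paper.
ecc_off = ["illegal_address"] * 50 + ["launch_failure"] * 10 + ["interface_error"] * 40
ecc_on = ["illegal_address"] * 5 + ["launch_failure"] * 30 + ["interface_error"] * 40

off_counts = Counter(ecc_off)
on_counts = Counter(ecc_on)

# Reduction of one DUE source between the two configurations.
reduction = relative_reduction(off_counts["illegal_address"], on_counts["illegal_address"])
print(f"illegal_address reduction with ECC: {reduction:.0%}")

# Share of each source within the ECC-off run.
print(source_distribution(ecc_off))
```

If the two runs had different fluences, the counts would first need to be normalized (for example, divided by fluence) before being passed to `relative_reduction`.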

Keywords

FOS: Computer and information sciences; Parallel computing; Interface (matter); Multiplication (music); Fault Tolerance; Compiler; Set (abstract data type); Engineering; Error Detection; FOS: Electrical engineering, electronic engineering, information engineering; Parallel Computing and Performance Optimization; GPU Computing; Electrical and Electronic Engineering; Source code; Code (set theory); Bubble; Physics; Detected unrecoverable error (DUE); graphic processing units; radiation experiments; reliability; Acoustics; Low-Power VLSI Circuit Design and Optimization; Computer science; Programming language; Operating system; Fault Tolerance in Electronic Systems; Graphics processing unit; Computer Science - Distributed, Parallel, and Cluster Computing; Hardware and Architecture; Physical Sciences; Graphics; Distributed, Parallel, and Cluster Computing (cs.DC); Soft Errors; Transient Faults; Maximum bubble pressure method

  • Impact indicators (BIP!)
    selected citations: 5
    popularity: Top 10%
    influence: Average
    impulse: Top 10%
  • Usage (OpenAIRE UsageCounts)
    views: 3
    downloads: 9