Special Session: Reliability Assessment Recipes for DNN Accelerators
  • Alberto Bosio,
  • Mohammad Hasan Ahmadilivani,
  • Bastien Deveautour,
  • Fernando Fernandes dos Santos,
  • Juan David Guerrero Balaguera,
  • Maksim Jenihhin,
  • Angeliki Kritikakou,
  • Robert Limas Sierra,
  • Salvatore Pappalardo,
  • Jaan Raik,
  • Josie E. Rodriguez Condia,
  • Matteo Sonza Reorda,
  • Mahdi Taheri,
  • Marcello Traiola
Alberto Bosio
Ecole Centrale de Lyon, CPE Lyon, INL
Mohammad Hasan Ahmadilivani
Tallinn University of Technology
Bastien Deveautour
Ecole Centrale de Lyon, CPE Lyon, INL
Fernando Fernandes dos Santos
Univ Rennes, CNRS
Juan David Guerrero Balaguera
Politecnico di Torino
Maksim Jenihhin
Tallinn University of Technology
Angeliki Kritikakou
Univ Rennes, CNRS
Robert Limas Sierra
Politecnico di Torino
Salvatore Pappalardo
Ecole Centrale de Lyon, CPE Lyon, INL
Jaan Raik
Tallinn University of Technology
Josie E. Rodriguez Condia
Politecnico di Torino
Matteo Sonza Reorda
Politecnico di Torino
Mahdi Taheri
Tallinn University of Technology

Corresponding Author: [email protected]

Marcello Traiola
Univ Rennes, CNRS

Abstract

Reliability assessment is mandatory to guarantee the correct behavior of Deep Neural Network (DNN) hardware accelerators in safety-critical applications. While fault injection (FI) is a well-established, practical, and robust method for reliability assessment, it remains very time-consuming. This paper contributes three recipes for making reliability assessment more efficient: a) hybrid analytical and hierarchical FI-based reliability assessment for systolic-array-based DNN accelerators; b) mixing techniques for the reliability assessment of in-chip AI accelerators in GPUs; c) reliability assessment of DNN hardware accelerators through physical fault injection. The experimental results demonstrate the efficiency of the proposed methods when applied to their target DNN hardware accelerator platforms.
07 May 2024: Submitted to TechRxiv
13 May 2024: Published in TechRxiv