Deep Convolutional Neural Networks (CNNs) dominate modern Machine Learning (ML) applications. Accelerating them directly in hardware calls for a multifaceted approach that combines high performance, energy efficiency, and functional safety. These requirements hold for every CNN accelerator architecture, including both systolic and spatial dataflow alternatives. In this talk, we focus on the latter case, where CNNs are implemented as a series of dedicated convolution engines and the data are streamed from one layer to the next through custom memory buffers.
In this context, we will first present a High-Level Synthesis (HLS) implementation for dataflow CNNs that exploits the benefits of Catapult HLS and supports a wide variety of architectures. In particular, we will focus on the energy-efficient implementation of non-traditional forms of spatial convolution, such as strided or dilated convolutions, which leverage the decomposition of the convolution to eliminate redundant data movement.
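To illustrate the kind of decomposition involved, here is a minimal numpy sketch (not the talk's HLS code) showing that a stride-s convolution equals the sum of s×s ordinary stride-1 convolutions applied to polyphase slices of the input and kernel, so no discarded output positions are ever computed. All function names are illustrative, and the example assumes kernel and input dimensions divisible into even polyphase slices:

```python
import numpy as np

def conv_valid(x, k, stride=1):
    """Direct 2-D 'valid' convolution (cross-correlation, as in CNNs)."""
    Kh, Kw = k.shape
    H = (x.shape[0] - Kh) // stride + 1
    W = (x.shape[1] - Kw) // stride + 1
    y = np.zeros((H, W))
    for m in range(H):
        for n in range(W):
            y[m, n] = np.sum(x[m*stride:m*stride+Kh, n*stride:n*stride+Kw] * k)
    return y

def conv_strided_decomposed(x, k, s=2):
    """Stride-s conv as the sum of s*s stride-1 convs on polyphase components.

    With i = s*a + p, j = s*b + q, the strided sum
    y[m,n] = sum_{i,j} x[s*m+i, s*n+j] * k[i,j] regroups into stride-1
    convolutions of x[p::s, q::s] with k[p::s, q::s]."""
    y = None
    for p in range(s):
        for q in range(s):
            t = conv_valid(x[p::s, q::s], k[p::s, q::s])  # stride-1 piece
            y = t if y is None else y + t
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
k = rng.standard_normal((4, 4))
assert np.allclose(conv_valid(x, k, stride=2), conv_strided_decomposed(x, k, 2))
```

The same regrouping idea applies to dilated convolutions, where the polyphase slicing moves to the input side while the kernel stays dense.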
Next, we will present an algorithmic approach to online fault detection for CNNs. The checksum of the actual result is checked against a predicted checksum computed in parallel by a hardware checker. Based on a newly introduced invariance condition of convolution, the proposed checker predicts the output checksum implicitly, using only data elements at the border of the input features. In this way, the power required for accumulating the input features is reduced, without requiring large buffers to hold intermediate checksum results.
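The underlying invariance can be sketched as follows: for a valid convolution, the sum of all outputs equals sum over kernel taps k[i,j] of k[i,j] times a shifted-window sum S_ij of the input, and adjacent windows S_ij differ only by thin border strips. The numpy sketch below is our own simplification of that idea (it predicts the checksum from an integral image rather than the talk's border-only hardware checker); all names are illustrative:

```python
import numpy as np

def predicted_output_checksum(x, k):
    """Predict sum(conv_valid(x, k)) without computing the convolution.

    sum_y = sum_{i,j} k[i,j] * S_ij, where S_ij is the sum of x over the
    (H-Kh+1) x (W-Kw+1) window whose top-left corner is (i, j).
    Neighbouring S_ij differ only by border rows/columns of x, which is
    what a border-based hardware checker exploits."""
    H, W = x.shape
    Kh, Kw = k.shape
    h, w = H - Kh + 1, W - Kw + 1
    # Integral image: I[r, c] = sum of x[:r, :c]
    I = np.zeros((H + 1, W + 1))
    I[1:, 1:] = np.cumsum(np.cumsum(x, axis=0), axis=1)
    pred = 0.0
    for i in range(Kh):
        for j in range(Kw):
            S_ij = I[i + h, j + w] - I[i, j + w] - I[i + h, j] + I[i, j]
            pred += k[i, j] * S_ij
    return pred
```

At run time, a mismatch between this prediction and the checksum of the actually computed outputs flags a fault in the convolution engine.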
Finally, we study customized floating-point HLS operators that support fused dot products with single- or double-width output (e.g., FP8 inputs and an FP16 result), eliminating redundant rounding and type-conversion steps. We will also discuss floating-point datatypes with an adjustable exponent bias for tuning the dynamic range.
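A small numpy sketch of why fusing matters: rounding after every multiply and add in the narrow type can destroy the result, while a fused operator that accumulates exactly and rounds once preserves it. Since numpy has no FP8 type, the sketch emulates the narrow type with FP16 and the exact accumulator with FP64; the helper also shows how an adjustable exponent bias shifts a format's dynamic range. All function names are illustrative:

```python
import numpy as np

def dot_stepwise(a, b):
    """Accumulate in the narrow type: one rounding per product and per add."""
    acc = np.float16(0)
    for x, y in zip(a, b):
        acc = np.float16(acc + np.float16(x * y))
    return acc

def dot_fused(a, b):
    """Fused behaviour: exact wide accumulation, a single final rounding."""
    return np.float16(np.dot(a.astype(np.float64), b.astype(np.float64)))

def max_normal(exp_bits, man_bits, bias):
    """Largest normal magnitude of a generic IEEE-like format with the given
    exponent bias (all-ones exponent reserved for Inf/NaN). Lowering the bias
    shifts the representable range toward larger magnitudes."""
    return (2 - 2.0**-man_bits) * 2.0**(2**exp_bits - 2 - bias)

a = np.array([60000.0, 1.0, -60000.0], dtype=np.float16)
b = np.ones(3, dtype=np.float16)
# Stepwise FP16 accumulation absorbs the +1 (ulp near 60000 is 32), then
# cancels to 0; the fused version recovers the exact answer.
assert float(dot_stepwise(a, b)) == 0.0
assert float(dot_fused(a, b)) == 1.0
# Standard FP16 (5 exponent bits, 10 mantissa bits, bias 15) tops out at 65504.
assert max_normal(5, 10, 15) == 65504.0
```

The fused result is always the correctly rounded exact dot product here, which is precisely the rounding step the customized HLS operators keep to one.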