Sparse linear algebra (SLA) operations are essential in many applications, such as data analytics, graph processing, machine learning, and scientific computing. However, building efficient hardware accelerators for SLA operations is challenging because these operations typically exhibit low operational intensity and irregular compute and data access patterns. Moreover, some of these challenges have not been well studied in the context of high-level synthesis (HLS).
In this talk, we first introduce HiSparse, an accelerator for sparse-matrix dense-vector multiplication (SpMV). To achieve high bandwidth utilization, we co-design the sparse storage format and the accelerator architecture. We then demonstrate the use of Catapult HLS to build a high-throughput pipeline that can handle irregular data dependencies and access patterns. Building on our SpMV accelerator, we develop a versatile sparse accelerator that supports multiple SLA operations and can be configured at run time for different compute patterns. Our architecture design is guided by a novel analytical model that enables rapid exploration of the design configuration space. According to our evaluation, which uses both HLS implementation and simulation, our sparse accelerators deliver promising speedups along with higher bandwidth and energy efficiency than CPU and GPU executions.
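
For readers unfamiliar with SpMV, the following is a minimal software sketch of the operation, assuming the common compressed sparse row (CSR) layout. It is not the HiSparse storage format or architecture, which the abstract does not detail; the function name `spmv_csr` and the small example matrix are purely illustrative. The sketch shows where the difficulties named above come from: each stored nonzero contributes only one multiply and one add while requiring a value, a column index, and a vector element to be read, and the vector element is fetched at a data-dependent position.

```python
def spmv_csr(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form (illustrative only)."""
    num_rows = len(indptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # Nonzeros of this row occupy positions indptr[row] .. indptr[row + 1] - 1.
        for k in range(indptr[row], indptr[row + 1]):
            # x is gathered through indices[k]: a data-dependent, irregular access.
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example:
# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(indptr, indices, data, x)
print(y)
```

Rows with different numbers of nonzeros make the inner loop length vary from row to row, which is one reason a straightforward pipelined implementation of this loop nest tends to underuse memory bandwidth.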