Abstract: Mixed-precision quantization is a popular approach for compressing deep neural networks (DNNs). However, it is challenging to scale the performance efficiently with mixed-precision DNNs ...
Abstract: Sparse Matrix-Matrix Multiplication(SpMM) is a commonly utilized operation in various domains, particularly in the increasingly popular Graph Neural Networks(GNN) framework. The current ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results