
Publication

Structured precision skipping: Accelerating convolutional neural networks with budget-aware dynamic precision selection

Journal contribution - Journal article

Despite the remarkable advances Convolutional Neural Networks have achieved on various intelligence tasks, their massive computation and storage consumption limits applications on resource-constrained devices. Existing works explore reducing the computation cost by leveraging input-dependent redundancy at runtime. The irregular distribution of dynamic sparsity, however, limits the real speedup of dynamic models deployed on traditional neural network accelerators. To solve this problem, we propose an algorithm-architecture co-design, named structured precision skipping (SPS), that exploits the dynamic precision redundancy in statically quantized models. SPS computes most neurons at a lower precision and only a small portion of important neurons at a higher precision to preserve performance. Specifically, we first propose the structured dynamic block to exploit dynamic sparsity in a structured manner. Building on this block, we then apply a budget-aware training method that introduces a budget regularization to learn precision skipping under a target resource constraint. Finally, we present an architecture design based on a bit-serial architecture with support for SPS models, which introduces only a prediction controller module with small overhead. Extensive evaluation results demonstrate that SPS achieves up to 1.5× speedup and 1.4× energy saving on various models and datasets with marginal accuracy loss.
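As a rough illustration of the idea (a minimal sketch, not the paper's implementation), the code below groups activations into fixed-size blocks, keeps the most salient fraction of blocks at a higher precision, and quantizes the rest at a lower precision, alongside one simple form a budget penalty could take. The function names, the mean-magnitude saliency, the 4-/8-bit choice, and the squared-error penalty are all illustrative assumptions.

```python
import numpy as np

def fake_quantize(x, bits):
    # Symmetric uniform quantization (per call); a simple stand-in for the
    # static quantizers assumed here, not the paper's quantization scheme.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    if scale == 0:
        return x.copy()
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def structured_precision_skip(activations, block_size=16, budget=0.25,
                              low_bits=4, high_bits=8):
    # Group neurons into contiguous blocks; the `budget` fraction of blocks
    # with the largest mean magnitude keeps the high precision, while the
    # remaining blocks are "skipped" down to the low precision.
    n = activations.size - activations.size % block_size
    blocks = activations[:n].reshape(-1, block_size)
    importance = np.abs(blocks).mean(axis=1)    # per-block saliency (assumed)
    k = max(1, int(budget * len(blocks)))       # blocks kept at high precision
    keep = np.argsort(importance)[-k:]

    out = fake_quantize(blocks, low_bits)       # default path: low precision
    out[keep] = fake_quantize(blocks[keep], high_bits)
    return out.reshape(-1)

def budget_regularizer(gate_probs, target_budget, weight=1.0):
    # Penalize the expected fraction of high-precision blocks for deviating
    # from the target budget; one plausible form of such a budget loss.
    return weight * (gate_probs.mean() - target_budget) ** 2
```

For example, with block_size=16 and budget=0.25, a layer of 1,024 activations is split into 64 blocks, of which 16 are computed at 8 bits and the rest at 4 bits.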
Journal: JOURNAL OF SYSTEMS ARCHITECTURE
ISSN: 1383-7621
Volume: 124
Pages: 102403
Year of publication: 2022
Keywords: Convolutional neural networks, Algorithm-architecture co-design, Model compression and acceleration, Dynamic quantization
Accessibility: Closed