Publication
Acceleration-aware Fine-grained Channel Pruning for Deep Neural Networks via Residual Gating
Journal contribution - Journal article
Deep neural networks have achieved remarkable advances in various intelligence tasks. However, their massive computation and storage requirements limit applications on resource-constrained devices. While channel pruning has been widely applied to compress models, it is challenging to reach very high compression ratios with such a coarse-grained pruning structure without significant performance degradation. In this article, we propose an acceleration-aware fine-grained channel pruning (AFCP) framework for accelerating neural networks, which optimizes trainable gate parameters by estimating residual errors between pruned and original channels together with hardware characteristics. Our fine-grained concept operates at both the algorithm and structure levels. Unlike existing methods that rely on a pre-defined pruning criterion, AFCP explicitly considers both the zero-out and similarity criteria for each channel and adaptively selects the suitable one via residual gate parameters. At the structure level, AFCP adopts a fine-grained channel pruning strategy for residual neural networks and a decomposition-based structure, which further extends the pruning optimization space. Moreover, instead of using theoretical computation costs such as FLOPs, we propose a hardware predictor that bridges the gap between realistic acceleration and the pruning procedure to guide the learning of pruning, which improves the efficiency of model pruning when deployed on accelerators. Extensive evaluation results demonstrate that AFCP outperforms state-of-the-art methods and achieves a favorable balance between model performance and computation cost.
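The core idea of selecting between a zero-out criterion and a similarity criterion per channel via a trainable residual gate can be illustrated with a minimal sketch. This is not the paper's actual formulation; the function name, the sigmoid parameterization of the gate, and the use of a precomputed most-similar-channel index are all assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residual_gated_channels(x, gate_logits, similar_idx):
    """Illustrative residual gating over channels (hypothetical formulation).

    x           : (C, H, W) feature map
    gate_logits : (C,) trainable gate parameters; sigmoid maps them to (0, 1)
    similar_idx : (C,) index of the most similar channel for each channel

    A gate near 1 keeps the original channel; a gate near 0 replaces it with
    its most similar channel, so the residual error of pruning it is small.
    """
    g = sigmoid(gate_logits)[:, None, None]  # per-channel soft gate, broadcast over H, W
    substitute = x[similar_idx]              # 'similarity' criterion: stand-in channel
    return g * x + (1.0 - g) * substitute    # soft blend; hard pruning would threshold g
```

During training, such gates could be optimized jointly with the network and then thresholded to decide which channels to prune; a zero-out criterion corresponds to the special case where the substitute is an all-zero channel.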
Journal: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
ISSN: 0278-0070
Issue: 6
Volume: 41
Pages: 1902-1915
Year of publication: 2022
Keywords: Deep learning systems, model compression and acceleration, pruning, neural networks
Access: Open