Hardware accelerator architecture for convolutional neural network. Interuniversity Microelectronics Centre
A hardware accelerator architecture (10) for a convolutional neural network comprises a first memory (11) for storing NxM activation inputs of an input tensor, and a plurality of processor units (12), each comprising a plurality of multiply-accumulate (MAC) arrays (13) and a filter weights memory (14) that is associated with, and common to, the MAC arrays of that processor unit (12). Each MAC array is adapted for receiving a predetermined fraction (FxF) of the NxM activation inputs from the first memory (11), and filter weights from the associated filter weights memory (14). Each MAC array is adapted ...
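
The organisation described so far can be illustrated with a minimal software sketch. It is not the claimed hardware. The class and function names (`MacArray`, `ProcessorUnit`, `read_window`) are hypothetical. Since the abstract is cut off before stating what each MAC array is adapted to do, the sketch assumes the conventional multiply-accumulate behaviour: each array multiplies its FxF activation window element-wise with FxF filter weights and sums the products. The window positions are likewise assumed for illustration.

```python
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]


@dataclass
class MacArray:
    """Hypothetical model of one MAC array (13) operating on an FxF window."""
    f: int

    def run(self, window: Matrix, weights: Matrix) -> float:
        # Assumed behaviour: element-wise multiply, then accumulate.
        acc = 0.0
        for r in range(self.f):
            for c in range(self.f):
                acc += window[r][c] * weights[r][c]
        return acc


@dataclass
class ProcessorUnit:
    """Hypothetical model of a processor unit (12): several MAC arrays
    sharing one filter weights memory (14)."""
    weights: Matrix
    mac_arrays: List[MacArray]

    def run(self, windows: List[Matrix]) -> List[float]:
        # Each MAC array receives its own window but the same shared weights.
        return [mac.run(w, self.weights)
                for mac, w in zip(self.mac_arrays, windows)]


def read_window(first_memory: Matrix, row: int, col: int, f: int) -> Matrix:
    """Read an FxF fraction of the NxM activations held in the first memory (11)."""
    return [line[col:col + f] for line in first_memory[row:row + f]]


# Example with N = M = 4 and F = 2 (values chosen only for illustration).
N, M, F = 4, 4, 2
first_memory = [[float(r * M + c) for c in range(M)] for r in range(N)]
unit = ProcessorUnit(weights=[[1.0, 0.0], [0.0, 1.0]],
                     mac_arrays=[MacArray(F), MacArray(F)])
windows = [read_window(first_memory, 0, 0, F),
           read_window(first_memory, 0, 2, F)]
outputs = unit.run(windows)
print(outputs)  # [5.0, 9.0]
```

In this sketch the two MAC arrays read different FxF windows of the same activation memory while drawing their weights from a single shared store, which mirrors the "associated with and common to" relationship in the abstract.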