Convolution engine for neural networks. Interuniversity Microelectronics Centre
Current convolution engine platforms for convolutional and deep neural networks (CNN/DNN) focus mostly on parallelising the multiplier-accumulators (MACs) and on parallel access to a distributed and shared memory organisation, which is considered inevitable. In terms of circuit and technology implementation, they nearly always use a 2D array circuit structure and planar CMOS devices to implement the MACs and the SRAMs of an architecture that realises one stage of a CNN/DNN application. State-of-the-art convolution engine organisations of this kind are not sufficiently optimised for energy and cost. They use costly volatile memory ...