Show simple item record

dc.contributor.author: Shanker, Shreejith
dc.date.accessioned: 2022-05-31T16:47:53Z
dc.date.available: 2022-05-31T16:47:53Z
dc.date.created: July 2022
dc.date.issued: 2022
dc.date.submitted: 2022
dc.identifier.citation: Eashan Wadhwa, Shashwat Khandelwal, Shreejith Shanker, "IMEC: A Memory-Efficient Convolution Algorithm For Quantised Neural Network Accelerators", 33rd IEEE International Conference on Application-specific Systems, Architectures and Processors, Gothenburg, Sweden, July 2022, IEEE, 2022
dc.identifier.other: Y
dc.description: PUBLISHED
dc.description: Gothenburg, Sweden
dc.description.abstract: Quantised convolution neural networks (QCNNs) on FPGAs have shown tremendous potential for deploying deep learning on resource-constrained devices closer to the data source or in embedded applications. An essential building block of (Q)CNNs is the convolutional layer. FPGA implementations use modified versions of convolution kernels to reduce the resource overheads using variations of the sliding kernel algorithm. While these alleviate resource consumption to a certain degree, they still incur considerable (distributed) memory resources, requiring the use of larger FPGA devices with sufficient on-chip memory elements to implement deep QCNNs. In this paper, we present the Inverse Memory Efficient Convolution (IMEC) algorithm, a novel strategy to lower the memory consumption of convolutional layers in QCNNs. IMEC lowers the footprint of intermediate matrix buffers incurred within the convolutional layers and the multiply-accumulate (MAC) operators required at each layer through a series of data organisation and computational optimisations. We evaluate IMEC by integrating it into the BNN-PYNQ framework, which can compile high-level QCNN representations to the FPGA bitstream. Our results show that IMEC can optimise the memory footprint and the overall resource overhead of the convolutional layers by ∼33% and ∼20% (LUT and FF count) respectively, across multiple quantisation levels (1-bit to 8-bit), while maintaining inference accuracy identical to state-of-the-art QCNN implementations.
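For context, the baseline that the abstract contrasts IMEC against is the sliding-kernel convolution with quantised weights. The following is a minimal NumPy sketch of that baseline only, illustrating the per-window intermediate buffers and MAC operations whose footprint IMEC reduces; it is not the IMEC algorithm itself, and the `quantise` helper is an illustrative assumption, not code from the paper:

```python
import numpy as np

def quantise(x, bits):
    """Uniformly quantise x to signed integers of the given bit width
    (illustrative helper; the paper's quantisation scheme may differ)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return np.round(x / scale).astype(np.int32), scale

def sliding_conv2d(image, kernel):
    """Naive sliding-kernel 2-D convolution (valid padding, stride 1).
    Each output element is one MAC reduction over a kernel-sized window,
    and each window slice is the kind of intermediate buffer that
    consumes on-chip memory in an FPGA implementation."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1), dtype=np.int64)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            window = image[r:r + kh, c:c + kw]   # intermediate buffer
            out[r, c] = np.sum(window * kernel)  # MAC operations
    return out

# Example: 4-bit quantised weights applied to a small integer image
img = np.arange(25, dtype=np.int32).reshape(5, 5)
weights, scale = quantise(np.random.randn(3, 3), bits=4)
print(sliding_conv2d(img, weights).shape)  # (3, 3)
```

A 5×5 input convolved with a 3×3 kernel yields a 3×3 output (valid padding); the nested loops make the per-output-pixel buffer and MAC cost explicit, which is the overhead the paper's data-organisation optimisations target.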
dc.language.iso: en
dc.publisher: IEEE
dc.rights: Y
dc.subject: Research Subject Categories::TECHNOLOGY
dc.subject: Convolution Neural Networks
dc.subject: Field Programmable Gate Arrays
dc.subject: Inference Algorithms
dc.title: IMEC: A Memory-Efficient Convolution Algorithm For Quantised Neural Network Accelerators
dc.title.alternative: 33rd IEEE International Conference on Application-specific Systems, Architectures and Processors
dc.type: Conference Paper
dc.type.supercollection: scholarly_publications
dc.type.supercollection: refereed_publications
dc.identifier.peoplefinderurl: http://people.tcd.ie/shankers
dc.identifier.rssinternalid: 243374
dc.rights.ecaccessrights: openAccess
dc.relation.source: FINN
dc.subject.TCDTheme: Making Ireland
dc.subject.TCDTheme: Smart & Sustainable Planet
dc.subject.TCDTheme: Telecommunications
dc.subject.TCDTag: ARTIFICIAL NEURAL NETWORKS
dc.subject.TCDTag: Quantised Neural Networks
dc.subject.TCDTag: Reconfigurable Computing
dc.relation.sourceuri: https://xilinx.github.io/finn/
dc.identifier.orcid_id: 0000-0002-9717-1804
dc.status.accessible: N
dc.identifier.uri: http://hdl.handle.net/2262/98718

