Show simple item record

dc.contributor.author: Garland, James Philip
dc.contributor.author: Gregg, David
dc.date.accessioned: 2021-05-13T15:56:17Z
dc.date.available: 2021-05-13T15:56:17Z
dc.date.issued: 2018
dc.date.submitted: 2018
dc.identifier.citation: James Philip Garland, David Gregg, 'Low Complexity Multiply-Accumulate Units for Convolutional Neural Networks with Weight-Sharing', 2018, ACM Transactions on Architecture and Code Optimization, 15, 3
dc.identifier.issn: 1544-3566
dc.identifier.other: Y
dc.description: PUBLISHED
dc.description.abstract: Convolutional neural networks (CNNs) are among the most successful machine-learning techniques for image, voice, and video processing, but they require large amounts of processing capacity and memory bandwidth. Hardware accelerators proposed for CNNs typically contain large numbers of multiply-accumulate (MAC) units, whose multipliers are costly in integrated circuit (IC) gate count and power consumption. “Weight-sharing” accelerators have been proposed in which the full range of weight values in a trained CNN is compressed into bins, and the bin index is used to access the weight-shared value. We reduce the power and area of the CNN by implementing a parallel accumulate shared MAC (PASM) in a weight-shared CNN. PASM re-architects the MAC to instead count the frequency of each weight and place it in a bin. The accumulated value is computed in a subsequent multiply phase, significantly reducing the gate count and power consumption of the CNN. In this article, we implement PASM in a weight-shared CNN convolution hardware accelerator and analyze its effectiveness. Experiments show that, at a clock speed of 1 GHz on a 45 nm ASIC process, our approach results in fewer gates, smaller logic, and reduced power, with only a slight increase in latency. We also show that the same weight-shared-with-PASM CNN accelerator can be implemented in resource-constrained FPGAs, where the FPGA has limited numbers of digital signal processor (DSP) units to accelerate the MAC operations.
dc.format.extent: 31:1
dc.format.extent: 31:24
dc.language.iso: en
dc.relation.ispartofseries: ACM Transactions on Architecture and Code Optimization
dc.relation.ispartofseries: 15
dc.relation.ispartofseries: 3
dc.rights: Y
dc.subject: same weight-shared-with-PASM
dc.subject: Convolutional neural networks (CNNs)
dc.subject: image, voice, and video processing
dc.subject.lcsh: same weight-shared-with-PASM
dc.subject.lcsh: Convolutional neural networks (CNNs)
dc.subject.lcsh: image, voice, and video processing
dc.title: Low Complexity Multiply-Accumulate Units for Convolutional Neural Networks with Weight-Sharing
dc.type: Journal Article
dc.type.supercollection: scholarly_publications
dc.type.supercollection: refereed_publications
dc.identifier.peoplefinderurl: http://people.tcd.ie/jgarland
dc.identifier.peoplefinderurl: http://people.tcd.ie/dgregg
dc.identifier.rssinternalid: 215132
dc.rights.ecaccessrights: openAccess
dc.relation.source: ACM Transactions on Architecture and Code Optimization
dc.subject.TCDTag: Computer Hardware
dc.subject.TCDTag: Context-aware Computing
dc.identifier.rssuri: https://dl.acm.org/doi/10.1145/3233300
dc.relation.sourceuri: https://dl.acm.org/doi/10.1145/3233300
dc.identifier.orcid_id: 0000-0002-8688-9407
dc.status.accessible: N
dc.contributor.sponsor: SFI stipend
dc.contributor.sponsorGrantNumber: 12/IA/1381
dc.contributor.sponsor: Science Foundation Ireland
dc.identifier.uri: http://hdl.handle.net/2262/96284
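The PASM technique summarized in the abstract can be sketched in software: instead of one multiply per activation, activations are first accumulated into one bin per shared weight value, and a single multiply per bin follows in a later phase. This is an illustrative sketch of the idea only, not the authors' hardware implementation; all names are hypothetical.

```python
# Sketch of a parallel accumulate shared MAC (PASM) for a weight-shared CNN,
# as described in the abstract. In a weight-shared network each weight is an
# index into a small table of shared weight values.

def conventional_mac(activations, weight_indices, shared_weights):
    """Baseline MAC: one multiply per activation."""
    return sum(a * shared_weights[w] for a, w in zip(activations, weight_indices))

def pasm_mac(activations, weight_indices, shared_weights):
    """PASM: accumulate activations into per-weight bins (adds only),
    then multiply once per bin in a subsequent phase."""
    bins = [0] * len(shared_weights)      # one accumulator per shared weight
    for a, w in zip(activations, weight_indices):
        bins[w] += a                      # accumulate phase: no multipliers
    # multiply phase: only len(shared_weights) multiplies in total
    return sum(b * wv for b, wv in zip(bins, shared_weights))

acts = [3, 1, 4, 1, 5, 9, 2, 6]
idx = [0, 1, 0, 2, 1, 2, 0, 1]            # bin index per activation
weights = [0.5, -1.0, 2.0]                # shared (binned) weight values
assert conventional_mac(acts, idx, weights) == pasm_mac(acts, idx, weights)
```

The hardware benefit the paper claims follows from this restructuring: the inner loop needs only adders, and the small number of multiplies is deferred to a post-pass, which is why gate count and power drop at the cost of a slight latency increase.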

