
dc.contributor.author: Shanker, Shreejith
dc.date.accessioned: 2023-08-22T09:00:12Z
dc.date.available: 2023-08-22T09:00:12Z
dc.date.created: August, 2023
dc.date.issued: 2023
dc.date.submitted: 2023
dc.identifier.citation: Emmet Murphy, Shashwat Khandelwal, Shanker Shreejith, "Custom precision accelerators for energy-efficient image-to-image transformations in motion picture workflows", Applications of Digital Image Processing XLV, San Diego, USA, August 2023, SPIE
dc.identifier.other: Y
dc.description: PUBLISHED
dc.description: San Diego, USA
dc.description.abstract: Image-to-Image (I2I) transformations have been an integral part of video processing workflows, with applications in image synthesis for virtual productions, segmentation, and matting, among others. Over the years, deep learning-based approaches have enabled new methods and tools for automating parts of the processing pipeline, reducing the human effort involved in post-production workflows. These compute-intensive models are often accelerated through on-premise or in-cloud GPU instances to improve responsiveness and latency, while expending large amounts of energy on these complex transformations. In this work, we present an approach for optimising the energy efficiency of I2I deep-learning models using quantised neural networks accelerated on a server-style FPGA. We use deep learning-based alpha background matting as the I2I application, implemented as a U-Net conditional Generative Adversarial Network. The model is trained and quantised using the Vitis-AI flow from AMD/Xilinx and deployed on a data centre class Alveo U50 FPGA device. Our results show that the quantised model on the FPGA achieves 1.14× higher inference throughput while consuming 11× less energy per inference than a GPU-accelerated version of the model on a 3080-Ti, while generating nearly identical results with an average IoU > 0.95 across multiple user images at 1080p and 4K resolutions. Additionally, offloads to the FPGA device can be seamlessly integrated into widely used motion picture tools like NUKE with minimal effort. With most cloud providers integrating heterogeneous platforms (including FPGAs) into their systems, we envision that this work paves the way for more efficient utilisation of custom precision deep-learning models and FPGA acceleration in deep learning-based motion picture workflows.
dc.language.iso: en
dc.publisher: SPIE
dc.rights: Y
dc.subject: Quantised Deep Learning
dc.subject: Image to Image Transformations
dc.subject: U-Net
dc.title: Custom precision accelerators for energy-efficient image-to-image transformations in motion picture workflows
dc.title.alternative: Applications of Digital Image Processing XLV.
dc.type: Conference Paper
dc.type.supercollection: scholarly_publications
dc.type.supercollection: refereed_publications
dc.identifier.peoplefinderurl: http://people.tcd.ie/shankers
dc.identifier.rssinternalid: 257819
dc.rights.ecaccessrights: openAccess
dc.subject.TCDTheme: Creative Technologies
dc.subject.TCDTheme: Making Ireland
dc.subject.TCDTheme: Smart & Sustainable Planet
dc.subject.TCDTag: Field Programmable Gate Arrays (FPGAs)
dc.subject.TCDTag: Image Processing
dc.subject.TCDTag: MACHINE LEARNING
dc.subject.TCDTag: Reconfigurable Computing
dc.subject.TCDTag: VHDL, FPGA, DIGITAL DESIGN
dc.subject.TCDTag: VIDEO PROCESSING
dc.identifier.orcid_id: 0000-0002-9717-1804
dc.status.accessible: N
dc.identifier.uri: http://hdl.handle.net/2262/103756
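
Note: the train-quantise-deploy flow named in the abstract can be illustrated with a minimal sketch, assuming the standard Vitis-AI PyTorch quantiser (pytorch_nndct). The stand-in network and single-frame calibration below are hypothetical placeholders, not the paper's actual U-Net cGAN generator or calibration set.

import torch
import torch.nn as nn
from pytorch_nndct.apis import torch_quantizer  # Vitis-AI PyTorch quantiser

# Hypothetical stand-in for the paper's U-Net cGAN generator.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
).eval()
dummy = torch.randn(1, 3, 1080, 1920)  # one 1080p RGB frame

# Pass 1 ("calib"): run representative inputs through the wrapped model so
# the quantiser can derive fixed-point scales, then export the quant config.
quantizer = torch_quantizer("calib", model, (dummy,), output_dir="quant_out")
quantizer.quant_model(dummy)  # the real flow loops over a calibration set
quantizer.export_quant_config()

# Pass 2 ("test"): evaluate the quantised model and emit an .xmodel, which
# the Vitis-AI compiler then targets at the Alveo U50 DPU overlay.
quantizer = torch_quantizer("test", model, (dummy,), output_dir="quant_out")
quantizer.quant_model(dummy)
quantizer.export_xmodel(output_dir="quant_out")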

