dc.contributor.advisor | Barrett, Edmond | en |
dc.contributor.author | GUERIN, DAVID | en |
dc.date.accessioned | 2019-09-18T06:23:18Z | |
dc.date.available | 2019-09-18T06:23:18Z | |
dc.date.issued | 2019 | en |
dc.date.submitted | 2019 | en |
dc.identifier.citation | GUERIN, DAVID, Computational Shedding in Stream Computing, Trinity College Dublin.School of Computer Science & Statistics, 2019 | en |
dc.identifier.other | Y | en |
dc.description | APPROVED | en |
dc.description.abstract | Stream Computing, a generic on-line data processing paradigm, has emerged as the preferred approach to processing continuous data streams. Data streams are inherently bursty: the data rate of a stream can temporarily spike to orders of magnitude above normal expected levels. As queues fill during such spikes, producing timely application results becomes difficult. The classic response to these temporary scenarios is to shed input data. However, Load Shedding (LS) negatively impacts application output accuracy, as relevant data is discarded before it is processed. Further, LS rates tend to be proportional to the input rate of the stream; as such, high data rates can lead to high data loss during overload events.
For many classes of application, this can be particularly damaging to the quality of the output result, given that shed data is simply never processed.
This thesis presents a new approach, Computational Shedding (CS), to the problem of maintaining application result accuracy while avoiding input data loss during transient bursty data events.
Rather than shedding input data within the stream, we propose to adapt the application and temporarily shed tasks or subtasks of the executing application, reducing per-message processing costs in the stream. This mechanism provides an opportunity to temporarily increase processing capacity, thereby forgoing the need for deliberate data loss during a bursty data event.
We have evaluated this approach against traditional LS techniques in a number of ways, including output accuracy and application processing duration. In experimentation we found that, for applicable applications, subtasks can be discarded for a time while the remaining subtasks continue to produce a valid, if imprecise, result. CS was compared to LS alternatives, which simply do not process any discarded data. The results of our evaluation show that CS leads to more timely and accurate results than the LS alternatives. | en |
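As a minimal sketch of the general idea only (the function names, cost model, and capacity figures below are hypothetical assumptions for illustration, not the thesis's implementation): under overload, load shedding drops whole messages beyond the processing budget, whereas computational shedding processes every message but skips an optional refinement subtask, producing an approximate result for all data.

    # Illustrative contrast between load shedding (LS) and computational
    # shedding (CS). All names and the cost model are hypothetical.

    def core_subtask(x):
        return x * 2.0                     # cheap, mandatory subtask

    def refine_subtask(y, x):
        return y + 0.01 * x                # optional refinement subtask

    def handle_burst(messages, capacity, strategy):
        """Process one bursty batch under a fixed processing budget.

        'LS': drop messages beyond capacity (load shedding).
        'CS': keep every message but skip the optional refinement
              subtask when the batch exceeds capacity (computational
              shedding), trading precision for coverage.
        """
        overloaded = len(messages) > capacity
        results = []
        for i, msg in enumerate(messages):
            if strategy == "LS" and i >= capacity:
                continue                   # message shed: never processed
            result = core_subtask(msg)
            if not (strategy == "CS" and overloaded):
                result = refine_subtask(result, msg)
            results.append(result)
        return results

    burst = list(range(100))               # simulated burst of 100 messages
    print(len(handle_burst(burst, capacity=40, strategy="LS")))  # 40 exact results
    print(len(handle_burst(burst, capacity=40, strategy="CS")))  # 100 approximate results

In this toy setting, LS returns exact results for only a fraction of the burst, while CS returns a slightly less precise result for every message, mirroring the accuracy/coverage trade-off described in the abstract.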
dc.publisher | Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science | en |
dc.rights | Y | en |
dc.subject | Stream Computing | en |
dc.subject | Computational Shedding | en |
dc.subject | Load Shedding | en |
dc.subject | Data Stream Processing | en |
dc.subject | Stream Computing Framework | en |
dc.subject | Bursty Data Streams | en |
dc.subject | Approximate Result | en |
dc.title | Computational Shedding in Stream Computing | en |
dc.type | Thesis | en |
dc.type.supercollection | thesis_dissertations | en |
dc.type.supercollection | refereed_publications | en |
dc.type.qualificationlevel | Doctoral | en |
dc.identifier.peoplefinderurl | https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:DGUERIN | en |
dc.identifier.rssinternalid | 206918 | en |
dc.rights.ecaccessrights | openAccess | |
dc.contributor.sponsor | TCD | en |
dc.identifier.uri | http://hdl.handle.net/2262/89513 | |