Computational Shedding in Stream Computing
Citation:
GUERIN, DAVID, Computational Shedding in Stream Computing, Trinity College Dublin. School of Computer Science & Statistics, 2019
Abstract:
Stream Computing, a generic on-line data processing paradigm, has emerged as the preferred approach to processing continuous data streams. Data streams are bursty: the data rate of a stream can spike temporarily to orders of magnitude above normal expected levels. Producing timely application results then becomes difficult as queues fill. The classic response to these temporary scenarios is to shed input data. However, Load Shedding (LS) reduces application output accuracy because relevant data is discarded before it is processed. Further, LS rates tend to be proportional to the input rate of the stream, so high data rates can lead to high data loss during overload events.
For many classes of applications, this can have a particularly negative impact on the quality of the output result, because shed data is never processed.
This thesis presents a new approach, Computational Shedding (CS), to the problem of maintaining application result accuracy while attempting to forgo input data loss during transient bursty data events.
Rather than shedding input data within the stream, we propose to adapt the application and temporarily shed tasks or subtasks of the executing application, reducing message processing costs in the stream. This mechanism provides an opportunity to temporarily increase processing capacity, thereby forgoing the need for deliberate data loss during a bursty data event.
We have evaluated this approach against traditional LS techniques in a number of ways, including application output accuracy and application processing duration. In our experiments, we found that, in suitable applications, subtasks can be discarded for a time while the remaining subtasks continue to produce a valid, if imprecise, result. CS was compared to LS alternatives, which simply do not process any discarded data. The results of our evaluation show that CS leads to more timely and accurate results than LS alternatives.
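The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the names `Subtask`, `process`, and `drain`, the `essential` flag, and the queue-length threshold `high_watermark` are all hypothetical, chosen only to show how shedding subtasks differs from shedding input. Every queued message is still processed, but while the queue is longer than the threshold only the essential subtasks run, so each message costs less.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Subtask:
    """One unit of per-message work (hypothetical structure)."""
    name: str
    run: Callable[[float], float]
    essential: bool  # assumption: essential subtasks are never shed


def process(message: float, subtasks: List[Subtask], overloaded: bool) -> Dict[str, float]:
    """Run every subtask normally; under overload, skip the non-essential ones."""
    results: Dict[str, float] = {}
    for task in subtasks:
        if overloaded and not task.essential:
            continue  # shed computation, not the message itself
        results[task.name] = task.run(message)
    return results


def drain(queue: Deque[float], subtasks: List[Subtask], high_watermark: int) -> List[Dict[str, float]]:
    """Process all queued messages, shedding subtasks while the queue is long."""
    outputs: List[Dict[str, float]] = []
    while queue:
        # Assumption: overload is detected by a simple queue-length threshold.
        overloaded = len(queue) > high_watermark
        outputs.append(process(queue.popleft(), subtasks, overloaded))
    return outputs


if __name__ == "__main__":
    tasks = [
        Subtask("core", lambda x: x + 1.0, essential=True),
        Subtask("extra", lambda x: x * x, essential=False),
    ]
    for result in drain(deque([1.0, 2.0, 3.0, 4.0, 5.0]), tasks, high_watermark=2):
        print(result)
```

In this sketch, a burst of five messages yields five results: the early ones are imprecise (only the essential subtask ran), and full processing resumes once the backlog falls to the threshold. A load-shedding policy would instead have dropped some of those messages entirely.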
Description: APPROVED
Author: GUERIN, DAVID
Sponsor: TCD
Advisor: Barrett, Edmond
Publisher: Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science
Type of material: Thesis
Availability: Full text available
Related items
Showing items related by title, author, creator and subject.
- Peer assisted multicast streaming on-demand applications
  O'Neill, John Paul (Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2016)
  On-demand multimedia streaming applications allow users to select and view content at a time of their choosing and to control the playback of this content. This level of control typically requires that each user receives ...
- Adaptable peer-to-peer internet live media streaming
  Biskupski, Bartosz (Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2009)
  Media streaming is an approach to delivering media, which may consist of video and audio, from a provider to viewers. Media streaming enables simultaneous delivery and playback of media and thus provides an alternative to ...