Multimodal continuous turn-taking prediction using multiscale RNNs
Citation:
Roddy, M., Skantze, G., Harte, N. Multimodal continuous turn-taking prediction using multiscale RNNs, ICMI’18, October 16-20, 2018, Boulder, CO, USA
Abstract:
In human conversational interactions, turn-taking exchanges can be coordinated using cues from multiple modalities. To design spoken dialog systems that can conduct fluid interactions it is desirable to incorporate cues from separate modalities into turn-taking models. We propose that there is an appropriate temporal granularity at which modalities should be modeled. We design a multiscale RNN architecture to model modalities at separate timescales in a continuous manner. Our results show that modeling linguistic and acoustic features at separate temporal rates can be beneficial for turn-taking modeling. We also show that our approach can be used to incorporate gaze features into turn-taking models.
Sponsor:
Science Foundation Ireland
Grant Number:
13/RC/210
Author's Homepage:
http://people.tcd.ie/nharte
Author: Harte, Naomi
Other Titles:
ICMI 2018 - 20th ACM International Conference on Multimodal Interaction
Type of material:
Conference Paper
Availability:
Full text available
Keywords:
Spoken dialog systems, Turn-taking modeling
DOI:
http://dx.doi.org/10.1145/3242969.3242997