
dc.contributor.advisor: Gobl, Christer
dc.contributor.author: Kane, John
dc.date.accessioned: 2016-12-01T09:56:59Z
dc.date.available: 2016-12-01T09:56:59Z
dc.date.issued: 2012
dc.identifier.citation: John Kane, 'Tools for analysing the voice : developments in glottal source and quality analysis', [thesis], Trinity College (Dublin, Ireland). Centre for Language and Communication Studies, 2012, pp 257
dc.identifier.other: THESIS 10336
dc.description.abstract: This thesis documents a range of research carried out on the topic of glottal source and voice quality analysis. Initially, a review is given of the physiological and acoustic correlates of different vocal settings. This is followed by a discussion of the importance of glottal source and voice quality variation in spoken communication, and the impact of modelling these aspects on speech technology. Despite the potential benefit of acoustic characterisation of the glottal source for speech technology, existing algorithms often suffer from a lack of robustness. To address this, the present thesis describes and evaluates a set of novel algorithms aimed at improving robustness. The algorithms come under two headings: fine-grained, glottal synchronous methods and coarse-grained, voice quality detection methods. In terms of fine-grained methods, a new algorithm, SE-VQ, has been developed which is optimised for analysis of a range of voice qualities. While maintaining the precision of the state-of-the-art on neutral speech, the new algorithm is shown to significantly improve performance on creaky voice regions. SE-VQ is then utilised as part of a novel LF model based parameterisation method (DyProg-LF) of estimated glottal source signals. The dynamic programming algorithm used in DyProg-LF is shown to avoid the common problem of inconsistencies in parameter trajectories and to provide better parameterisation than the state-of-the-art, both on a carefully controlled dataset with manually obtained reference values and on a larger speech dataset. For coarse-grained methods, a new parameter, the Maxima Dispersion Quotient (MDQ), is proposed for discriminating breathy to tense voice. MDQ is shown to outperform existing parameters for discriminating the voice qualities, particularly for continuous speech, and also in terms of robustness to additive noise.
A new method for detecting creaky voice is also described, which utilises two parameters derived from the Linear Prediction-residual signal. These parameters are used as input features to a decision tree classifier, which is shown to significantly outperform the state-of-the-art on a range of speech data varying in terms of speaker, gender, language, recording condition and speaking style. Finally, a software package, the Voice analysis toolkit, which contains the algorithms developed as part of this thesis, has been made publicly available. This has been done to encourage usage of the newly developed algorithms in applied work and future algorithm evaluations.
dc.format: 1 volume
dc.language.iso: en
dc.publisher: Trinity College (Dublin, Ireland). Centre for Language and Communication Studies
dc.relation.isversionof: http://stella.catalogue.tcd.ie/iii/encore/record/C__Rb15661945
dc.subject: Linguistic, Speech and Communication Sciences, Ph.D.
dc.subject: Ph.D. Trinity College Dublin
dc.title: Tools for analysing the voice : developments in glottal source and quality analysis
dc.type: thesis
dc.type.supercollection: thesis_dissertations
dc.type.supercollection: refereed_publications
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: Doctor of Philosophy (Ph.D.)
dc.rights.ecaccessrights: openAccess
dc.format.extentpagination: pp 257
dc.description.note: TARA (Trinity’s Access to Research Archive) has a robust takedown policy. Please contact us if you have any concerns: rssadmin@tcd.ie
dc.identifier.uri: http://hdl.handle.net/2262/78032

