ViSQOL: an objective speech quality model

HARTE, NAOMI; KOKARAM, ANIL CHRISTOPHER; HINES, ANDREW

dc.contributor.author	HARTE, NAOMI	en
dc.contributor.author	KOKARAM, ANIL	en
dc.contributor.author	HINES, ANDREW	en
dc.contributor.author	KOKARAM, ANIL CHRISTOPHER	en
dc.date.accessioned	2015-12-09T12:41:36Z
dc.date.available	2015-12-09T12:41:36Z
dc.date.issued	2015	en
dc.date.submitted	2015	en
dc.identifier.citation	Hines A, Skoglund J, Kokaram A.C, Harte N, ViSQOL: an objective speech quality model, Eurasip Journal on Audio, Speech, and Music Processing, 2015, 1, 2015, 13-	en
dc.identifier.other	Y	en
dc.description	PUBLISHED	en
dc.description	Export Date: 27 August 2015	en
dc.description.abstract	This paper presents an objective speech quality model, ViSQOL, the Virtual Speech Quality Objective Listener. It is a signal-based, full-reference, intrusive metric that models human speech quality perception using a spectro-temporal measure of similarity between a reference and a test speech signal. The metric has been particularly designed to be robust for quality issues associated with Voice over IP (VoIP) transmission. This paper describes the algorithm and compares the quality predictions with the ITU-T standard metrics PESQ and POLQA for common problems in VoIP: clock drift, associated time warping, and playout delays. The results indicate that ViSQOL and POLQA significantly outperform PESQ, with ViSQOL competing well with POLQA. An extensive benchmarking against PESQ, POLQA, and simpler distance metrics using three speech corpora (NOIZEUS and E4 and the ITU-T P.Sup. 23 database) is also presented. These experiments benchmark the performance for a wide range of quality impairments, including VoIP degradations, a variety of background noise types, speech enhancement methods, and SNR levels. The results and subsequent analysis show that both ViSQOL and POLQA have some performance weaknesses and under-predict perceived quality in certain VoIP conditions. Both have a wider application and robustness to conditions than PESQ or more trivial distance metrics. ViSQOL is shown to offer a useful alternative to POLQA in predicting speech quality in VoIP scenarios.	en
dc.description.sponsorship	Andrew Hines thanks Google, Inc. for support. Thanks also to Yi Hu for sharing the full listener test MOS results and enhanced test files for the NOIZEUS database	en
dc.format.extent	13	en
dc.relation.ispartofseries	Eurasip Journal on Audio, Speech, and Music Processing	en
dc.relation.ispartofseries	2015	en
dc.relation.ispartofseries	1	en
dc.rights	Y	en
dc.subject	Objective speech quality; POLQA; P.863; PESQ; ViSQOL; NSIM	en
dc.subject.lcsh	Objective speech quality; POLQA; P.863; PESQ; ViSQOL; NSIM	en
dc.title	ViSQOL: an objective speech quality model	en
dc.type	Journal Article	en
dc.type.supercollection	scholarly_publications	en
dc.type.supercollection	refereed_publications	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/nharte	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/akokaram	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/ahines	en
dc.identifier.rssinternalid	105760	en
dc.identifier.doi	http://dx.doi.org/10.1186/s13636-015-0054-9	en
dc.rights.ecaccessrights	openAccess
dc.identifier.rssuri	http://www.scopus.com/inward/record.url?eid=2-s2.0-84930212854&partnerID=40&md5=9e6b8eb966a3dd6e2a497dd672a26f54	en
dc.identifier.uri	http://hdl.handle.net/2262/75238

Files in this item

Name:: s13636-015-0054-9.pdf
Size:: 1.800Mb
Format:: PDF

Name:: license.txt
Size:: 3.419Kb
Format:: Text file

This item appears in the following Collection(s)

Electronic & Electrical Eng (Scholarly Publications)
RSS Feeds
