Genetic Classification of Populations using Supervised Learning.

GILL, MICHAEL; MORRIS, DEREK; HERON, ELIZABETH; CORVIN, AIDEN; PINTO, CARLOS

dc.contributor.author	GILL, MICHAEL	en
dc.contributor.author	MORRIS, DEREK	en
dc.contributor.author	HERON, ELIZABETH	en
dc.contributor.author	CORVIN, AIDEN	en
dc.contributor.author	PINTO, CARLOS	en
dc.date.accessioned	2011-05-30T12:08:53Z
dc.date.available	2011-05-30T12:08:53Z
dc.date.issued	2011	en
dc.date.submitted	2011	en
dc.identifier.citation	Bridges M, Heron E, O'Dushlaine, Segurado R, The International Schizophrenia Consortium (ISC), Morris DW, Corvin A, Gill M, Pinto C., Genetic Classification of Populations using Supervised Learning, PLoS One, 6, 5, 2011, e14802	en
dc.identifier.other	Y	en
dc.description	PUBLISHED	en
dc.description.abstract	There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results, that a supervised learning approach should be the method of choice when classifying individuals into predefined populations, particularly in quality control for large scale genome wide association studies.	en
dc.format.extent	e14802	en
dc.language.iso	en	en
dc.relation.ispartofseries	PLoS One	en
dc.relation.ispartofseries	6	en
dc.relation.ispartofseries	5	en
dc.rights	Y	en
dc.subject	Genetics	en
dc.subject	SAMPLE COVARIANCE MATRICES	en
dc.title	Genetic Classification of Populations using Supervised Learning.	en
dc.type	Journal Article	en
dc.type.supercollection	scholarly_publications	en
dc.type.supercollection	refereed_publications	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/mgill	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/morrisdw	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/capinto	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/acorvin	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/eaheron	en
dc.identifier.rssinternalid	73327	en
dc.identifier.doi	http://dx.doi.org/10.1371/journal.pone.0014802	en
dc.subject.TCDTheme	Genes & Society	en
dc.subject.TCDTheme	Neuroscience	en
dc.identifier.rssuri	http://dx.doi.org/10.1371/journal.pone.0014802	en
dc.contributor.sponsor	Science Foundation Ireland (SFI)	en
dc.contributor.sponsor	Wellcome Trust	en
dc.identifier.uri	http://hdl.handle.net/2262/56190

Files in this item

Name: Genetic Classification of ...
Size: 680.3Kb
Format: PDF
Description: Published (publisher's copy) - ...

Name: license.txt
Size: 3.243Kb
Format: Text file

This item appears in the following Collection(s)

Psychiatry (Scholarly Publications)
RSS Feeds
