
dc.contributor.author: GILL, MICHAEL [en]
dc.contributor.author: MORRIS, DEREK [en]
dc.contributor.author: HERON, ELIZABETH [en]
dc.contributor.author: CORVIN, AIDEN [en]
dc.contributor.author: PINTO, CARLOS [en]
dc.date.accessioned: 2011-05-30T12:08:53Z
dc.date.available: 2011-05-30T12:08:53Z
dc.date.issued: 2011 [en]
dc.date.submitted: 2011 [en]
dc.identifier.citation: Bridges M, Heron E, O'Dushlaine, Segurado R, The International Schizophrenia Consortium (ISC), Morris DW, Corvin A, Gill M, Pinto C., Genetic Classification of Populations using Supervised Learning, PLoS ONE, 6, 5, 2011, e14802 [en]
dc.identifier.other: Y [en]
dc.description: PUBLISHED [en]
dc.description.abstract: There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large-scale genome-wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results, that a supervised learning approach should be the method of choice when classifying individuals into predefined populations, particularly in quality control for large-scale genome-wide association studies. [en]
dc.format.extent: e14802 [en]
dc.language.iso: en [en]
dc.relation.ispartofseries: PLoS ONE [en]
dc.relation.ispartofseries: 6 [en]
dc.relation.ispartofseries: 5 [en]
dc.rights: Y [en]
dc.subject: Genetics [en]
dc.subject: SAMPLE COVARIANCE MATRICES [en]
dc.title: Genetic Classification of Populations using Supervised Learning. [en]
dc.type: Journal Article [en]
dc.type.supercollection: scholarly_publications [en]
dc.type.supercollection: refereed_publications [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/mgill [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/morrisdw [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/capinto [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/acorvin [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/eaheron [en]
dc.identifier.rssinternalid: 73327 [en]
dc.identifier.doi: http://dx.doi.org/10.1371/journal.pone.0014802 [en]
dc.subject.TCDTheme: Genes & Society [en]
dc.subject.TCDTheme: Neuroscience [en]
dc.identifier.rssuri: http://dx.doi.org/10.1371/journal.pone.0014802 [en]
dc.contributor.sponsor: Science Foundation Ireland (SFI) [en]
dc.contributor.sponsor: Wellcome Trust [en]
dc.identifier.uri: http://hdl.handle.net/2262/56190

