Testing the Reliability of Genetic Methods of Species Identification via Simulation

Although genetic methods of species identification, especially DNA barcoding, are strongly debated, tests of these methods have been restricted to a few empirical cases for pragmatic reasons. Here we use simulation to test the performance of methods based on sequence comparison (BLAST and genetic di...

Journal Title: Systematic biology 2008-04, Vol.57 (2), p.216-230
Main Author: Ross, Howard A.
Other Authors: Murugan, Sumathi; Sibon Li, Wai Lok
Format: Electronic Article
Language: English
Subjects:
DNA
Source: Alma/SFX Local Collection
Publisher: England: Taylor & Francis
ID: ISSN: 1063-5157
Link: https://www.ncbi.nlm.nih.gov/pubmed/18398767
Staff View
recordid: cdi_proquest_miscellaneous_70489287
title: Testing the Reliability of Genetic Methods of Species Identification via Simulation
format: Article
creator:
  • Ross, Howard A.
  • Murugan, Sumathi
  • Sibon Li, Wai Lok
subjects:
  • Animals
  • Bar codes
  • Biological taxonomies
  • Biological variation
  • BLAST
  • Blasts
  • Computer Simulation
  • Datasets
  • DNA
  • DNA barcoding
  • DNA, Mitochondrial - genetics
  • Evolution
  • Evolutionary biology
  • Evolutionary genetics
  • Fungi - genetics
  • Genetic Speciation
  • Genetics
  • Invertebrates - genetics
  • Models, Genetic
  • Monophyly
  • phylogenetic
  • Phylogenetics
  • Sampling techniques
  • Sensitivity and Specificity
  • Simulation
  • Species
  • species identification
  • Taxonomy
  • Whales - genetics
ispartof: Systematic biology, 2008-04, Vol.57 (2), p.216-230
description: Although genetic methods of species identification, especially DNA barcoding, are strongly debated, tests of these methods have been restricted to a few empirical cases for pragmatic reasons. Here we use simulation to test the performance of methods based on sequence comparison (BLAST and genetic distance) and tree topology over a wide range of evolutionary scenarios. Sequences were simulated on a range of gene trees spanning almost three orders of magnitude in tree depth and in coalescent depth; that is, deep or shallow trees with deep or shallow coalescences. When the query's conspecific sequences were included in the reference alignment, the rate of positive identification was related to the degree to which different species were genetically differentiated. The BLAST, distance, and liberal tree-based methods returned higher rates of correct identification than did the strict tree-based requirement that the query was within, but not sister to, a single-species clade. Under this more conservative approach, ambiguous outcomes occurred in inverse proportion to the number of reference sequences per species. When the query's conspecific sequences were not in the reference alignment, only the strict tree-based approach was relatively immune to making false-positive identifications. Thresholds affected the rates at which false-positive identifications were made when the query's species was unrepresented in the reference alignment but did not otherwise influence outcomes. A conservative approach using the strict tree-based method should be used initially in large-scale identification systems, with effort made to maximize sequence sampling within species. Once the genetic variation within a taxonomic group is well characterized and the taxonomy resolved, then the choice of method used should be dictated by considerations of computational efficiency. The requirement for extensive genetic sampling may render these techniques inappropriate in some circumstances.
language: eng
source: Alma/SFX Local Collection
identifier: ISSN: 1063-5157
fulltext: fulltext
issn:
  • 1063-5157
  • 1076-836X


display
type: article
title: Testing the Reliability of Genetic Methods of Species Identification via Simulation
source: Alma/SFX Local Collection
creator: Ross, Howard A. ; Murugan, Sumathi ; Sibon Li, Wai Lok
contributor: Hedin, Marshal
identifier:
  • ISSN: 1063-5157
  • EISSN: 1076-836X
  • DOI: 10.1080/10635150802032990
  • PMID: 18398767
language: eng
publisher: England: Taylor & Francis
ispartof: Systematic biology, 2008-04, Vol.57 (2), p.216-230
rights: Copyright 2008 Society of Systematic Biologists
lds50: peer_reviewed
oa: free_for_read
search
general:
  • Taylor & Francis
  • Oxford University Press
creationdate: 2008
startdate: 200804
enddate: 200804
facets
toplevel:
  • peer_reviewed
  • online_resources
collection:
  • Istex
  • MEDLINE
  • MEDLINE (Ovid)
  • PubMed
  • CrossRef
  • ProQuest Health & Medical Complete (Alumni)
  • MEDLINE - Academic
delivery
delcategory: Remote Search Resource
addata
au:
  • Ross, Howard A.
  • Murugan, Sumathi
  • Sibon Li, Wai Lok
format: journal
genre: article
ristype: JOUR
atitle: Testing the Reliability of Genetic Methods of Species Identification via Simulation
jtitle: Systematic biology
addtitle: Syst Biol
date: 2008-04
risdate: 2008
volume: 57
issue: 2
spage: 216
epage: 230
pages: 216-230
issn: 1063-5157
eissn: 1076-836X
cop: England
pub: Taylor & Francis
pmid: 18398767
doi: 10.1080/10635150802032990
oa: free_for_read