APPLES: Scalable Distance-Based Phylogenetic Placement with or without Alignments


Journal Title: Systematic biology 2020, Vol.69 (3), p.566-578
Main Author: Balaban, Metin
Other Authors: Sarmashghi, Shahab , Mirarab, Siavash
Format: Electronic Article
Language: English
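Subjects: Algorithms; Base Sequence; Classification - methods; Distance-based methods; genome skimming; phylogenetic placement; Phylogeny; Software; Software for Systematics and Evolution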
Publisher: England: Oxford University Press
ID: ISSN: 1063-5157
Link: https://www.ncbi.nlm.nih.gov/pubmed/31545363
Staff View
recordid: cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7164367
title: APPLES: Scalable Distance-Based Phylogenetic Placement with or without Alignments
format: Article
creator:
  • Balaban, Metin
  • Sarmashghi, Shahab
  • Mirarab, Siavash
subjects:
  • Algorithms
  • Base Sequence
  • Classification - methods
  • Distance-based methods
  • genome skimming
  • phylogenetic placement
  • Phylogeny
  • Software
  • Software for Systematics and Evolution
ispartof: Systematic biology, 2020, Vol.69 (3), p.566-578
description: Placing a new species on an existing phylogeny has increasing relevance to several applications. Placement can be used to update phylogenies in a scalable fashion and can help identify unknown query samples using (meta-)barcoding, skimming, or metagenomic data. Maximum likelihood (ML) methods of phylogenetic placement exist, but these methods are not scalable to reference trees with many thousands of leaves, limiting their ability to enjoy benefits of dense taxon sampling in modern reference libraries. They also rely on assembled sequences for the reference set and aligned sequences for the query. Thus, ML methods cannot analyze data sets where the reference consists of unassembled reads, a scenario relevant to emerging applications of genome skimming for sample identification. We introduce APPLES, a distance-based method for phylogenetic placement. Compared to ML, APPLES is an order of magnitude faster and more memory efficient, and unlike ML, it is able to place on large backbone trees (tested for up to 200,000 leaves). We show that using dense references improves accuracy substantially so that APPLES on dense trees is more accurate than ML on sparser trees, where it can run. Finally, APPLES can accurately identify samples without assembled reference or aligned queries using k-mer-based distances, a scenario that ML cannot handle. APPLES is available publicly at github.com/balabanmetin/apples.
language: eng
identifier: ISSN: 1063-5157
fulltext: no_fulltext
issn:
  • 1063-5157
  • 1076-836X
url: https://www.ncbi.nlm.nih.gov/pubmed/31545363


contributor: Posada, David
identifier:
  • EISSN: 1076-836X
  • DOI: 10.1093/sysbio/syz063
  • PMID: 31545363
publisher: England: Oxford University Press
date: 2020-05-01
rights: The Author(s) 2019. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
peer reviewed: yes
open access: free to read
source type: Open Access Repository
collection:
  • MEDLINE
  • MEDLINE (Ovid)
  • MEDLINE - Academic
  • PubMed
  • PubMed Central (Full Participant titles)
  • CrossRef