MRFalign: Protein Homology Detection through Alignment of Markov Random Fields (Protein Homology Detection Using MRF-MRF Alignment)

Journal Title: PLoS Computational Biology, 2014, Vol.10(3), p.e1003500
Main Author: Ma, Jianzhu
Other Authors: Wang, Sheng , Wang, Zhiyong , Xu, Jinbo
Format: Electronic Article
Language: English
Subjects: Research Article ; Biology
ID: ISSN: 1553-734X ; E-ISSN: 1553-7358 ; DOI: 10.1371/journal.pcbi.1003500
Staff View
recordid: plos10.1371/journal.pcbi.1003500
title: MRFalign: Protein Homology Detection through Alignment of Markov Random Fields (Protein Homology Detection Using MRF-MRF Alignment)
format: Article
creator:
  • Ma, Jianzhu
  • Wang, Sheng
  • Wang, Zhiyong
  • Xu, Jinbo
subjects:
  • Research Article
  • Biology
ispartof: 2014, Vol.10(3), p.e1003500
description: Sequence-based protein homology detection has been extensively studied, and so far the most sensitive methods are based on comparison of protein sequence profiles, which are derived from a multiple sequence alignment (MSA) of sequence homologs in a protein family. A sequence profile is usually represented as a position-specific scoring matrix (PSSM) or an HMM (Hidden Markov Model), and accordingly PSSM-PSSM or HMM-HMM comparison is used for homolog detection. This paper presents a new homology detection method, MRFalign, consisting of three key components: 1) a Markov Random Field (MRF) representation of a protein family; 2) a scoring function measuring the similarity of two MRFs; and 3) an efficient ADMM (Alternating Direction Method of Multipliers) algorithm for aligning two MRFs. Unlike an HMM, which can model only very short-range residue correlations, an MRF can model long-range residue interaction patterns and thus encodes information about the global 3D structure of a protein family. Consequently, MRF-MRF comparison should be much more sensitive for remote homology detection than HMM-HMM or PSSM-PSSM comparison. Experiments confirm that MRFalign outperforms several popular HMM- or PSSM-based methods in terms of both alignment accuracy and remote homology detection, and that MRFalign works particularly well for mainly beta proteins. For example, tested on the benchmark SCOP40 (8353 proteins) for homology detection, PSSM-PSSM and HMM-HMM succeed on 48% and 52% of proteins, respectively, at the superfamily level, and on 15% and 27% of proteins, respectively, at the fold level. In contrast, MRFalign succeeds on 57.3% and 42.5% of proteins at the superfamily and fold levels, respectively. This study implies that long-range residue interaction patterns are very helpful for sequence-based homology detection. The software is available for download at http://raptorx.uchicago.edu/download/. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2–5.
language: eng
source:
identifier: ISSN: 1553-734X ; E-ISSN: 1553-7358 ; DOI: 10.1371/journal.pcbi.1003500
fulltext: fulltext
issn:
  • 1553-734X
  • 1553-7358
display
type: article
title: MRFalign: Protein Homology Detection through Alignment of Markov Random Fields (Protein Homology Detection Using MRF-MRF Alignment)
creator: Ma, Jianzhu ; Wang, Sheng ; Wang, Zhiyong ; Xu, Jinbo
contributor: Lengauer, Thomas (Editor)
ispartof: 2014, Vol.10(3), p.e1003500
subject: Research Article ; Biology
description: Sequence-based protein homology detection has been extensively studied, but it remains very challenging for remote homologs with divergent sequences. So far the most sensitive methods employ HMM-HMM comparison, which models a protein family using an HMM (Hidden Markov Model) and then detects homologs using HMM-HMM alignment. An HMM cannot model long-range residue interaction patterns and thus carries very little information about the global 3D structure of a protein family. As such, HMM comparison is not sensitive enough for distantly related homologs. In this paper, we present an MRF-MRF comparison method for homology detection. In particular, we model a protein family using Markov Random Fields (MRFs) and then detect homologs by MRF-MRF alignment. Compared to HMMs, MRFs are able to model long-range residue interaction patterns and thus contain information about the overall 3D structure of a protein family. Consequently, MRF-MRF comparison is much more sensitive than HMM-HMM comparison. To implement MRF-MRF comparison, we have developed a new scoring function to measure the similarity of two MRFs and an efficient ADMM algorithm to optimize the scoring function. Experiments confirm that MRF-MRF comparison indeed outperforms HMM-HMM comparison in terms of both alignment accuracy and remote homology detection, especially for mainly beta proteins.
language: eng
version: 11
lds50: peer_reviewed
addata
au: Ma, Jianzhu ; Wang, Sheng ; Wang, Zhiyong ; Xu, Jinbo
addau: Lengauer, Thomas
atitle: MRFalign: Protein Homology Detection through Alignment of Markov Random Fields (Protein Homology Detection Using MRF-MRF Alignment)
risdate: 20140327
volume: 10
issue: 3
spage: e1003500
pages: e1003500
issn: 1553-734X
eissn: 1553-7358
genre: article
ristype: JOUR
cop: San Francisco, USA
pub: Public Library of Science
doi: 10.1371/journal.pcbi.1003500
oa: free_for_read
date: 2014-03-27