A topological algorithm for identification of structural domains of proteins

来自 NCBI

阅读量:

34

作者:

F Emmert-StreibA Mushegian

展开

摘要:

p pBackground/p pIdentification of the structural domains of proteins is important for our understanding of the organizational principles and mechanisms of protein folding, and for insights into protein function and evolution. Algorithmic methods of dissecting protein of known structure into domains developed so far are based on an examination of multiple geometrical, physical and topological features. Successful as many of these approaches are, they employ a lot of heuristics, and it is not clear whether they illuminate any deep underlying principles of protein domain organization. Other well-performing domain dissection methods rely on comparative sequence analysis. These methods are applicable to sequences with known and unknown structure alike, and their success highlights a fundamental principle of protein modularity, but this does not directly improve our understanding of protein spatial structure./p pResults/p pWe present a novel graph-theoretical algorithm for the identification of domains in proteins with known three-dimensional structure. We represent the protein structure as an undirected, unweighted and unlabeled graph whose nodes correspond to the secondary structure elements and edges represent physical proximity of at least one pair of alpha carbon atoms from two elements. Domains are identified as constrained partitions of the graph, corresponding to sets of vertices obtained by the maximization of the cycle distributions found in the graph. When a partition is found, the algorithm is iteratively applied to each of the resulting subgraphs. The decision to accept or reject a tentative cut position is based on a specific classifier. The algorithm is applied iteratively to each of the resulting subgraphs and terminates automatically if partitions are no longer accepted. The distribution of cycles is the only type of information on which the decision about protein dissection is based. Despite the barebone simplicity of the approach, our algorithm approaches the best heuristic algorithms in accuracy./p pConclusion/p pOur graph-theoretical algorithm uses only topological information present in the protein structure itself to find the domains and does not rely on any geometrical or physical information about protein molecule. Perhaps unexpectedly, these drastic constraints on resources, which result in a seemingly approximate description of protein structures and leave only a handful of parameters available for analysis, do not lead to any significant deterioration of algorithm accuracy. It appears that protein structures can be rigorously treated as topological rather than geometrical objects and that the majority of information about protein domains can be inferred from the coarse-grained measure of pairwise proximity between elements of secondary structure elements./p

展开

DOI:

10.1186/1471-2105-8-237

被引量:

41

年份:

2007

通过文献互助平台发起求助,成功后即可免费获取论文全文。

相似文献

参考文献

引证文献

来源期刊

BMC Bioinformatics
2007-07-01

引用走势

2011
被引量:11

辅助模式

0

引用

文献可以批量引用啦~
欢迎点我试用!

关于我们

百度学术集成海量学术资源,融合人工智能、深度学习、大数据分析等技术,为科研工作者提供全面快捷的学术服务。在这里我们保持学习的态度,不忘初心,砥砺前行。
了解更多>>

友情链接

百度云百度翻译

联系我们

合作与服务

期刊合作 图书馆合作 下载产品手册

©2025 Baidu 百度学术声明 使用百度前必读

引用