Why neural networks should not be used for HIV-1 protease cleavage site prediction, Bioinformatics 20(11

Source: Citeseer

Authors:

LY Thorsteinn Rögnvaldsson

Abstract:

Several papers have been published in which nonlinear machine learning algorithms, such as artificial neural networks, support vector machines and decision trees, have been used to model the specificity of the HIV-1 protease and to extract specificity rules. We show that the data set used in these studies is linearly separable and that applying nonlinear classifiers to this problem is a misuse of them. The best solution on this data set is achieved with a linear classifier such as the simple perceptron or the linear support vector machine, and it is straightforward to extract rules from the learned linear models. We identify key residues in peptides that are efficiently cleaved by the HIV-1 protease and list the most prominent rules, relating them to experimental results for the HIV-1 protease.

Motivation: Understanding HIV-1 protease specificity is important when designing HIV inhibitors, and several different machine learning algorithms have been applied to the problem. However, little progress has been made in understanding the specificity because nonlinear and overly complex models have been used.

Results: We show that the problem is much easier than previously reported and that linear classifiers such as the simple perceptron or linear support vector machines are at least as good predictors as nonlinear algorithms. We also show how sets of specificity rules can be generated from the resulting linear classifiers.
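
The abstract's central claim, that a simple perceptron is enough when the data are linearly separable, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the orthogonal (one-hot) encoding with 20 inputs per position, the function names `encode`, `train_perceptron`, `predict` and `top_rules`, and the six invented eight-residue peptides, which are toy inputs and not real cleavage data.

```python
# Toy illustration only: peptides and labels below are invented, not real data.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(peptide):
    """Orthogonal (one-hot) encoding: 20 indicator inputs per position."""
    vec = [0.0] * (len(peptide) * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec


def train_perceptron(samples, labels, max_epochs=1000):
    """Classic perceptron rule. Returns (weights, bias, converged)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified (or on the boundary)
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: data are separated
            return w, b, True
    return w, b, False


def predict(w, b, peptide):
    """Return +1 (cleaved) or -1 (not cleaved) for one peptide."""
    x = encode(peptide)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


def top_rules(w, k=5):
    """The k largest-magnitude weights as (position, residue, weight)."""
    n = len(AMINO_ACIDS)
    ranked = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)[:k]
    return [(i // n + 1, AMINO_ACIDS[i % n], w[i]) for i in ranked]


# Hypothetical octamers: +1 = cleaved, -1 = not cleaved (invented labels).
peptides = ["AAAFAAAA", "GSLFPKDE", "KRTFMNQW",
            "AAAGAAAA", "GSLKPKDE", "KRTDMNQW"]
labels = [1, 1, 1, -1, -1, -1]

w, b, converged = train_perceptron([encode(p) for p in peptides], labels)
rules = top_rules(w, k=3)
```

If the perceptron completes a pass over the training set with no mistakes, that set is linearly separable; failing to converge within `max_epochs` does not by itself prove the opposite. The largest-magnitude weights point to the position and residue pairs that drive the decision, which is one simple way to read rules off a linear model.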

Citations:

60

Year:

2004

Citation trend: 2005, 9 citations
