A two step method to identify clinical outcome relevant genes with microarray data
Abstract:
With advances in microarray technology, many biomarker selection approaches have been proposed for cancer diagnosis. Marker sets are selected by scoring genes for how well they discriminate between disease classes [1–4], or genes are ranked by significance analysis without reference to a classification task. However, there is a pressing need for methods that integrate prior biological knowledge into the gene selection process. In this study, we propose to identify genes primarily in terms of their relevance to the diagnostic outcome. Because gene expression is a combined effect, the microarray data are decomposed with singular value decomposition (SVD), and the eigenvectors that correspond to the biological effect of clinical outcomes are identified. Genes that play important roles in determining this biological effect are then detected. Genes are thus identified by the strength of their association with clinical outcomes, and the relationship between genes and clinical outcomes is analyzed. Monte Carlo simulations are then used to fine-tune the selected gene set in terms of classification accuracy. The approach was tested on four public data sets. Comparative studies show that the selected genes achieved higher classification accuracies. Graphical analysis shows that they are closely related to the cancer class. Statistical simulation shows that the gene set found by the proposed method is also less variable and comparatively invariant to external influences. The biological relevance of the selected genes is further discussed and validated through a literature study and analysis of biological databases.
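The SVD step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how outcome-related eigenvectors are identified or how genes are scored, so everything specific here is an assumption for illustration only: the function names `select_outcome_genes` and `make_demo_data`, the use of absolute Pearson correlation with the class labels to pick a component, the loading-times-singular-value gene score, and the synthetic data. The Monte Carlo fine-tuning step is not sketched.

```python
import numpy as np


def select_outcome_genes(X, y, top_k=10):
    """Rank genes by their loading on the SVD component most related to y.

    X: genes x samples expression matrix; y: one binary label per sample.
    Hypothetical helper, not the paper's exact procedure.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center each gene across samples so the SVD captures variation.
    Xc = X - X.mean(axis=1, keepdims=True)
    # U: gene loadings, s: singular values, Vt: per-sample patterns.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Ignore numerically null components (centering removes one rank).
    keep = s > 1e-10 * s[0]
    # Absolute Pearson correlation of each sample pattern with the outcome
    # (assumed criterion for "corresponds to the clinical outcome").
    yc = y - y.mean()
    Vc = Vt - Vt.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(Vc, axis=1) * np.linalg.norm(yc)
    corr = np.zeros(len(s))
    corr[keep] = np.abs(Vc[keep] @ yc) / norms[keep]
    best = int(np.argmax(corr))
    # Score genes by the magnitude of their contribution to that component.
    scores = np.abs(U[:, best]) * s[best]
    ranked = np.argsort(scores)[::-1][:top_k]
    return ranked, best, float(corr[best])


def make_demo_data(seed=0):
    """Synthetic data: 200 genes, 40 samples, genes 0-9 shifted in class 1."""
    rng = np.random.default_rng(seed)
    y = np.array([0] * 20 + [1] * 20)
    X = rng.normal(size=(200, 40))
    X[:10, y == 1] += 4.0
    return X, y


if __name__ == "__main__":
    X, y = make_demo_data()
    ranked, best, corr = select_outcome_genes(X, y, top_k=10)
    print("component:", best, "abs. correlation with outcome:", round(corr, 3))
    print("selected genes:", sorted(ranked.tolist()))
```

On real data the outcome-related component need not be the first one, which is why the sketch searches all non-null components instead of taking the leading singular vector.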
DOI:
10.1016/j.jbi.2010.11.007
Year:
2011































