Gene selection using support vector machines with non-convex penalty
Journal article
Abstract: With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. One current difficulty in interpreting microarray data comes from its inherently high-dimensional, low-sample-size nature. Therefore, robust and accurate gene selection methods are required to identify groups of genes that are differentially expressed across different samples, e.g. between cancerous and normal cells. Successful gene selection will help to classify different cancer types, lead to a better understanding of genetic signatures in cancers and improve treatment strategies. Although gene selection and cancer classification are two closely related problems, most existing approaches handle them separately by selecting genes prior to classification. We provide a unified procedure for simultaneous gene selection and cancer classification, achieving high accuracy in both aspects. In this paper we develop a novel type of regularization in support vector machines (SVMs) to identify important genes for cancer classification. A special non-convex penalty, called the smoothly clipped absolute deviation (SCAD) penalty, is imposed on the hinge loss function in the SVM. By systematically thresholding small estimates to zero, the new procedure eliminates redundant genes automatically and yields a compact and accurate classifier. A successive quadratic algorithm is proposed to convert the non-differentiable and non-convex optimization problem into easily solved systems of linear equations. The method is applied to two real datasets and has produced very promising results. MATLAB code is available upon request from the authors.
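The abstract names the SCAD penalty and the hinge loss but does not give their formulas. The sketch below uses the commonly cited piecewise form of the SCAD penalty with the conventional default a = 3.7. The function names `scad_penalty` and `scad_svm_objective`, the averaging of the hinge loss over samples, and the default value of `a` are assumptions made for illustration, not details taken from the record. It only evaluates the objective and does not implement the successive quadratic algorithm.

```python
import numpy as np


def scad_penalty(w, lam, a=3.7):
    """Elementwise SCAD penalty in its standard piecewise form (assumes lam > 0, a > 2)."""
    w_abs = np.abs(np.asarray(w, dtype=float))
    # Region 1: |w| <= lam, behaves like an L1 penalty.
    linear = lam * w_abs
    # Region 2: lam < |w| <= a*lam, quadratic blend that flattens out.
    quad = -(w_abs**2 - 2 * a * lam * w_abs + lam**2) / (2 * (a - 1))
    # Region 3: |w| > a*lam, constant, so large coefficients are not shrunk further.
    const = (a + 1) * lam**2 / 2
    return np.where(w_abs <= lam, linear, np.where(w_abs <= a * lam, quad, const))


def scad_svm_objective(w, b, X, y, lam, a=3.7):
    """Mean hinge loss plus summed SCAD penalty on the weights (illustrative scaling).

    X has shape (n_samples, n_genes); y holds labels in {-1, +1}.
    """
    margins = y * (X @ w + b)
    hinge = np.maximum(0.0, 1.0 - margins).mean()
    return hinge + scad_penalty(w, lam, a).sum()
```

Because the penalty is linear near zero, small coefficients can be driven exactly to zero, which is what removes genes from the classifier; because it is constant for large coefficients, strongly informative genes are not penalized more as their weights grow.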
Publication year: 2005
Authors: Hao Helen Zhang, Jeongyoun Ahn, X. Sheldon Lin, Cheolwoo Park
Publisher: Oxford University Press
Source: Bioinformatics
Keywords: Gene expression and cancer classification, Machine Learning in Bioinformatics, Bioinformatics and Genomic Networks
Other links: Bioinformatics (PDF)
Bioinformatics (HTML)
PubMed (HTML)
Open access: bronze
Volume: 22
Issue: 1
Pages: 88–95