
    Feature Selection for Cancer Classification Based on Support Vector Machine

    • Abstract: Building an effective cancer classification model from gene expression profiles depends on accurately identifying the set of feature genes that determines the class of each sample. Feature selection is therefore an essential step in cancer classification with DNA microarrays, because the number of candidate genes is large while the number of samples is relatively small. This work addresses the selection of a small subset of genes for classification from gene expression profiles and proposes a two-step feature selection method. The first step uses a new class separability criterion proposed in this paper to remove genes that are irrelevant to the classification task; a support vector machine with a radial basis function kernel is then used to validate the classification performance of the selected genes in distinguishing different tissue types. The second step removes redundant genes by pair-wise redundancy analysis followed by sensitivity analysis based on the support vector machine classifier. The two steps are applied to the selection of feature genes for classifying subtypes of human acute leukemia from gene expression profiles. Compared with the baseline method, a better and more compact gene subset is obtained, which shows the feasibility and effectiveness of the proposed method.
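    The filter-then-prune structure of the two steps can be illustrated with a minimal sketch. The abstract does not give the formula of the new separability criterion or the details of the sensitivity analysis, so everything below is an assumption made for illustration: a signal-to-noise score stands in for the criterion, the absolute Pearson correlation stands in for the pair-wise redundancy measure, the thresholds `min_score` and `max_corr` are hypothetical, the toy expression values are invented, and the support vector machine validation and sensitivity analysis are left out entirely.

```python
from statistics import mean, pstdev


def separability(values, labels):
    """Stand-in score for one gene: |mean0 - mean1| / (sd0 + sd1).

    This is NOT the criterion proposed in the paper, only a common
    signal-to-noise style score used here for illustration.
    """
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    spread = pstdev(a) + pstdev(b)
    return abs(mean(a) - mean(b)) / spread if spread > 0 else 0.0


def pearson(x, y):
    """Pearson correlation between two equally long value lists."""
    mx, my = mean(x), mean(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def select_genes(expr, labels, min_score=1.0, max_corr=0.95):
    """expr maps a gene name to its expression values, one per sample."""
    # Step 1: drop genes whose separability score is below the threshold,
    # then rank the survivors from most to least separable.
    scores = {g: separability(v, labels) for g, v in expr.items()}
    ranked = sorted((g for g in scores if scores[g] >= min_score),
                    key=lambda g: -scores[g])
    # Step 2: greedy pair-wise redundancy filter. A gene is kept only if it
    # is not too strongly correlated with any higher-ranked gene already kept.
    kept = []
    for g in ranked:
        if all(abs(pearson(expr[g], expr[k])) <= max_corr for k in kept):
            kept.append(g)
    return kept


# Invented toy data: six samples, three per class.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "g_a": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],  # strongly separates the classes
    "g_b": [1.0, 1.5, 0.5, 5.0, 5.5, 4.5],  # near-copy of g_a (redundant)
    "g_c": [3.0, 1.0, 2.0, 2.0, 3.0, 1.0],  # equal class means (irrelevant)
    "g_d": [1.0, 3.0, 2.0, 4.0, 6.0, 5.0],  # weaker but less correlated
}
selected = select_genes(expr, labels)
print(selected)
```

    In this sketch the irrelevant gene is removed in the first step and the near-duplicate gene in the second. In the method described above, the surviving subset would additionally be checked with a support vector machine classifier.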

       
