
    Intuitionistic Fuzzy Deep Autoencoder Approach for Feature Selection

    • Abstract: High-dimensional datasets are increasingly used across many fields. Their high dimensionality and heavy feature redundancy tend to cause model overfitting and raise computational cost. Feature selection extracts an informative subset of the features, reducing dimensionality and improving model interpretability. Although deep neural networks have shown potential for feature extraction, their performance degrades in the presence of noise and outliers. To address these challenges, a novel feature selection method based on an intuitionistic fuzzy deep autoencoder (IFDAE) is proposed. First, the method fuses membership and non-membership degree information into intuitionistic fuzzy weights that handle uncertainty and suppress the influence of noise and outliers. Next, it learns the highly nonlinear latent representations of a deep autoencoder, uses weight transfer to carry pre-trained knowledge into the target network, and applies a structured sparse norm as a regularizer on the connection weight matrix between the input layer and the first hidden layer; the resulting structurally sparse weight matrix yields a weight for each feature. Finally, to verify the effectiveness of IFDAE, experiments are conducted on 12 public datasets and 1 real-world schizophrenia dataset. The results show that the proposed method is superior in both reconstruction capability and classification performance.
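
    The abstract does not give the formulas, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions. It assumes the "structured sparse norm" is the L2,1 norm (a common choice for row-sparse feature selection, not confirmed by the abstract), that the first-layer weight matrix W has one row per input feature, and that the intuitionistic fuzzy weights are already available as one non-negative number per sample. All function names are hypothetical.

```python
import numpy as np


def l21_norm(W):
    """L2,1 norm: sum of the Euclidean norms of the rows of W.

    Assumption: W has shape (n_features, n_hidden), so each row holds the
    connections leaving one input feature.
    """
    return float(np.sum(np.linalg.norm(W, axis=1)))


def feature_scores(W):
    """Score each feature by the Euclidean norm of its row in W.

    A row driven to zero by the regularizer gives a score of zero,
    meaning the feature contributes nothing to the first hidden layer.
    """
    return np.linalg.norm(W, axis=1)


def select_top_k(W, k):
    """Return the indices of the k highest-scoring features."""
    scores = feature_scores(W)
    return np.argsort(-scores, kind="stable")[:k]


def weighted_reconstruction_loss(X, X_hat, sample_weights):
    """Weighted mean of per-sample squared reconstruction errors.

    Assumption: sample_weights stands in for the intuitionistic fuzzy
    weights; samples judged noisy or outlying would receive small weights.
    """
    per_sample = np.sum((X - X_hat) ** 2, axis=1)
    return float(np.sum(sample_weights * per_sample) / np.sum(sample_weights))


# Toy first-layer weights: 4 input features, 2 hidden units.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0],
              [0.0, 2.0]])

scores = feature_scores(W)    # row norms: 5, 0, 1, 2
penalty = l21_norm(W)         # 5 + 0 + 1 + 2 = 8
top2 = select_top_k(W, 2)     # features 0 and 3
```

    In a full training loop the penalty would be added to the weighted reconstruction loss with a trade-off coefficient, and the features would be ranked by their row norms once training ends.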

       
