    Citation: Wu Jiansheng, Feng Qiaoyu, Yuan Jingzhou, Hu Haifeng, Zhou Jiate, Gao Hao. Predicting Biological Functions of G Protein-Coupled Receptors Based on Fast Multi-Instance Multi-Label Learning[J]. Journal of Computer Research and Development, 2018, 55(8): 1674-1682. DOI: 10.7544/issn1000-1239.2018.20180361

    Predicting Biological Functions of G Protein-Coupled Receptors Based on Fast Multi-Instance Multi-Label Learning

    Abstract: G protein-coupled receptors (GPCRs) constitute the largest family of human membrane proteins and are important targets of many marketed drugs. Accurate annotation of the biological functions of GPCRs is key to understanding the biological processes they take part in and the mechanisms by which drugs act on them. In our previous work, we showed that protein function prediction can be formulated as a multi-instance multi-label learning (MIML) task. In this paper, we propose a model for predicting the biological functions of GPCRs that uses a fast MIML method, MIMLfast, together with a new hybrid feature. The hybrid feature describes GPCR domains and combines amino acid triplet information, amino acid correlation information, evolutionary information, secondary structure correlation information, signal peptide information, disordered residue information, and physicochemical properties. Experimental results show that the model achieves good performance, outperforming state-of-the-art multi-instance multi-label learning methods, multi-label learning methods, and CAFA protein function prediction methods.
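    To make the MIML formulation concrete, the sketch below shows one way a protein could be stored as a bag of instances (one feature vector per domain) paired with a set of function labels. It is an illustration only, not the authors' pipeline and not MIMLfast itself. The abstract does not say how the amino acid triplet feature is encoded, so the plain overlapping-triplet frequency used here is an assumption, and the names `Bag`, `make_bag`, and `triplet_frequencies`, the toy sequences, and the labels are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Set

# The 20 standard amino acids and every ordered triplet of them (20^3 = 8000).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TRIPLETS: List[str] = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
TRIPLET_INDEX: Dict[str, int] = {t: i for i, t in enumerate(TRIPLETS)}


def triplet_frequencies(domain_seq: str) -> List[float]:
    """Relative frequency of each overlapping triplet in one domain sequence.

    ASSUMPTION: a plain frequency vector stands in for the paper's triplet
    feature, whose exact encoding the abstract does not describe.
    """
    counts = [0.0] * len(TRIPLETS)
    total = 0
    for i in range(len(domain_seq) - 2):
        idx = TRIPLET_INDEX.get(domain_seq[i:i + 3])
        if idx is not None:  # skip triplets containing non-standard residues
            counts[idx] += 1.0
            total += 1
    return [c / total for c in counts] if total else counts


@dataclass
class Bag:
    """One protein in MIML form: several instances, several labels."""
    protein_id: str
    instances: List[List[float]]  # one feature vector per domain
    labels: Set[str]              # all function labels of the protein


def make_bag(protein_id: str, domain_seqs: List[str], labels: Set[str]) -> Bag:
    """Build a bag by featurizing each domain sequence independently."""
    return Bag(protein_id, [triplet_frequencies(s) for s in domain_seqs], set(labels))


# Hypothetical toy protein with two domains and two function labels.
bag = make_bag("toy_gpcr", ["ACDAC", "GGG"], {"label_a", "label_b"})
print(len(bag.instances), len(bag.instances[0]), sorted(bag.labels))
```

    The point of the structure is that labels attach to the whole bag rather than to any single domain, which is what distinguishes MIML from ordinary multi-label learning on one feature vector per protein.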

       
