ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2020, Vol. 57 ›› Issue (8): 1617-1626. doi: 10.7544/issn1000-1239.2020.20200496

Special topic: 2020 Data Mining and Knowledge Discovery

• Artificial Intelligence •



  • Published: 2020-08-01

Linear Regularized Functional Logistic Model

Meng Yinfeng1, Liang Jiye2   

  1. School of Mathematical Sciences, Shanxi University, Taiyuan 030006
  2. Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006
  • Online: 2020-08-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61807022, 61876103, 61976184), the Projects of Key Research and Development Plan of Shanxi Province (201903D121162), and the Natural Science Foundation of Shanxi Province of China (201801D221168).

Abstract: Pattern recognition problems for functional data arise in many fields, such as medicine, economics, finance, biology, and meteorology. Classifiers with better generalization performance are therefore critical for accurately mining the knowledge hidden in functional data. To address the low generalization performance of the classical functional logistic model, a linear regularized functional logistic model based on functional principal component representation is proposed. The model is obtained by solving an optimization problem with two terms. The first term is constructed from the likelihood function of the training functional samples and controls classification performance on those samples. The second term is a regularization term that controls the complexity of the model. The two terms are combined as a linear weighted sum, which restricts the range of the regularization parameter and makes it convenient to give an empirically optimal value for it. Guided by this empirically optimal parameter, a logistic model with an appropriate number of functional principal components can then be selected for classifying functional data. Experimental results show that the generalization performance of the selected linear regularized functional logistic model is better than that of the classical functional logistic model.

Key words: functional data, functional principal component analysis, basis representation, logistic regression, linear regularization
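
The abstract does not state the objective explicitly, so the following is only a minimal sketch of the idea it describes, under several assumptions: the curves are observed on a common grid, functional principal component scores are approximated by an SVD of the centered data matrix, the regularization term is a squared L2 norm of the coefficients, and the two terms are weighted by `1 - lam` and `lam` with `lam` in [0, 1]. The function names (`fpca_scores`, `fit_linear_regularized_logistic`, `predict`), the gradient-descent solver, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def fpca_scores(X, k):
    """Standardized scores of discretized curves (rows of X) on the first k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt approximate the principal component functions on the grid.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T
    # Rescale each score column to unit variance so a fixed step size is safe.
    return scores / scores.std(axis=0)


def fit_linear_regularized_logistic(Z, y, lam, lr=0.1, n_iter=2000):
    """Minimize (1 - lam) * mean negative log-likelihood + lam * ||w||^2 by gradient descent.

    The squared L2 penalty and the solver are assumptions made for illustration.
    """
    n, k = Z.shape
    w = np.zeros(k)
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))  # predicted class-1 probabilities
        grad_w = (1.0 - lam) * (Z.T @ (p - y)) / n + 2.0 * lam * w
        grad_b = (1.0 - lam) * np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(Z, w, b):
    """Assign class 1 when the linear predictor is positive."""
    return (Z @ w + b > 0).astype(int)


# Synthetic two-class functional data on a common grid (illustrative only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
y = np.repeat([0, 1], 100)
X = (np.sin(2 * np.pi * t)
     + np.outer(y, np.cos(2 * np.pi * t))
     + 0.3 * rng.standard_normal((200, 50)))

Z = fpca_scores(X, k=3)
w, b = fit_linear_regularized_logistic(Z, y, lam=0.1)
accuracy = np.mean(predict(Z, w, b) == y)
print(f"training accuracy: {accuracy:.2f}")
```

Because the weight `lam` lives in the bounded interval [0, 1], a single value can be fixed and reused while the number of components `k` is varied; the selection procedure the abstract mentions is not reproduced here.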