ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 (Journal of Computer Research and Development) ›› 2022, Vol. 59 ›› Issue (4): 936-949. doi: 10.7544/issn1000-1239.20200879

• Artificial Intelligence •




CS-Softmax: A Cosine Similarity-Based Softmax Loss Function

Zhang Qiang1, Yang Jibin2, Zhang Xiongwei2, Cao Tieyong2, Zheng Changyan3   

  1 (Graduate School, Army Engineering University, Nanjing 210007); 2 (School of Command and Control Engineering, Army Engineering University, Nanjing 210007); 3 (High-Tech Institute, Qingzhou, Shandong 262500)
  • Online: 2022-04-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61602031), the Fundamental Research Funds for the Central Universities (FRF-BD-19-012A, FRF-IDRY-19-023), and the National Key 


Abstract: The convolutional neural network (CNN)-based classification framework has achieved strong performance in pattern classification tasks, where the Softmax function combined with the cross-entropy loss (the Softmax loss) enables CNNs to learn separable embeddings. However, because the Softmax loss does not encourage increasing intra-class compactness or inter-class separability, the embeddings it produces have limited discriminability, and on some multi-classification problems the performance is therefore hard to improve further. To enhance the discriminability of learned embeddings, a cosine similarity-based Softmax (CS-Softmax) loss function is proposed. Without changing the network structure, the CS-Softmax loss computes the positive similarity and the negative similarity between embeddings and the weights of the fully connected classification layer, so as to achieve the training objectives of intra-class compactness and inter-class separation. Theoretical analysis shows that the introduced parameters, such as the margin factor, scale factor, and weight update factor, make the size of each class's decision margin flexibly adjustable, increase intra-class compactness and inter-class separability, and thereby further enhance the discriminability of the embeddings learned by the CNN. Experimental results on typical audio and image datasets show that the CS-Softmax loss effectively improves multi-classification performance without increasing computational complexity, achieving classification accuracies of 99.81%, 95.46%, and 76.46% on the MNIST, CIFAR10, and CIFAR100 classification tasks, respectively.
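The core computation described in the abstract can be sketched in plain Python. This is a minimal illustration, not the paper's exact formulation: the parameter names `s` (scale factor) and `m` (margin factor) and the placement of the margin on the positive similarity follow common cosine-margin Softmax variants and are assumptions here, and the paper's weight update factor is omitted.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cs_softmax_loss(embedding, class_weights, label, s=30.0, m=0.35):
    """Sketch of a cosine-similarity-based Softmax (CS-Softmax style) loss.

    Assumptions (not from the paper): s is a scale factor, m is a margin
    factor subtracted from the positive (target-class) cosine similarity.
    """
    e = l2_normalize(embedding)
    # Cosine similarity between the embedding and each class weight vector.
    cos = [sum(a * b for a, b in zip(l2_normalize(w), e)) for w in class_weights]
    # Positive similarity gets the margin subtracted; negative similarities do not.
    logits = [s * (c - m) if j == label else s * c for j, c in enumerate(cos)]
    # Numerically stable cross-entropy over the scaled logits.
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[label]
```

Because the margin shrinks the target-class logit during training, the network must push each embedding closer to its own class weight (and away from the others) to drive the loss down, which is how the intra-class compactness and inter-class separation objectives are realized.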

Key words: pattern classification, convolutional neural networks (CNNs), loss function, Softmax, cosine similarity