Zhang Qiang, Yang Jibin, Zhang Xiongwei, Cao Tieyong, Zheng Changyan. CS-Softmax: A Cosine Similarity-Based Softmax Loss Function[J]. Journal of Computer Research and Development, 2022, 59(4): 936-949. DOI: 10.7544/issn1000-1239.20200879
1(Graduate School, Army Engineering University, Nanjing 210007)
2(School of Command and Control Engineering, Army Engineering University, Nanjing 210007)
3(High-Tech Institute, Qingzhou, Shandong 262500)
Funds: This work was supported by the National Natural Science Foundation of China (61602031), the Fundamental Research Funds for the Central Universities (FRF-BD-19-012A, FRF-IDRY-19-023), and the National Key
Convolutional neural network (CNN)-based classification frameworks have achieved remarkable results in pattern classification tasks, where the Softmax function combined with the cross-entropy loss (Softmax loss) enables CNNs to learn separable embeddings. However, for some multi-class problems, training with the Softmax loss does not explicitly encourage intra-class compactness or inter-class separability, so it rarely yields embeddings with strong discriminability, which limits further performance gains. To enhance the discriminability of learned embeddings, a cosine similarity-based Softmax (CS-Softmax) loss function is proposed. Without changing the network structure, the CS-Softmax loss builds on the Softmax loss by introducing a margin factor, a scale factor, and a weight update factor to compute the positive and negative cosine similarities between embeddings and the class weights, thereby enhancing intra-class compactness and inter-class separability. Moreover, the size of the classification decision margin can be adjusted flexibly. These properties further strengthen the discriminability of the embeddings learned by CNNs. Classification experiments on typical audio and image datasets show that the CS-Softmax loss effectively improves classification performance without increasing computational complexity, achieving accuracies of 99.81%, 95.46%, and 76.46% on the MNIST, CIFAR10, and CIFAR100 classification tasks, respectively.
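Since the abstract only names the ingredients (cosine similarity between embeddings and class weights, a margin factor, and a scale factor) without giving the paper's exact formulation, the following is a minimal PyTorch sketch of a cosine-similarity-based softmax loss in the same spirit, assuming a CosFace-style additive margin on the target-class ("positive") similarity. The class name CosineSoftmaxLoss, the default values scale=30.0 and margin=0.35, and the omission of the weight update factor (whose definition is not given here) are illustrative assumptions, not the paper's CS-Softmax definition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineSoftmaxLoss(nn.Module):
    """Sketch of a cosine-similarity-based softmax loss (assumed formulation).

    Logits are s * (cos(theta_j) - m * 1[j == y]), i.e. the margin m is
    subtracted only from the target-class cosine before cross-entropy.
    """
    def __init__(self, embedding_dim, num_classes, scale=30.0, margin=0.35):
        super().__init__()
        self.scale = scale    # scale factor s (assumed default)
        self.margin = margin  # margin factor m (assumed default)
        # one weight vector per class; L2-normalized at use time
        self.weight = nn.Parameter(torch.empty(num_classes, embedding_dim))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, embeddings, labels):
        # cosine similarity between normalized embeddings and class weights
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        # subtract the margin from the target-class (positive) similarity only
        one_hot = F.one_hot(labels, cosine.size(1)).to(cosine.dtype)
        logits = self.scale * (cosine - one_hot * self.margin)
        return F.cross_entropy(logits, labels)

# Usage with a hypothetical CNN backbone producing 128-d embeddings:
loss_fn = CosineSoftmaxLoss(embedding_dim=128, num_classes=10)
emb = torch.randn(32, 128)              # stand-in for backbone output
labels = torch.randint(0, 10, (32,))
loss = loss_fn(emb, labels)
loss.backward()
```

Because both embeddings and class weights are normalized, the logits depend only on angular relationships; the scale factor controls the sharpness of the softmax distribution, while the margin widens the decision boundary, matching the abstract's stated goals of intra-class compactness and inter-class separability.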