Zhang Qiang, Yang Jibin, Zhang Xiongwei, Cao Tieyong, Zheng Changyan. CS-Softmax: A Cosine Similarity-Based Softmax Loss Function[J]. Journal of Computer Research and Development, 2022, 59(4): 936-949. DOI: 10.7544/issn1000-1239.20200879
1(Graduate School, Army Engineering University, Nanjing 210007)
2(School of Command and Control Engineering, Army Engineering University, Nanjing 210007)
3(High-Tech Institute, Qingzhou, Shandong 262500)
Funds: This work was supported by the National Natural Science Foundation of China (61602031), the Fundamental Research Funds for the Central Universities (FRF-BD-19-012A, FRF-IDRY-19-023), and the National Key
Convolutional neural network (CNN)-based classification frameworks have achieved remarkable results in pattern classification tasks, where the Softmax function combined with the cross-entropy loss (Softmax loss) enables CNNs to learn separable embeddings. However, for some multi-class problems, training with the Softmax loss does not explicitly encourage intra-class compactness or inter-class separability, so it rarely yields embeddings with strong discriminability, which limits further performance gains. To enhance the discriminability of learned embeddings, a cosine similarity-based Softmax (CS-Softmax) loss function is proposed. Without changing the network structure, the CS-Softmax loss builds on the Softmax loss by introducing a margin factor, a scale factor, and a weight update factor to compute the positive and negative cosine similarities between embeddings and the class weights, thereby enhancing intra-class compactness and inter-class separability. Moreover, the size of the classification decision margin can be adjusted flexibly. These properties further strengthen the discriminability of the embeddings learned by CNNs. Classification experiments on typical audio and image datasets show that the CS-Softmax loss effectively improves classification performance without increasing computational complexity, achieving accuracies of 99.81%, 95.46%, and 76.46% on the MNIST, CIFAR10, and CIFAR100 classification tasks, respectively.
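Since the abstract only names the ingredients (cosine similarity between embeddings and class weights, a margin factor, and a scale factor) without giving the paper's exact formulation, the following is a minimal PyTorch sketch of a cosine-similarity-based softmax loss in the same spirit, assuming a CosFace-style additive margin on the target-class ("positive") similarity. The class name CosineSoftmaxLoss, the default values scale=30.0 and margin=0.35, and the omission of the weight update factor (whose definition is not given here) are illustrative assumptions, not the paper's CS-Softmax definition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineSoftmaxLoss(nn.Module):
    """Sketch of a cosine-similarity-based softmax loss (assumed formulation).

    Logits are s * (cos(theta_j) - m * 1[j == y]), i.e. the margin m is
    subtracted only from the target-class cosine before cross-entropy.
    """
    def __init__(self, embedding_dim, num_classes, scale=30.0, margin=0.35):
        super().__init__()
        self.scale = scale    # scale factor s (assumed default)
        self.margin = margin  # margin factor m (assumed default)
        # one weight vector per class; L2-normalized at use time
        self.weight = nn.Parameter(torch.empty(num_classes, embedding_dim))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, embeddings, labels):
        # cosine similarity between normalized embeddings and class weights
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        # subtract the margin from the target-class (positive) similarity only
        one_hot = F.one_hot(labels, cosine.size(1)).to(cosine.dtype)
        logits = self.scale * (cosine - one_hot * self.margin)
        return F.cross_entropy(logits, labels)

# Usage with a hypothetical CNN backbone producing 128-d embeddings:
loss_fn = CosineSoftmaxLoss(embedding_dim=128, num_classes=10)
emb = torch.randn(32, 128)              # stand-in for backbone output
labels = torch.randint(0, 10, (32,))
loss = loss_fn(emb, labels)
loss.backward()
```

Because both embeddings and class weights are normalized, the logits depend only on angular relationships; the scale factor controls the sharpness of the softmax distribution, while the margin widens the decision boundary, matching the abstract's stated goals of intra-class compactness and inter-class separability.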