Interpretable Salary Prediction Algorithm Based on Set Utility Marginal Contribution Learning
-
摘要:
知识技能对薪酬的影响作用可视为一种多变量影响下高维元素集合的效用建模问题. 深度神经网络为解决复杂问题提供了新的机遇,但针对知识导向的细粒度薪酬预测问题,仍缺乏能够对复杂变量影响下的集合效用进行准确、可解释建模的神经网络结构. 为此,提出一种基于边际贡献的增量式集合效用网络 (marginal contribution-based incremental set utility network,MCISUN)来拟合元素加入时的效用增量,从而灵活且可解释地建模集合效用. 区别于以往基于池化层的排列不变性建模算法,MCISUN构建顺序敏感的中间结果,利用集合的排列不变性实现数据增强,有效提升模型数据效率及泛化性. 最后,大规模真实薪酬数据上的实验结果表明所提模型在基于技能的薪酬预测任务上比最先进的(state-of-the-art, SOTA)模型效果提升超过30%. 同时,定性实验证明模型能够为技能设置合理的贡献值且发现技能间的关联.
Abstract: Accurately quantifying the relationship between skills and salary is essential for setting reasonable job salaries and promoting talent attraction and retention. This relationship is complex, however, because it requires modeling set utility in a high-dimensional space with a massive number of possible elements. Deep neural networks offer a new solution for complex fitting problems, but for skill-based fine-grained salary prediction, interpretable neural networks that can effectively model set utility under the influence of complex variables are still lacking. To address this issue, we propose a marginal contribution-based incremental set utility network (MCISUN). MCISUN models the marginal contribution of each element as it is added to the set, so that the set utility is obtained naturally in a flexible and interpretable way. In particular, rather than relying on pooling structures to ensure permutation invariance, MCISUN constructs order-sensitive intermediate results through recurrent attention neural networks and exploits the permutation invariance of sets for data augmentation, thus improving the model's robustness. Extensive experiments on a real-world large-scale salary dataset show that MCISUN outperforms state-of-the-art models by over 30% on skill-based salary prediction. Qualitative experiments show that our model assigns reasonable contribution values to skills and captures relationships between them.
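The abstract's two core ideas, decomposing a set's utility into per-element marginal contributions and using the permutation invariance of sets for data augmentation, can be illustrated with a minimal sketch. The additive contribution function below (base skill values plus pairwise synergies) is a hypothetical stand-in for the paper's recurrent-attention network; all names and values are illustrative, not the authors' implementation.

```python
from itertools import permutations
import random

# Toy marginal-contribution view of set utility, mirroring the decomposition
#   u(S) = sum_i delta(x_i | {x_1, ..., x_{i-1}}).
# BASE and SYNERGY are made-up numbers standing in for a learned model.
BASE = {"Python": 3.0, "ML": 2.5, "SQL": 1.0}
SYNERGY = {frozenset({"Python", "ML"}): 1.5}  # skills worth more together

def marginal_contribution(skill, prefix):
    """Utility increment when `skill` joins a set that already holds `prefix`."""
    gain = BASE[skill]
    for seen in prefix:
        gain += SYNERGY.get(frozenset({skill, seen}), 0.0)
    return gain

def set_utility(ordered_skills):
    """Accumulate marginal contributions along one insertion order."""
    total, prefix, contributions = 0.0, [], []
    for s in ordered_skills:
        delta = marginal_contribution(s, prefix)
        contributions.append((s, delta))
        total += delta
        prefix.append(s)
    return total, contributions

def permutation_augment(skills, k, seed=0):
    """Permutation-invariance data augmentation: sample k insertion orders of
    the same skill set; each order is a training sample with the same target."""
    rng = random.Random(seed)
    samples = []
    for _ in range(k):
        order = skills[:]
        rng.shuffle(order)
        samples.append(order)
    return samples

skills = ["Python", "ML", "SQL"]
totals = {set_utility(list(o))[0] for o in permutations(skills)}
print(totals)  # one distinct total across all 6 orders: {8.0}
```

In this toy the per-order intermediate results differ (the contribution assigned to "ML" depends on whether "Python" came first), but every order sums to the same set utility, which is exactly why shuffled orders can serve as extra training samples for an order-sensitive network.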
-
Keywords:
- set utility modeling
- marginal contribution
- salary prediction
- neural network
- interpretability
-
表 1 超参数设置
Table 1 Hyper-Parameter Configuration
Parameter                   Value
Embedding size              128
LSTM hidden units           1024
MLP layers                  3
MLP hidden units            128
Attention heads             16
Attention layer dimension   64
表 2 IT数据集上薪酬预测误差
Table 2 Salary Prediction Errors on IT Dataset
Model                    Lower-bound salary          Upper-bound salary
                         RMSE          MAE           RMSE          MAE
SVM                      5.675±0.215   4.120±0.028   10.404±1.202  7.177±0.038
LR                       5.386±0.021   4.033±0.013   9.545±0.049   7.139±0.028
GBDT                     4.878±0.023   3.651±0.017   8.763±0.032   6.568±0.027
DNN                      6.498±0.031   4.999±0.036   11.801±0.021  9.460±0.020
HSBMF                    5.291±0.017   3.939±0.015   9.188±0.036   6.800±0.028
TextCNN                  4.999±0.028   3.712±0.018   8.800±0.057   6.554±0.057
HAN                      4.761±0.043   3.497±0.054   8.333±0.069   6.111±0.092
Transformer-XL           5.459±0.016   4.097±0.045   9.663±0.061   7.278±0.074
BERT                     4.592±0.010   3.331±0.011   8.110±0.136   5.841±0.137
RoBERTa                  4.642±0.014   3.377±0.011   8.400±0.076   6.122±0.058
XLNet                    4.566±0.015   3.333±0.011   8.254±0.060   5.995±0.044
SSCN                     4.435±0.061   3.244±0.048   7.686±0.086   5.627±0.060
MCISUN (DeepSet) (ours)  3.439±0.018   2.413±0.015   5.909±0.036   4.193±0.028
MCISUN (w/o l) (ours)    4.336±0.096   3.187±0.092   7.172±0.070   5.273±0.057
MCISUN (w/o a) (ours)    3.243±0.015   2.148±0.014   5.640±0.028   3.778±0.019
MCISUN (ours)            3.169±0.017   2.118±0.012   5.505±0.025   3.718±0.022
Note: bold indicates the lowest error.
表 3 Designer数据集上薪酬预测误差
Table 3 Salary Prediction Errors on Designer Dataset
Model                    Lower-bound salary          Upper-bound salary
                         RMSE          MAE           RMSE          MAE
SVM                      4.271±0.067   3.137±0.030   7.361±0.101   5.441±0.050
LR                       4.183±0.053   3.089±0.029   7.343±0.131   5.436±0.075
GBDT                     3.534±0.066   2.585±0.035   6.295±0.110   4.657±0.068
DNN                      5.181±0.039   4.117±0.039   9.209±0.107   7.307±0.065
HSBMF                    4.587±0.086   3.347±0.036   7.874±0.095   5.814±0.074
TextCNN                  4.282±0.148   3.151±0.064   8.800±0.057   5.542±0.107
HAN                      4.032±0.123   2.983±0.120   7.126±0.189   5.308±0.139
Transformer-XL           5.075±0.124   3.909±0.132   9.141±0.379   7.151±0.336
BERT                     3.797±0.044   2.807±0.027   10.646±0.109  8.343±0.131
RoBERTa                  4.272±0.142   3.136±0.075   9.187±0.389   7.522±0.622
XLNet                    3.852±0.069   2.864±0.037   4.498±0.009   3.312±0.014
SSCN                     3.316±0.036   2.408±0.025   5.887±0.139   4.294±0.107
MCISUN (DeepSet) (ours)  2.604±0.031   1.765±0.030   4.473±0.066   3.110±0.056
MCISUN (w/o l) (ours)    2.939±0.025   2.047±0.024   5.477±0.064   3.738±0.037
MCISUN (w/o a) (ours)    2.657±0.024   1.791±0.017   4.353±0.020   2.940±0.017
MCISUN (ours)            2.521±0.020   1.639±0.012   4.170±0.025   2.784±0.019
Note: bold indicates the lowest error.
表 4 对不同编程技能影响最大的前置技能
Table 4 Prerequisite Skills with the Greatest Impact on Different Programming Skills
Programming skill   Top-5 prerequisite skills
Python              R, data analysis, mathematics, data warehousing, statistics
C++                 iOS, Android, client development, mathematics, C
Java                project management, Android, recommender systems, iOS, large-scale software
表 5 案例岗位内容
Table 5 A Sample Job Post Content
Item            Detail
Posting time    October 2018
Salary range    15,000–30,000 yuan
Location        Beijing
Skill set       Python, programming, compilation, C, data structures, machine learning, Java, NLP, algorithms, C++
-
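Tables 2 and 3 report each error metric as mean±standard deviation over repeated runs. As a brief reminder of how such entries are produced, the sketch below computes RMSE and MAE and formats the table-style string; the run predictions are made-up numbers, not the paper's data.

```python
import math
import statistics

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted salaries."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error between true and predicted salaries."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_std(values):
    """Format repeated-run results the way the tables do: mean±std."""
    return f"{statistics.mean(values):.3f}±{statistics.stdev(values):.3f}"

# Hypothetical lower-bound salary predictions from 3 independent runs.
y_true = [10.0, 15.0, 20.0, 12.0]
runs = [
    [9.5, 15.5, 19.0, 12.5],
    [10.5, 14.0, 21.0, 11.5],
    [9.0, 16.0, 20.5, 12.0],
]
rmse_entry = mean_std([rmse(y_true, r) for r in runs])
mae_entry = mean_std([mae(y_true, r) for r in runs])
print(rmse_entry, mae_entry)  # one table cell per metric
```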
[1] Hamlen K R, Hamlen W A. Faculty salary as a predictor of student outgoing salaries from MBA programs[J]. Journal of Education for Business, 2016, 91(1): 38−44 doi: 10.1080/08832323.2015.1110552
[2] Khongchai P, Songmuang P. Implement of salary prediction system to improve student motivation using data mining technique[C/OL]//Proc of the 11th Int Conf on Knowledge, Information and Creativity Support Systems (KICSS). Piscataway, NJ: IEEE, 2016[2023-06-25].https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7951419
[3] Khongchai P, Songmuang P. Random forest for salary prediction system to improve students’ motivation[C]//Proc of the 12th Int Conf on Signal-Image Technology and Internet-Based Systems (SITIS). Piscataway, NJ: IEEE, 2016: 637−642
[4] Bansal U, Narang A, Sachdeva A, et al. Empirical analysis of regression techniques by house price and salary prediction[C/OL]// Proc of the IOP Conf Series: Materials Science and Engineering. 2021[2023-06-25].https://iopscience.iop.org/article/10.1088/1757-899X/1022/1/012110/pdf
[5] 马新宇,范意兴,郭嘉丰,等. 关于短文本匹配的泛化性和迁移性的研究分析[J]. 计算机研究与发展,2022,59(1):118−126 Ma Xinyu, Fan Yixing, Guo Jiafeng, et al. An empirical investigation of generalization and transfer in short text matching[J]. Journal of Computer Research and Development, 2022, 59(1): 118−126 (in Chinese)
[6] 潘博,张青川,于重重,等. Doc2vec 在薪水预测中的应用研究[J]. 计算机应用研究,2018,35(1):155−157 doi: 10.3969/j.issn.1001-3695.2018.01.032 Pan Bo, Zhang Qingchuan, Yu Chongchong, et al. Research on the application of Doc2vec in salary forecast[J]. Application Research of Computers, 2018, 35(1): 155−157 (in Chinese) doi: 10.3969/j.issn.1001-3695.2018.01.032
[7] More A, Naik A, Rathod S. Predict-nation skills based salary prediction for freshers[C/OL]//Proc of the 4th Int Conf on Advances in Science & Technology (ICAST2021). Berlin: Springer, 2021[2023-06-25].https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3866758
[8] Martín I, Mariello A, Battiti R, et al. Salary prediction in the IT job market with few high-dimensional samples: A Spanish case study[J]. International Journal of Computational Intelligence Systems, 2018, 11(1): 1192−1209 doi: 10.2991/ijcis.11.1.90
[9] Sun Ying, Zhuang Fuzhen, Zhu Hengshu, et al. Market-oriented job skill valuation with cooperative composition neural network[J]. Nature Communications, 2021, 12(1): 1−12 doi: 10.1038/s41467-020-20314-w
[10] Zaheer M, Kottur S, Ravanbakhsh S, et al. Deep sets[C]//Advances in Neural Information Processing Systems 30. Cambridge, MA: MIT, 2017[2023-06-25].https://proceedings.neurips.cc/paper/2017/file/f22e4747da1aa27e363d86d40ff442fe-Paper.pdf
[11] Vinyals O, Bengio S, Kudlur M. Order matters: Sequence to sequence for sets[J]. arXiv preprint, arXiv: 1511.06391, 2015
[12] Lee J, Lee Y, Kim J, et al. Set Transformer: A framework for attention-based permutation-invariant neural networks[C]// Proc of the 36th Int Conf on Machine Learning. New York: ACM, 2019: 3744−3753
[13] Zhang Yan, Hare J, Prügel-Bennett A. FSPool: Learning set representations with featurewise sort pooling[C/OL]//Proc of the 8th Int Conf on Learning Representations. 2020[2023-06-25].https://openreview.net/forum?id=HJgBA2VYwH
[14] Murphy R L, Srinivasan B, Rao V, et al. Janossy Pooling: Learning deep permutation-invariant functions for variable-size inputs[C/OL]//Proc of the 8th Int Conf on Learning Representations. 2020[2023-06-25].https://openreview.net/forum?id=BJluy2RcFm
[15] Yang Bo, Wang Sen, Markham A, et al. Robust attentional aggregation of deep feature sets for multi-view 3D reconstruction[J]. International Journal of Computer Vision, 2020, 128(1): 53−73
[16] Saito Y, Nakamura T, Hachiya H, et al. Exchangeable deep neural networks for set-to-set matching and learning[C]//Proc of the 17th European Conf on Computer Vision. Berlin: Springer, 2020: 626−646
[17] Zhang Yan, Hare J, Prügel-Bennett A. Learning representations of sets through optimized permutations[C/OL]//Proc of the 7th Int Conf on Learning Representations. 2019[2023-06-25].https://openreview.net/forum?id=HJMCcjAcYX
[18] Blankmeyer E, LeSage J P, Stutzman J R, et al. Peer-group dependence in salary benchmarking: A statistical model[J]. Managerial and Decision Economics, 2011, 32(2): 91−104
[19] Kenthapadi K, Ambler S, Zhang Liang, et al. Bringing salary transparency to the world: Computing robust compensation insights via LinkedIn Salary[C]//Proc of the 26th ACM on Conf on Information and Knowledge Management. New York: ACM, 2017: 447−455
[20] 张浩宇. 基于文本相似度与协同过滤的岗位薪资预测[D]. 武汉:中南财经政法大学,2018 Zhang Haoyu. Job salary prediction based on text similarity and collaborative filtering[D]. Wuhan: Zhongnan University of Economics and Law, 2018 (in Chinese)
[21] Meng Qingxin, Xiao Keli, Shen Dazhong, et al. Fine-grained job salary benchmarking with a nonparametric Dirichlet process–based latent factor model[J]. INFORMS Journal on Computing, 2022, 34(5): 2443−2463 doi: 10.1287/ijoc.2022.1182
[22] Meng Qingxin, Zhu Hengshu, Xiao Keli, et al. Intelligent salary benchmarking for talent recruitment: A holistic matrix factorization approach[C]//Proc of the 2018 IEEE Int Conf on Data Mining (ICDM). Piscataway, NJ: IEEE, 2018: 337−346
[23] Wang Zhongsheng, Sugaya S, Nguyen D P T. Salary prediction using bidirectional-GRU-CNN model[C/OL]//Proc of the 25th Annual Meeting of the Association for Natural Language Processing. 2019[2023-06-25].https://www.anlp.jp/proceedings/annual_meeting/2019/pdf_dir/F3-1.pdf
[24] Guo Huifeng, Tang Ruiming, Ye Yunming, et al. DeepFM: A factorization-machine based neural network for CTR prediction [C]//Proc of the 26th Int Joint Conf on Artificial Intelligence. San Francisco, CA: Morgan Kaufmann, 2017: 1725−1731
[25] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735−1780 doi: 10.1162/neco.1997.9.8.1735
[26] Sun Ying, Zhuang Fuzhen, Zhu Hengshu, et al. Job posting data[CP/OL]. 2021[2023-06-25].https://figshare.com/articles/dataset/Job_Posting_Data/14060498/
[27] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[C/OL]//Proc of the 30th Int Conf on Artificial Intelligence and Statistics. New York: ACM, 2010[2023-06-25]. http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf
[28] Kingma D P, Ba J. Adam: A method for stochastic optimization[C/OL]//Proc of the 3rd Int Conf on Learning Representations (Poster). 2015[2023-06-25].https://iclr.cc/archive/www/doku.php%3Fid=iclr2015:accepted-main.html
[29] Xu Bing, Wang Naiyan, Chen Tianqi, et al. Empirical evaluation of rectified activations in convolutional network[J]. arXiv preprint, arXiv: 1505.00853, 2015
[30] Noble W S. What is a support vector machine?[J]. Nature Biotechnology, 2006, 24(12): 1565−1567 doi: 10.1038/nbt1206-1565
[31] Montgomery D C, Peck E A, Vining G G. Introduction to Linear Regression Analysis[M]. Hoboken: John Wiley & Sons, 2021
[32] Mason L, Baxter J, Bartlett P, et al. Boosting algorithms as gradient descent[C/OL]//Advances in Neural Information Processing Systems 12. Cambridge, MA: MIT, 1999[2023-06-25].https://proceedings.neurips.cc/paper/1999/file/96a93ba89a5b5c6c226e49b88973f46e-Paper.pdf
[33] Gardner M W, Dorling S R. Artificial neural networks (the multilayer perceptron)—A review of applications in the atmospheric sciences[J]. Atmospheric Environment, 1998, 32(14/15): 2627−2636
[34] Chen Yahui. Convolutional neural network for sentence classification[D]. Waterloo: University of Waterloo, 2015
[35] Zhang Xiang, Zhao Junbo, LeCun Y. Character-level convolutional networks for text classification[C/OL]//Advances in Neural Information Processing Systems 28. Cambridge, MA: MIT, 2015[2023-06-25]. https://proceedings.neurips.cc/paper/2015/file/250cf8b51c773f3f8dc8b4be867a9a02-Paper.pdf
[36] Yang Zichao, Yang Diyi, Dyer C, et al. Hierarchical attention networks for document classification[C]//Proc of the 15th North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2016: 1480−1489
[37] Dai Zihang, Yang Zhilin, Yang Yiming, et al. Transformer-XL: Attentive language models beyond a fixed-length context[C/OL]//Proc of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019[2023-06-25].https://arxiv.org/pdf/1901.02860.pdf%3Ffbclid%3DIwAR3nwzQA7VyD36J6u8nEOatG0CeW4FwEU_upvvrgXSES1f0Kd-
[38] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]//Proc of the 17th Annual Conf of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2019: 4171−4186
[39] Liu Yinhan, Ott M, Goyal N, et al. RoBERTa: A robustly optimized BERT pretraining approach[J]. arXiv preprint, arXiv: 1907.11692, 2019
[40] Yang Zhilin, Dai Zihang, Yang Yiming, et al. XLNet: Generalized autoregressive pretraining for language understanding[C/OL]//Advances in Neural Information Processing Systems 32. Cambridge, MA: MIT, 2019[2023-06-25].https://proceedings.neurips.cc/paper/2019/file/dc6a7e655d7e5840e66733e9ee67cc69-Paper.pdf
[41] Zhang Yan, Hare J, Prugel-Bennett A. Deep set prediction networks[C/OL]//Advances in Neural Information Processing Systems 32. Cambridge, MA: MIT, 2019 [2024-03-29]. https://proceedings.neurips.cc/paper_files/paper/2019/file/6e79ed05baec2754e25b4eac73a332d2-Paper.pdf
[42] Botchkarev A. A new typology design of performance metrics to measure errors in machine learning regression algorithms[J]. Interdisciplinary Journal of Information, Knowledge, and Management, 2019, 14: 45−79
[43] Blum A, Kalai A, Langford J. Beating the hold-out: Bounds for k-fold and progressive cross-validation[C]//Proc of the 12th Annual Conf on Computational Learning Theory. New York: ACM, 1999: 203−208
-