Citation: | Yi Xiaoyuan, Xie Xing. Unpacking the Ethical Value Alignment in Big Models[J]. Journal of Computer Research and Development, 2023, 60(9): 1926-1945. DOI: 10.7544/issn1000-1239.202330553 |
We explore the emerging challenges presented by artificial intelligence (AI) development in the era of big models, with a focus on large language model (LLM) and ethical value alignment. Big models have greatly advanced AI’s ability to understand, generate, and manipulate information and content, enabling numerous applications. However, as these models become increasingly integrated into everyday life, their inherent ethical values and potential biases pose unforeseen risks to society. We provide an overview of the risks and challenges associated with big models, survey existing AI ethics guidelines, and examine the ethical implications arising from the limitations of these models. Taking a normative ethics perspective, we propose a reassessment of recent normative guidelines, highlighting the importance of collaborative efforts in academia to establish a unified and universal AI ethics framework. Furthermore, we investigate the ethical inclinations of current mainstream large language models using moral foundation theory, analyze existing big model alignment algorithms, and outline the unique challenges encountered in aligning moral values within them. To address these challenges, we introduce a novel conceptual paradigm for ethically aligning the values of big models and discuss promising research directions for alignment criteria, evaluation and method, representing an initial step towards the interdisciplinary construction of a morally aligned general artificial intelligence.
[1] |
Bommasani R, Hudson D A, Adeli E, et al. On the opportunities and risks of foundation models[J]. arXiv preprint, arXiv: 2108.07258, 2021
|
[2] |
Brown T, Mann B, Ryder N, et al. Language models are few-shot learners[C]//Advances in Neural Information Processing Systems. San Diego: Neural Information Processing Systems Foundation Inc, 2020, 33: 1877−1901
|
[3] |
Ouyang L, Wu J, Jiang X, et al. Training language models to follow instructions with human feedback[C]//Advances in Neural Information Processing Systems. San Diego: Neural Information Processing Systems Foundation Inc, 2022, 35: 27730−27744
|
[4] |
OpenAI. GPT-4 technical report[J]. arXiv preprint, arXiv: 2303.08774, 2023
|
[5] |
Narang S, Chowdhery A. Pathways language model (PALM): Scaling to 540 billion parameters for breakthrough performance[EB/OL]. (2022-04-04) [2023-06-30].https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
|
[6] |
Aydın Ö. Google bard generated literature review: Metaverse[J]. Journal of AI, 2023, 7(1): 1−14
|
[7] |
Touvron H, Lavril T, Izacard G, et al. LLaMA: Open and efficient foundation language models[J]. arXiv preprint, arXiv: 2302.13971, 2023
|
[8] |
Ramesh A, Dhariwal P, Nichol A, et al. Hierarchical text-conditional image generation with clip latents[J]. arXiv preprint, arXiv: 2204.06125, 2022
|
[9] |
Driess D, Xia F, Sajjadi M S, et al. Palm-e: An embodied multimodal language model[J]. arXiv preprint, arXiv: 2303.03378, 2023
|
[10] |
卢志武,金琴,宋睿华,等. 悟道·文澜:超大规模多模态预训练模型带来了什么?[J]. 中兴通讯技术,2022,28(2):25−32 doi: 10.12142/ZTETJ.202204006
Lu Zhiwu, Jin Qin, Song Ruihua, et al. Wudao: wenlan: What do very-large multimodal pre-training models bring?[J]. ZTE Communications, 2022, 28(2): 25−32 (in Chinese) doi: 10.12142/ZTETJ.202204006
|
[11] |
Pauls A, Klein D. Faster and smaller n-gram language models[C]//Proc of the 49th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2011: 258−267
|
[12] |
Cho K, Van Merriënboer B, Gu̇lçehre Ç, et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation[C]//Proc of the 2014 Conf on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA: ACL, 2014: 1724−1734
|
[13] |
Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint, arXiv: 1810.04805, 2018
|
[14] |
Kaplan J, McCandlish S, Henighan T, et al. Scaling laws for neural language models[J]. arXiv preprint, arXiv: 2001.08361, 2020
|
[15] |
Wei J, Tay Y, Bommasani R, et al. Emergent abilities of large language models[J]. arXiv preprint, arXiv: 2206.07682, 2022
|
[16] |
Hoffmann J, Borgeaud S, Mensch A, et al. Training compute-optimal large language models[J]. arXiv preprint, arXiv: 2203.15556, 2022
|
[17] |
Chowdhery A, Narang S, Devlin J, et al. Palm: Scaling language modeling with pathways[J]. arXiv preprint, arXiv: 2204.02311, 2022
|
[18] |
Wang H, Ma S, Dong L, et al. DeepNet: Scaling transformers to 1, 000 layers[J]. arXiv preprint, arXiv: 2203.00555, 2022
|
[19] |
Radford A, Narasimhan K, Salimans T, et al. Improving language understanding by generative pre-training[EB/OL]. (2018-06-11) [2023-06-30].https://openai.com/research/language-unsupervised
|
[20] |
Xie S M, Raghunathan A, Liang P, et al. An explanation of in-context learning as implicit Bayesian inference[J]. arXiv preprint, arXiv: 2111.02080, 2021
|
[21] |
Gabriel I. Artificial intelligence, values, and alignment[J]. Minds and Machines, 2020, 30(3): 411−437 doi: 10.1007/s11023-020-09539-2
|
[22] |
Min S, Lyu X, Holtzman A, et al. Rethinking the role of demonstrations: What makes in-context Learning work?[J]. arXiv preprint, arXiv: 2202.12837, 2022
|
[23] |
Chiang W, Li Z, Lin Z, et al. Vicuna: An open-source chatbot impressing GPT-4 with 90%* chatGPT quality[EB/OL]. (2023-05-30)[2023-06-30]. https: //vicuna. lmsys. org/
|
[24] |
Liang Y, Wu C, Song T, et al. Taskmatrix. AI: Completing tasks by connecting foundation models with millions of APIs[J]. arXiv preprint, arXiv: 2303.16434, 2023
|
[25] |
Eloundou T, Manning S, Mishkin P, et al. GPTs are GPTs: An early look at the labor market impact potential of large language models[J]. arXiv preprint, arXiv: 2303.10130, 2023
|
[26] |
Anderson S L. Asimov’s “three laws of robotics” and machine metaethics[J/OL]. AI Soc. , 2008, 22(4): 477−493
|
[27] |
Bar-El H, Choukri H, Naccache D, et al. The sorcerer’s apprentice guide to fault attacks[J]. Proceedings of the IEEE, 2006, 94(2): 370−382 doi: 10.1109/JPROC.2005.862424
|
[28] |
McKenzie I R, Lyzhov A, Pieler M, et al. Inverse scaling: When bigger isn’t better[J]. arXiv preprint, arXiv: 2306.09479. 2023
|
[29] |
滕妍,王国豫,王迎春. 通用模型的伦理与治理:挑战及对策[J]. 中国科学院院刊,2022,37(9):1290−1299
Teng Yan, Wang Guoyu, Wang Yingchun. Ethics and governance of general models: Challenges and countermeasures[J]. Bulletin of Chinese Academy of Sciences, 2022, 37(9): 1290−1299 (in Chinese)
|
[30] |
Yang Z, Yi X, Li P, et al. Unified detoxifying and debiasing in language generation via inference-time adaptive optimization[J]. arXiv preprint, arXiv: 2210.04492, 2022
|
[31] |
Sheng E, Chang K W, Natarajan P, et al. Societal biases in language generation: Progress and challenges[C]//Proc of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th Int Joint Conf on Natural Language Processing (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2021: 4275−4293
|
[32] |
Welbl J, Glaese A, Uesato J, et al. Challenges in detoxifying language models[J]. arXiv preprint, arXiv: 2109.07445, 2021
|
[33] |
Carlini N, Tramer F, Wallace E, et al. Extracting training data from large language models[C]//Proc of the 30th USENIX Security Symp (USENIX Security’21). Berkeley, CA: USENIX Association, 2021: 2633−2650
|
[34] |
Vyas N, Kakade S, Barak B. Provable copyright protection for generative models[J]. arXiv preprint, arXiv: 2302.10870, 2023
|
[35] |
Holtzman A, Buys J, Du L, et al. The curious case of neural text degeneration[J]. arXiv preprint, arXiv: 1904.09751, 2019
|
[36] |
Ji Z, Lee N, Frieske R, et al. Survey of hallucination in natural language generation[J]. ACM Computing Surveys, 2023, 55(12): 1−38
|
[37] |
Weidinger L, Mellor J, Rauh M, et al. , Ethical and social risks of harm from language models[J]. arXiv preprint, arXiv: 2112.04359, 2021
|
[38] |
吴砥,李环,陈旭. 人工智能通用大模型教育应用影响探析[J]. 开放教育研究,2023,29(2):19−25
Wu Di, Li Huan, Chen Xu. Analysis on the influence of artificial intelligence generic large model on education application[J]. Open Education Research, 2023, 29(2): 19−25 (in Chinese)
|
[39] |
Zarifhonarvar A. Economics of ChatGPT: A labor market view on the occupational impact of artificial intelligence[J]. Available at SSRN, 4350, 925: Article No.2023
|
[40] |
Dergaa I, Chamari K, Zmijewski P, et al. From human writing to artificial intelligence generated text: Examining the prospects and potential threats of ChatGPT in academic writing[J]. Biology of Sport, 2023, 40(2): 615−622 doi: 10.5114/biolsport.2023.125623
|
[41] |
Ferrara E. Should ChatGPT be biased? challenges and risks of bias in large language models[J]. arXiv preprint, arXiv: 2304.03738, 2023
|
[42] |
Parthemore J, Whitby B. What makes any agent a moral agent? Reflections on machine consciousness and moral agency[J]. International Journal of machine consciousness, 2013, 5(2): 105−129 doi: 10.1142/S1793843013500017
|
[43] |
Brożek B, Janik B. Can artificial intelligences be moral agents?[J]. New Ideas in Psychology, 2019, 54: 101−106 doi: 10.1016/j.newideapsych.2018.12.002
|
[44] |
Sullins J P. When is a robot a moral agent?[J]. International Review of Information Ethics, 2006, 6(12): 23-30
|
[45] |
Cervantes J A, López S, Rodríguez L F, et al. Artificial moral agents: A survey of the current status[J]. Science and Engineering Ethics, 2020, 26: 501−532 doi: 10.1007/s11948-019-00151-x
|
[46] |
Moor J. Four kinds of ethical robots[J]. Philosophy Now, 2009, 72: 12−14
|
[47] |
Schramowski P, Turan C, Andersen N, et al. Large pre-trained language models contain human-like biases of what is right and wrong to do[J]. Nature Machine Intelligence, 2022, 4(3): 258−268 doi: 10.1038/s42256-022-00458-8
|
[48] |
Simmons G. Moral mimicry: Large language models produce moral rationalizations tailored to political identity[J]. arXiv preprint, arXiv: 2209.12106, 2023
|
[49] |
Zhao W, Zhao Y, Lu X, et al. Is ChatGPT equipped with emotional dialogue capabilities?[J] arXiv preprint, arXiv: 2304.09582, 2023
|
[50] |
Rozado D. The political biases of ChatGPT[J]. Social Sciences, 2023, 12(3): 1−8
|
[51] |
Moghaddam S R, Honey C J. Boosting theory-of-mind performance in large language models via prompting[J]. arXiv preprint, arXiv: 2304.11490, 2023
|
[52] |
United Nations Educational, Scientific and Cultural Organization. Recommendation on the ethics of artificial intelligence[Z]. UNESCO France, 2021
|
[53] |
Holdren J P. Memorandum for the heads of executive departments and agencies: Increasing access to the results of federally funded scientific research [EB/OL]. (2022-08-25) [2023-06-30].https://www.whitehouse.gov/wp-content/uploads/2022/08/08−2022-OSTP-Public-access-Memo.pdf
|
[54] |
国家新一代人工智能治理专业委员会. 新一代人工智能伦理规范[EB/OL]. (2021-09-25) [2023-08-07].https://www.most.gov.cn/kjbgz/202109/t20210926_177063.html
The National New Generation Artificial Intelligence Governance Specialist Committee. Ethical Norms for New Generation Artificial Intelligence [EB/OL]. (2021-09-25) [2023-08-07].https://www.most.gov.cn/kjbgz/202109/t20210926_177063.html (in Chinese)
|
[55] |
Smuha N A. The EU approach to ethics guidelines for trustworthy artificial intelligence[J]. Computer Law Review International, 2019, 20(4): 97−106 doi: 10.9785/cri-2019-200402
|
[56] |
World Economic Forum. How to prevent discriminatory outcomes in machine learning [EB/OL]. (2018-03-12) [2023-06-30].https://www3.weforum.org/docs/WEF_40065_White_Paper_How_to_Prevent_Discriminatory_Outcomes_in_Machine_Learning.pdf
|
[57] |
Garbowski M. A critical analysis of the Asilomar AI principles[J]. Scientific Papers of Silesian University of Technology, 2018, 115: 45−55
|
[58] |
Fjeld J, Achten N, Hilligoss H, et al. Principled artificial intelligence: Mapping consensus in ethical and rights-based approaches to principles for AI [EB/OL]. (2022-02-10) [2023-06-30].https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3518482
|
[59] |
Floridi L, Cowls J. A unified framework of five principles for AI in society[G]//Ethics, Governance, and Policies in Artificial Intelligence. Berlin: Springer, 2021: 5−17
|
[60] |
Jobin A, Ienca M, Vayena E. The global landscape of AI ethics guidelines[J]. Nature Machine Intelligence, 2019, 1(9): 389−399 doi: 10.1038/s42256-019-0088-2
|
[61] |
Kagan S. Normative Ethics[M]. Oxfordshire: Routledge, 2018
|
[62] |
Paton H J. The Categorical Imperative: A Study in Kant’s Moral Philosophy: Vol 1023[M]. Philadelphia, PA: University of Pennsylvania Press, 1971
|
[63] |
Gao C, Lan X, Lu Z, et al. S3: Social-network simulation system with large language model-empowered agents[J]. arXiv preprint, arXiv: 2307.14984, 2023
|
[64] |
Ziems C, Held W, Shaikh O, et al. Can large language models transform computational social science?[J]. arXiv preprint, arXiv: 2305.03514, 2023
|
[65] |
Ganguli D, Lovitt L, Kernion J, et al. Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned[J]. arXiv preprint, arXiv: 2209.07858, 2022
|
[66] |
Wang Y, Kordi Y, Mishra S, et al. Self-instruct: Aligning language model with self-generated instructions[J]. arXiv preprint, arXiv: 2212.10560, 2022
|
[67] |
Bubeck S, Chandrasekaran V, Eldan R, et al. Sparks of artificial general intelligence: Early experiments with GPT-4[J]. arXiv preprint, arXiv: 2303.12712, 2023
|
[68] |
Foote A, Nanda N, Kran E, et al. Neuron to graph: Interpreting language model neurons at scale[J]. arXiv preprint, arXiv: 2305.19911, 2023
|
[69] |
Singh C, Hsu A R, Antonello R, et al. Explaining black box text modules in natural language with language models[J]. arXiv preprint, arXiv: 2305.09863, 2023
|
[70] |
Schwartz S H. Basic human values: Theory, measurement, and applications[J]. Revue française de sociologie, 2007, 47(4): 929−968
|
[71] |
Liñán F, Fernandez-Serrano J. National culture, entrepreneurship and economic development: Different patterns across the European Union[J]. Small Business Economics, 2014, 42: 685−701 doi: 10.1007/s11187-013-9520-x
|
[72] |
Graham J, Haidt J, Koleva S, et al. Moral foundations theory: The Pragmatic Validity of Moral Pluralism[M]//Advances in experimental social psychology: Vol 47. Amsterdam: Elsevier, 2013
|
[73] |
Zapko-Willmes A, Schwartz S H, Richter J, et al. Basic value orientations and moral foundations: Convergent or discriminant constructs?[J]. Journal of Research in Personality, 2021, 92: 104099 doi: 10.1016/j.jrp.2021.104099
|
[74] |
Kivikangas J M, Fernández-Castilla B, Järvelä S, et al. Moral foundations and political orientation: Systematic review and meta-analysis[J]. Psychological Bulletin, 2021, 147(1): 55−94 doi: 10.1037/bul0000308
|
[75] |
Graham J, Nosek B A, Haidt J, et al. Mapping the moral domain[J]. Journal of Personality and Social Psychology, 2011, 101(2): 366−385 doi: 10.1037/a0021847
|
[76] |
Bai Y, Jones A, Ndousse K, et al. Training a helpful and harmless assistant with reinforcement learning from human feedback[J]. arXiv preprint, arXiv: 2204.05862, 2023
|
[77] |
Ganguli D, Askell A, Schiefer N, et al. The capacity for moral self-correction in large language models[J] arXiv preprint, arXiv: 2302.07459, 2023
|
[78] |
Russell S J. Artificial Intelligence: A Modern Approach[M]. London: Pearson Education, Inc. , 2010
|
[79] |
Wiener N. Some moral and technical consequences of automation: As machines learn they may develop unforeseen strategies at rates that baffle their programmers.[J]. Science, 1960, 131(3410): 1355−1358 doi: 10.1126/science.131.3410.1355
|
[80] |
Ngo R. The alignment problem from a deep learning perspective[J]. arXiv preprint, arXiv: 2209.00626, 2022
|
[81] |
Wolf Y, Wies N, Levine Y, et al. Fundamental limitations of alignment in large language models[J]. arXiv preprint, arXiv: 2304.11082, 2023
|
[82] |
Brown D S, Schneider J, Dragan A, et al. Value alignment verification[C]//Proc of Int Conf on Machine Learning. Brookline, MA: PMLR, 2021: 1105−1115
|
[83] |
Sheng E, Chang K W, Natarajan P, et al. Towards controllable biases in language generation[C]//Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, PA: ACL, 2020: 3239−3254
|
[84] |
Cheng P, Hao W, Yuan S, et al. Fairfil: Contrastive neural debiasing method for pretrained text encoders[J]. arXiv preprint, arXiv: 2103.06413, 2021
|
[85] |
Berg H, Hall S, Bhalgat Y, et al. A prompt array keeps the bias away: Debiasing vision-language models with adversarial learning[C]//Proc of the 2nd Conf of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th Int Joint Conf on Natural Language Processing. Stroudsburg, PA: ACL, 2022: 806−822
|
[86] |
Qian J, Dong L, Shen Y, et al. Controllable natural language generation with contrastive prefixes[C]//Findings of the Association for Computational Linguistics: ACL 2022. Stroudsburg, PA: ACL, 2022: 2912−2924
|
[87] |
Dathathri S, Madotto A, Lan J, et al. Plug and play language models: A simple approach to controlled text generation[C]//Proc of Int Conf on Learning Representations. New Orleans, LA: OpenReview, 2019: Article No. 351
|
[88] |
Yang K, Klein D. Fudge: Controlled text generation with future discriminators[C]//Proc of the 2021 Conf of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2021: 3511−3535
|
[89] |
Liu A, Sap M, Lu X, et al. Dexperts: Decoding-time controlled text generation with experts and anti-experts[C]//Proc of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th Int Joint Conf on Natural Language Processing (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2021: 6691−6706
|
[90] |
Schick T, Udupa S, Schütze H. Self-diagnosis and self-debiasing: A proposal for reducing corpus-based bias in NLP[J]. Transactions of the Association for Computational Linguistics, 2021, 9: 1408−1424 doi: 10.1162/tacl_a_00434
|
[91] |
Liang P P, Wu C, Morency L P, et al. Towards understanding and mitigating social biases in language models[C]//Proc of Int Conf on Machine Learning. Brookline, MA: PMLR, 2021: 6565−6576
|
[92] |
Chen F, Dou Z Y. Measuring and mitigating bias in vision-and-language models [EB/OL]. (2022-03-01) [2023-06-30].https://web.cs.ucla.edu/~fychen/debiasVL.pdf.
|
[93] |
Wang B, Ping W, Xiao C, et al. Exploring the limits of domain-adaptive training for detoxifying large-scale language models[C]//Advances in Neural Information Processing Systems. San Diego: Neural Information Processing Systems Foundation Inc, 2022, 35: 35811−35824
|
[94] |
Saunders W, Yeh C, Wu J, et al. Self-critiquing models for assisting human evaluators[J]. arXiv preprint, arXiv: 2206.05802, 2022
|
[95] |
Lu K, Mardziel P, Wu F, et al. Gender bias in neural natural language processing[G]//Logic, Language, and Security: Essays Dedicated to Andre Scedrov on the Occasion of His 65th Birthday. Berlin: Springer, 2020: 189−202
|
[96] |
Gehman S, Gururangan S, Sap M, et al. Realtoxicityprompts: Evaluating neural toxic degeneration in language models[C]//Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, PA: ACL, 2020: 3356−3369
|
[97] |
Sun Z, Shen Y, Zhou Q, et al. Principle-driven self-alignment of language models from scratch with minimal human supervision[J]. arXiv preprint, arXiv: 2305.03047, 2023
|
[98] |
Liu H, Sferrazza C, Abbeel P. Chain of hindsight aligns language models with feedback[J]. arXiv preprint, arXiv: 2302.02676, 2023
|
[99] |
Kim S, Bae S, Shin J, et al. Aligning large language models through synthetic feedback[J]. arXiv preprint, arXiv: 2305.13735, 2023
|
[100] |
Bai Y, Kadavath S, Kundu S, et al. Constitutional AI: Harmlessness from AI feedback[J]. arXiv preprint, arXiv: 2212.08073, 2022
|
[101] |
Wei J, Wang X, Schuurmans, et al. Chain of thought prompting elicits reasoning in large language models[J]. arXiv preprint, arXiv: 2201.11903, 2022
|
[102] |
Yuan Z, Yuan H, Tan C, et al. Rrhf: Rank responses to align language models with human feedback without tears[J]. arXiv preprint, arXiv: 2304.05302, 2023
|
[103] |
Go D, Korbak T, Kruszewski G, et al. Aligning language models with preferences through f-divergence minimization[J]. arXiv preprint, arXiv: 2302.08215, 2023
|
[104] |
Liu R, Yang R, Jia C, et al. Training socially aligned language models in simulated human society[J]. arXiv preprint, arXiv: 2305.16960, 2023
|
[105] |
Kirk H R, Vidgen B, Röttger P, et al. Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalized feedback[J]. arXiv preprint, arXiv: 2303.05453, 2023
|
[106] |
Kenton Z, Everitt T, Weidinger L, et al. Alignment of language agents. [J] arXiv preprint, arXiv: 2103.14659, 2021
|
[107] |
Graham J, Meindl P, Beall E, et al. Cultural differences in moral judgment and behavior, across and within societies[J]. Current Opinion in Psychology, 2016, 8: 125−130 doi: 10.1016/j.copsyc.2015.09.007
|
[108] |
Krebs D. The evolution of morality[G]//The Handbook of Evolutionary Psychology. Hoboken: John Wiley & Sons, Inc, 2015: 747−771
|
[109] |
Peter E, Liaschenko J. Perils of proximity: A spatiotemporal analysis of moral distress and moral ambiguity[J]. Nursing Inquiry, 2004, 11(4): 218−225 doi: 10.1111/j.1440-1800.2004.00236.x
|
[110] |
Chung H W, Hou L, Longpre S, et al. Scaling instruction-finetuned language models[J]. arXiv preprint, arXiv: 2210.11416, 2022
|
[111] |
Sun H, Zhang Z, Deng J, et al. Safety assessment of Chinese large language models[J]. arXiv preprint, arXiv: 2304.10436, 2023
|
[112] |
Askell A, Bai Y, Chen A, et al. A general language assistant as a laboratory for alignment[J]. arXiv preprint, arXiv: 2112.00861, 2021
|
[113] |
Lightman H, Kosaraju V, Burda Y, et al. Let’s verify step by step[J]. arXiv preprint, arXiv: 2305.20050, 2023
|
[114] |
Bowman S R, Hyun J, Perez E, et al. Measuring progress on scalable oversight for large language models[J]. arXiv preprint, arXiv: 2211.03540, 2022
|
[115] |
Jiang L, Hwang J D, Bhagavatula C, et al. Can machines learn morality? The Delphi experiment[J]. arXiv preprint, arXiv: 2110.07574, 2021
|
[116] |
Perez E, Ringer S, Lukošiūtė K, et al. Discovering language model behaviors with model-written evaluations[J]. arXiv preprint, arXiv: 2212.09251, 2022
|
[117] |
Street S. Coming to terms with contingency: Human constructivism about practical reason[G]//Constructivism in Practical Philosophy. Oxford, UK: OUP Oxford, 2012: 40−59
|
[118] |
Rawls J. Outline of a decision procedure for ethics[J]. The Philosophical Review, 1951, 60(2): 177−197 doi: 10.2307/2181696
|
[119] |
Rawls J. Rawls’s theory of justice[J]. American Political Science Review, 1975, 69(2): 588−593 doi: 10.2307/1959089
|
[120] |
Kaur S. Moral values in education[J]. IOSR Journal of Humanities and Social Science, 2015, 20(3): 21−26
|
[121] |
Anderson M, Anderson S L, Armen C. Towards machine ethics[C]//Proc of AAAI Workshop on Agent Organizations: Theory and Practice. Menlo Park, CA: AAAI, 2004: 2−7
|
[1] | Li Qinxin, Wu Wenhao, Wang Zhaohua, Li Zhenyu. DNS Recursive Resolution Service Security: Threats, Defenses, and Measurements[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440158 |
[2] | Research on Malicious Domain Detection Technology Based on Semantic Graph Learning[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440375 |
[3] | Wei Jinxia, Long Chun, Fu Hao, Gong Liangyi, Zhao Jing, Wan Wei, Huang Pan. Malicious Domain Name Detection Method Based on Enhanced Embedded Feature Hypergraph Learning[J]. Journal of Computer Research and Development, 2024, 61(9): 2334-2346. DOI: 10.7544/issn1000-1239.202330117 |
[4] | Pan Jianwen, Cui Zhanqi, Lin Gaoyi, Chen Xiang, Zheng Liwei. A Review of Static Detection Methods for Android Malicious Application[J]. Journal of Computer Research and Development, 2023, 60(8): 1875-1894. DOI: 10.7544/issn1000-1239.202220297 |
[5] | Fan Zhaoshan, Wang Qing, Liu Junrong, Cui Zelin, Liu Yuling, Liu Song. Survey on Domain Name Abuse Detection Technology[J]. Journal of Computer Research and Development, 2022, 59(11): 2581-2605. DOI: 10.7544/issn1000-1239.20210121 |
[6] | Yang Wang, Gao Mingzhe, Jiang Ting. A Malicious Code Static Detection Framework Based on Multi-Feature Ensemble Learning[J]. Journal of Computer Research and Development, 2021, 58(5): 1021-1034. DOI: 10.7544/issn1000-1239.2021.20200912 |
[7] | Peng Chengwei, Yun Xiaochun, Zhang Yongzheng, Li Shuhao. Detecting Malicious Domains Using Co-Occurrence Relation Between DNS Query[J]. Journal of Computer Research and Development, 2019, 56(6): 1263-1274. DOI: 10.7544/issn1000-1239.2019.20180481 |
[8] | Dai Hua, Qin Xiaolin, and Bai Chuanjie. A Malicious Transaction Detection Method Based on Transaction Template[J]. Journal of Computer Research and Development, 2010, 47(5): 921-929. |
[9] | Li Qianmu and Liu Fengyu. A Risk Detection and Fault Analysis Method for the Strategic Internet[J]. Journal of Computer Research and Development, 2008, 45(10): 1718-1723. |
[10] | Zhang Xiaoning and Feng Dengguo. Intrusion Detection for Ad Hoc Routing Based on Fuzzy Behavior Analysis[J]. Journal of Computer Research and Development, 2006, 43(4): 621-626. |
1. |
余莎莎,肖辉,郑清,赵幽. 基于威胁情报的DNS助力医院网络安全建设实践. 中国卫生信息管理杂志. 2024(06): 909-914 .
![]() |