多视角看大模型安全及实践

王笑尘; 张坤; 张鹏

doi:10.7544/issn1000-1239.202330955

Large Model Safety and Practice from Multiple Perspectives

Abstract: With the widespread application of large models in artificial intelligence, the security of large models, especially large language models (LLMs), has attracted broad attention. Because large models are an emerging technology, both the analysis of their security situation and the construction of corresponding security systems still require substantial exploration. We analyze the overall trend of large model security from two perspectives: social relations and technology applications. Based on the characteristics of large models themselves, we then outline practical approaches to building large model security capabilities, providing a reference plan for constructing a security system for large model development and for applications built on large models. The security practice plan introduced in this article consists of three parts: construction of security evaluation benchmarks, methods for aligning model values, and construction of a security system for online model services.
