Abstract:
With the rapid development of large language models, their safety has become a growing concern among researchers and the public. To prevent potential harm in human–AI collaboration, it is essential to align these models’ judgments with human moral values in everyday scenarios. A key challenge is ensuring that large language models can, like humans, adaptively adjust or reassess rules during moral judgment so as to remain consistent with human morals across varied contexts. Inspired by research in psychology and cognitive science on the emotional and cognitive influences on human moral judgment, this study leverages the strengths of large language models in cognitive reasoning and emotional analysis. We develop an approach that emulates the interaction between emotional and cognitive judgment in human moral reasoning, thereby enhancing these models’ moral judgment capabilities. Experimental results demonstrate the effectiveness of our approach on this task. Overall, this study not only presents an innovative approach to moral judgment with large language models but also highlights the importance of integrating theories from psychology and cognitive science in this field, laying a foundation for future research.