Abstract:
Conversational emotion recognition is the task of classifying the emotions expressed in conversations. Conversation data are characterized by colloquial language and a wide range of topics, and the emotion labels are often semantically similar to one another. Colloquial language suffers from word ambiguity and omitted semantic information, so common-sense and grammatical knowledge are important for enabling a model to capture semantic information accurately. Moreover, text richness and the frequency of emotion transfer vary across dialogue scenarios, which leads to suboptimal classification performance. We propose the CK-ERC model to address these challenges. In the pre-training phase, CK-ERC extracts structured data to incorporate common-sense and grammatical knowledge graphs, helping the model capture colloquial information accurately. In the fine-tuning phase, a supervised contrastive learning task is introduced to help the model distinguish between similar emotion labels. Furthermore, a dynamic threshold-based curriculum learning strategy is designed to train and optimize the model according to text richness (from high to low) and emotion transfer frequency (from low to high). CK-ERC achieves superior performance across conversation modes, including two-person, multi-person, simulated, and daily conversation. In particular, it achieves the best performance on the MELD and EmoryNLP datasets.
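
The curriculum ordering described above can be illustrated with a minimal sketch. The abstract does not say how text richness is measured or how the dynamic threshold works, so everything here is an assumption made for illustration: the `Dialogue` type and the function names are hypothetical, richness is approximated by mean tokens per utterance, and the two criteria are combined into a single sort key. The dynamic threshold is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dialogue:
    utterances: List[str]  # one string per turn
    emotions: List[str]    # one emotion label per turn


def text_richness(d: Dialogue) -> float:
    """Assumed proxy for richness: mean whitespace-token count per utterance."""
    if not d.utterances:
        return 0.0
    return sum(len(u.split()) for u in d.utterances) / len(d.utterances)


def shift_frequency(d: Dialogue) -> float:
    """Fraction of adjacent turn pairs whose emotion label changes."""
    if len(d.emotions) < 2:
        return 0.0
    shifts = sum(a != b for a, b in zip(d.emotions, d.emotions[1:]))
    return shifts / (len(d.emotions) - 1)


def curriculum_order(dialogues: List[Dialogue]) -> List[Dialogue]:
    """Easy-first order: richer text first, then fewer emotion shifts first."""
    return sorted(
        dialogues,
        key=lambda d: (-text_richness(d), shift_frequency(d)),
    )
```

Under these assumptions, a training loop would feed dialogues in the order returned by `curriculum_order`, so that long, emotionally stable dialogues are seen before short ones with frequent emotion changes.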