Abstract:
Recent advances in large language models (LLMs) have substantially raised the data-quality requirements of practical applications. Real-world scenarios often involve heterogeneous data from multiple correlated domains, yet cross-domain data integration remains challenging because privacy and security concerns prohibit centralized sharing, which limits how effectively LLMs can use such data. To address this issue, we propose a framework that integrates LLMs with knowledge graphs (KGs) for querying cross-domain heterogeneous data, providing a systematic data governance solution under the LLM-KG paradigm. First, we employ domain adapters to fuse cross-domain heterogeneous data and construct the corresponding KG. Second, to improve query efficiency, we introduce knowledge line graphs and develop a homogeneous knowledge graph extraction (HKGE) algorithm for graph reconstruction, which significantly improves cross-domain data governance performance. Third, we propose a trusted subgraph matching algorithm, TrustHKGM, which ensures high-confidence multi-domain queries through confidence computation and filtering of low-quality nodes. Finally, we design a multi-domain knowledge line graph prompting (MKLGP) algorithm that enables efficient and trustworthy cross-domain query answering within the LLM-KG framework. Extensive experiments on multiple real-world datasets demonstrate that our approach is more effective and more efficient than state-of-the-art solutions.