Abstract:
To address large modeling errors on long-range dependencies and the insufficient use of domain prior knowledge in energy consumption prediction for High Performance Computing (HPC) jobs, this paper proposes a retrieval-augmented method in which a cross-attention mechanism guides a Temporal Convolutional Network (TCN) to predict HPC job energy consumption. The approach builds a Retrieval-Augmented Knowledge Base (RAG KB) from historical job data and accumulated operational knowledge. The dilation rate is adjusted dynamically according to the spectral characteristics of the job energy consumption sequence and the energy sensitivity of operators, which strengthens the model's focus on critical temporal features. Job-specific knowledge stored in the knowledge base is transferred to the prediction task through cross-attention, which adaptively reweights the individual time steps. This allows the model to capture energy consumption fluctuations in supercomputing jobs and improves the predictive accuracy of the TCN in this domain. Experimental results show that, compared to traditional TCN models, the proposed method reduces the Mean Absolute Percentage Error (MAPE) to 8.6%-11.3% and the Symmetric Mean Absolute Percentage Error (SMAPE) to 8.7%-13.7%. The approach integrates domain prior knowledge and attention guidance into the temporal convolutional network, improving the model's adaptability to operational scenarios while maintaining computational efficiency. It provides a practical method for supercomputing energy efficiency management and a viable solution for optimizing energy consumption in high-performance computing environments.
The integration of retrievable operational knowledge represents a significant advancement in developing context-aware energy management systems for large-scale computational facilities.
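
As an illustration of the mechanism summarized above, the following minimal NumPy sketch shows one dilated causal convolution whose output attends over a set of retrieved knowledge embeddings. It is a sketch under stated assumptions, not the implementation evaluated in this paper: the function names, tensor shapes, random weights, residual fusion, and the fixed dilation of 4 are all hypothetical, whereas the proposed method selects the dilation rate from spectral characteristics and operator energy sensitivity.

```python
import numpy as np


def dilated_causal_conv(x, kernel, dilation):
    """Dilated causal 1-D convolution.

    x: (T, C_in) sequence, kernel: (K, C_in, C_out). Returns (T, C_out).
    Output at step t depends only on x[t], x[t - dilation], ...
    """
    T, c_in = x.shape
    K, _, c_out = kernel.shape
    pad = (K - 1) * dilation
    # Left-pad with zeros so that no output step sees future inputs.
    x_padded = np.vstack([np.zeros((pad, c_in)), x])
    out = np.zeros((T, c_out))
    for k in range(K):
        # Tap k looks back (K - 1 - k) * dilation steps from the current step.
        start = k * dilation
        out += x_padded[start:start + T] @ kernel[k]
    return out


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: (T, D) temporal features; keys, values: (N, D) retrieved entries.
    Returns the attended context (T, D) and the attention weights (T, N).
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights


# Hypothetical sizes: 32 time steps, 1 input channel, 8 features, 5 retrieved entries.
rng = np.random.default_rng(0)
T, C, D, N = 32, 1, 8, 5
power_trace = rng.normal(size=(T, C))       # stand-in for a job's power samples
kernel = rng.normal(size=(3, C, D)) * 0.1   # untrained, random convolution weights
kb_embeddings = rng.normal(size=(N, D))     # stand-in for retrieved RAG KB entries

# Fixed dilation here (assumption); the paper adapts it per job.
features = np.tanh(dilated_causal_conv(power_trace, kernel, dilation=4))
context, weights = cross_attention(features, kb_embeddings, kb_embeddings)
fused = features + context                  # residual fusion, an assumed design choice
```

Each row of `weights` sums to one, so every time step receives its own convex combination of the retrieved entries; this corresponds to the adaptive per-time-step weighting described in the abstract.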