Abstract:
Wafer-scale computers integrate multiple chiplets through advanced packaging technologies, overcoming the area limits of conventional chips to scale computational power. However, existing domain-specific designs struggle to meet general-purpose computing requirements. In this study, we propose Yingtian-Lake, a wafer-scale general-purpose computer that targets the workload characteristics of high-performance computing and intelligent computing scenarios. First, a decoupled architecture separates the computing modules from the interposer and uses standardized I/O interfaces, making the system compatible with multiple types of computing modules. Second, a reconfigurable wafer-scale network uses dynamic topology adaptation to accommodate diverse traffic patterns. Third, a fault-aware, fault-tolerant routing algorithm maintains service continuity when computing units fail. Experimental results show that the proposed reconfigurable network switches topologies with latency on the order of seconds. The prototype 16-module system, fabricated in a TSMC 28 nm process, improves energy efficiency by 1.45 times on high-performance linear algebra computations and by 1.78 times on deep learning inference tasks, while delivering petaflops-level performance on a single wafer. This architecture validates the technical feasibility of general-purpose wafer-scale systems and establishes a scalable hardware foundation for next-generation heterogeneous computing platforms.