Computer Engineering, Vol. 51, Issue 8, 354 (2025)
HPL-MxP Multiple lookahead Optimization for Kunpeng Processors
The HPL-MxP benchmark is widely used to measure the mixed-precision computing performance of supercomputers. Because of the parallel algorithm the benchmark implements, choosing the matrix block size (NB) is a tradeoff between matrix multiplication efficiency and load balancing. To address this tradeoff, this paper presents an optimization study on the Kunpeng 920 system and proposes a multi-level lookahead optimization strategy: the matrix is partitioned with a small NB to achieve better load balancing, and the effective NB is raised by merging multiple rounds of matrix multiplication updates, so that load balancing and high matrix multiplication efficiency are achieved together. To realize the multi-level lookahead scheme, this study restructures the Panel storage mode, designs a fine-grained computation and communication pipeline, and extends the HPL-MxP source program interface. A mixed single- and double-precision test on a multi-node Kunpeng 920 platform shows that, with multi-level lookahead optimization, HPL-MxP effectively resolves the NB tradeoff and incurs no significant additional overhead compared with the single-level lookahead strategy.
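The idea of raising the effective NB by merging update rounds rests on a simple algebraic identity: applying k trailing-matrix updates of rank NB one after another gives the same result as one update of rank k*NB built from the concatenated panels. The following minimal sketch illustrates only that identity, in pure Python with small integer matrices so the comparison is exact. It is not the paper's implementation; the function names, the sizes (`m`, `n`, `nb`, `k`), and the synthetic panel data are all assumptions made for illustration.

```python
def matmul(a, b):
    """Plain triple-loop product of two row-major lists of lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def subtract(c, p):
    """Element-wise C - P."""
    return [[c[i][j] - p[i][j] for j in range(len(c[0]))]
            for i in range(len(c))]


def update_round_by_round(c, l_panels, u_panels):
    """Apply one rank-NB update per panel: C -= L_p * U_p."""
    for l, u in zip(l_panels, u_panels):
        c = subtract(c, matmul(l, u))
    return c


def update_merged(c, l_panels, u_panels):
    """Concatenate the panels and apply a single rank-(k*NB) update."""
    # Place the L panels side by side: each row becomes k*NB entries wide.
    l_wide = [sum((l[i] for l in l_panels), []) for i in range(len(c))]
    # Stack the U panels on top of each other: k*NB rows in total.
    u_tall = [row for u in u_panels for row in u]
    return subtract(c, matmul(l_wide, u_tall))


# Hypothetical sizes: a 6 x 5 trailing matrix, NB = 2, k = 3 merged rounds.
m, n, nb, k = 6, 5, 2, 3
c0 = [[(7 * i + 3 * j) % 11 for j in range(n)] for i in range(m)]
l_panels = [[[(i + 2 * j + p) % 5 for j in range(nb)] for i in range(m)]
            for p in range(k)]
u_panels = [[[(3 * i + j + p) % 7 for j in range(n)] for i in range(nb)]
            for p in range(k)]

stepwise = update_round_by_round(c0, l_panels, u_panels)
merged = update_merged(c0, l_panels, u_panels)
```

In this sketch the merged path makes one matrix multiplication with inner dimension k*NB instead of k multiplications with inner dimension NB, which is the shape that typically lets an optimized GEMM kernel run closer to peak, while the data distribution can still use the small NB.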
GAO Ang, WANG Yinshan, YAN Wen, SONG Changcheng, WANG Long, YAO Erlin. HPL-MxP Multiple lookahead Optimization for Kunpeng Processors[J]. Computer Engineering, 2025, 51(8): 354
Received: Nov. 3, 2023
Accepted: Aug. 26, 2025
Published Online: Aug. 26, 2025
Author Email: WANG Yinshan (wangyinshan@ict.ac.cn)