Journal of Nantong University (Natural Science Edition), Volume 24, Issue 2, 22 (2025)

A Q-learning method for creating a Hex opening library

XU Zhifan1, LI Yuan1,*, WANG Jingwen1, LI Zhuoxuan2, and CAO Yiding3
Author Affiliations
  • 1School of Science, Shenyang University of Technology, Shenyang 110870, China
  • 2School of Mathematics, Southeast University, Nanjing 211189, China
  • 3Baiyang Era (Beijing) Technology Co., Ltd., Beijing 100089, China

    Hex is a perfect-information board game, and its opening library, an essential component of the game system, has traditionally been generated from human expertise and Monte Carlo tree search (MCTS) algorithms. However, this approach is computationally expensive and may not consistently ensure accuracy. This study proposes a self-play method based on Q-learning for the efficient construction of Hex opening libraries. The proposed method employs multi-threaded simulations and an improved upper confidence bound applied to trees (UCT) algorithm to identify promising opening moves. An enhanced ε-greedy strategy is incorporated to improve the convergence rate of the Q-learning algorithm. To further improve performance, Q-values are integrated into the upper confidence bound (UCB) formula as prior knowledge, which is intended to enhance decision-making accuracy during gameplay. Experimental results indicate that after 3,000 training iterations, the Q-values across the board converge, suggesting the method's potential for stable policy learning. In comparative evaluations, the generated opening library achieved a 62.9% average win rate against the improved UCT algorithm. When Q-values were used as prior input to the UCB formula, the average win rate increased to 75.9%. The method was also applied in the Chinese Computer Game Competition, where the implementation received a first-place award, supporting the practical applicability of the approach.
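    The abstract does not state the exact update rule or the modified UCB formula, so the following is only a minimal sketch of the general idea under stated assumptions: a standard tabular Q-learning update, and a UCB1 score with an added prior term that decays as a move accumulates visits (a progressive-bias style term chosen here for illustration, not the authors' formula). The names `q_update` and `ucb_with_prior` and the parameters `alpha`, `gamma`, `c`, and `w` are hypothetical, and sign handling for alternating players in self-play is omitted.

```python
import math


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (standard rule, assumed here):
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    `q` is a dict keyed by (state, action); missing entries count as 0.
    """
    # Best known value in the next state; 0.0 if the state is terminal.
    best_next = max(
        (q.get((next_state, a), 0.0) for a in next_actions),
        default=0.0,
    )
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def ucb_with_prior(mean_value, visits, parent_visits, q_prior,
                   c=math.sqrt(2), w=1.0):
    """UCB1 score plus a Q-value prior (illustrative assumption).
    The prior term w * q_prior / (1 + visits) matters most early on and
    fades as simulation statistics accumulate.
    """
    if visits == 0:
        return math.inf  # unvisited moves are tried first
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return mean_value + explore + w * q_prior / (1 + visits)


# Usage: a learned Q-value raises the selection score of an opening move.
q_table = {}
q_update(q_table, "empty_board", "f6", reward=1.0,
         next_state="terminal", next_actions=[], alpha=0.5)
prior = q_table[("empty_board", "f6")]
score = ucb_with_prior(mean_value=0.5, visits=10, parent_visits=100,
                       q_prior=prior)
```

    In this sketch the prior only shifts the ranking of candidate moves during tree search; with many visits, the score approaches the plain UCB1 value.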


    XU Zhifan, LI Yuan, WANG Jingwen, LI Zhuoxuan, CAO Yiding. A Q-learning method for creating a Hex opening library[J]. Journal of Nantong University (Natural Science Edition), 2025, 24(2): 22

    Paper Information

    Received: May 29, 2024

    Accepted: Aug. 25, 2025

    Published Online: Aug. 25, 2025

    Corresponding Author Email: LI Yuan (syliyuan@sut.edu.cn)

    DOI:10.12194/j.ntu.20240529001
