• Research and implementation of intelligent decision based on a priori knowledge and DQN algorithms in wargame environment

      Sun, Yuxiang; Yuan, Bo; Zhang, Tao; Tang, Bojian; Zheng, Wanwen; Zhou, Xianzhong; University of Derby; Nanjing University, China (MDPI AG, 2020-10-13)
      The reinforcement learning problem of complex action control in a multi-player wargame has been a hot research topic in recent years. In this paper, a game system based on turn-based confrontation is designed and implemented with state-of-the-art deep reinforcement learning models. Specifically, we first design a Q-learning algorithm to achieve intelligent decision-making, which is based on the DQN (Deep Q Network) to model complex game behaviors. Then, an a priori knowledge-based algorithm PK-DQN (Prior Knowledge-Deep Q Network) is introduced to improve the DQN algorithm, which accelerates the convergence speed and stability of the algorithm. The experiments demonstrate the correctness of the PK-DQN algorithm, it is validated, and its performance surpasses the conventional DQN algorithm. Furthermore, the PK-DQN algorithm shows effectiveness in defeating the high level of rule-based opponents, which provides promising results for the exploration of the field of smart chess and intelligent game deduction
    • Research on Action Strategies and Simulations of DRL and MCTS-based Intelligent Round Game

      Sun, Yuxiang; Yuan, Bo; Zhang, Yongliang; Zheng, Wanwen; Xia, Qingfeng; Tang, Bojian; Zhou, Xianzhong; Nanjing University, China; University of Derby; Army Engineering University, Nanjing, China (Springer Science and Business Media LLC, 2021-06-16)
      The reinforcement learning problem of complex action control in multiplayer online battlefield games has brought considerable interest in the deep learning field. This problem involves more complex states and action spaces than traditional confrontation games, making it difficult to search for any strategy with human-level performance. This paper presents a deep reinforcement learning model to solve this problem from the perspective of game simulations and algorithm implementation. A reverse reinforcement-learning model based on high-level player training data is established to support downstream algorithms. With less training data, the proposed model is converged quicker, and more consistent with the action strategies of high-level players’ decision-making. Then an intelligent deduction algorithm based on DDQN is developed to achieve a better generalization ability under the guidance of a given reward function. At the game simulation level, this paper constructs Monte Carlo Tree Search Intelligent Decision Model for turn-based antagonistic deduction games to generate next-step actions. Furthermore, a prototype game simulator that combines offline with online functions is implemented to verify the performance of proposed model and algorithm. The experiments show that our proposed approach not only has a better reference value to the antagonistic environment using incomplete information, but also accurate and effective in predicting the return value. Moreover, our work provides a theoretical validation platform and testbed for related research on game AI for deductive games.