Welcome to the repository for Assignment #2 for CSCI-561, USC's Foundations of Artificial Intelligence graduate course. This project implements a 5x5 Go playing agent using a modified Monte Carlo Tree Search Algorithm.
- Go is an ancient board game that originated in China over 4,500 years ago. The game's objective is to claim more territory on the board than your opponent by placing and capturing stones. To learn more, click here.
-
Monte Carlo Tree Search (MCTS) is a search algorithm commonly used in decision-making problems, especially in games like Go and Chess. Here's a brief overview:
-
Selection: Start from the root of the tree and traverse down to a leaf node by selecting the node that has the highest value according to a specific policy (usually the UCT formula).
-
Expansion: Once a leaf node is reached, expand it by adding one or more child nodes representing possible moves or decisions.
-
Simulation: Perform a random simulation (rollout) from one of the child nodes until a terminal state (like the end of the game) is reached.
-
Backpropagation: Once the simulation is complete, propagate the result (win or loss) back up the tree, updating the value and visit count of each node along the path.
-
-
The process repeats, selecting nodes to expand and simulating outcomes until a computational budget (like time or number of simulations) is reached. The move with the best value or the most visits at the root is then chosen as the best move. To learn more about MCTS, click here.
- While Go is typically played on a 19x19 grid, the 5x5 version simplifies it to a smaller board with only 5 squares in each dimension. The basic rules of this version, which the agent follows, are:
- Players aren't allowed to play "suicide moves," meaning stones cannot be placed such that the placed stone or its unit has no liberties.
- A KO cannot be recaptured.
- A maximum of 24 moves can be played per game.
- Once the game is complete, the player with the highest score is determined the winner:
- Black's Score = # of stones on the board
- White's Score = # of stones on the board + komi value of 2.5
All other rules follow traditional Go rules.
- The program expects to read a text input in the following format:
- Line 1: Value of "1" or "2" denoting the color the agent is playing as (1 = Black; 2 = White).
- Lines 2-6: Board layout after agent's last move.
- Lines 7-11: Board layout after opponent's last move.
- Note: In lines 2-11, "1"s represent Black's stones, "2"s represent White's stones, and "0"s represent empty spaces.
- Once the agent has decided on the move it wants to play, it will generate a file titled "output.txt" in the "io" directory denoting the position it wants to place its stone in the following format:
- Line 1: [row #],[column #] (0 indexed)
- This project's implementation of MCTS differs in several ways from the base algorithm. Here are some of the key differences:
- Heuristic Functions:
- Two heuristic functions are used: one to evaluate the value of a move for a player and another to evaluate the value of a board for a player. The move evaluation heuristic guides MCTS in searching more "promising" nodes. Helping to prune the search space and find the best move faster. The board evaluation heuristic is used to score the board at the end of simulations, where the score is then used during backpropagation.
- Modified UCT Function:
- The UCT function is modified to take into account the move and board scores for a node. This helps the agent select the best node to expand.
- Weights:
- Weights have been added throughout the program, which can be fine-tuned to adjust the playing style of the agent. They are also designed to change based on the stage of the game, which can further improve the agent's performance.
- Choosing Final Move:
- The final move is chosen using a combination of the ratio of winning simulations to total simulations and the value of the board to total simulations. This helps the agent choose the best move, even if it hasn't been simulated as much as other moves.
- Heuristic Functions:
Note: Feel free to visit each link below for a more in-depth look at the code with additional descriptions!
Main (main.py)
- This file orchestrates the execution of the program. Its main responsibilities include:
- Initializing the agent and reading the input file.
- Determining if the game is in an opening stage and returning an optimal move if so.
- Adjusting weights to play more aggressively as Black.
- Running the MCTS loop (Selection, Expansion, Simulation, and Backpropagation) until the computational budget is reached.
- Choosing the optimal move and writing the move to a file.
Go Agent (go_agent)
Heuristics (heuristics)
-
Evaluate Move (
evaluate_move.py)- This file contains the heuristic functions used to determine the value of a move for a player.
-
Evaluate Board (
evaluate_board.py)- This file contains the heuristic functions used to determine the value of a board for a player.
Utils (utils)
-
Generate Boards (
generate_boards.py)- This is where the logic for generating all possible boards given a board and the current player is held.
-
Helper Functions (
helper_functions.py)- This file contains helper functions used throughout the program.
MCTS (mcts.py)
- This file holds the
MonteCarloTreeSearchclass, which defines a custom Monte Carlo Tree Search Algorithm for the 5x5 version of Go. The methods include:init_treeselectionexpansionsimulationbackpropagationupdate_player_1_weightscalc_child_value_modified_UCT_evaluate_move_evaluate_board
MCTS Config (mcts_config.py)
- This file configures various hyperparameters for the agent, which control its playing style.
Node (node.py)
- This file contains the
_Nodeclass, which represents a node in the tree. It encapsulates a particular game state and provides attributes required for game tree evaluations. These attributes include:player:the player playing the moveboard:the board stateprevious_board:the previous board stateparent_node:the parent nodechildren:the node's children nodesmove_score:the value of moving into this node from the parent node from the perspective of the parent node's playerboard_score:the value of the board from the perspective of the root node's playertotal_value:the total value of the node aggregated during backpropagationsimulated:a boolean indicating if the node has been simulatedsimulation_visits:the number of times the node has been visited during the simulationswinning_simulation_visits:the number of times the node has been visited during a winning simulation for the parent's player
- Here's how to clone the repository and run the main program on your local machine.
git clone [repository-URL]cd [repository-name]- Execute the main program with the following command:
python main.pyNow, the program should be up and running!