Tic Tac Toe Python的Minimax算法

本文介绍了Tic Tac Toe Python的Minimax算法的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有点了解minimax算法如何适用于井字python，但我不知道如何在Python中实际编码...这就是我到目前为止所掌握的：

I kind of understand how the minimax algorithm works for Tic Tac Toe python but I have no idea how to actually code it in Python... this is what I have so far:

from copy import deepcopy

class TicTacToeBrain :

    def __init__(self, player = "x") :
        self._squares = {}
        self._copySquares = {}
        self._winningCombos = (
        [0, 1, 2], [3, 4, 5], [6, 7, 8],
        [0, 3, 6], [1, 4, 7], [2, 5, 8],
        [0, 4, 8], [2, 4, 6])

    def createBoard(self) :
        for i in range(9) :
            self._squares[i] = None
        print(self._squares)

    def showBoard(self) :
        print(self._squares[0], self._squares[1], self._squares[2])
        print(self._squares[3], self._squares[4], self._squares[5])
        print(self._squares[6], self._squares[7], self._squares[8])

    def getAvailableMoves(self) :
        self._availableMoves = []
        for i in range(9) :
            if self._squares[i] == None :
                self._availableMoves.append(i)
        return self._availableMoves

    def makeMove(self, position, player) :
        self._squares[position] = player
        self.showBoard()

    def complete(self) :
        if None not in self._squares.values() :
            return True
        if self.getWinner() != None :
            return True
        return False

    def getWinner(self) :
        for player in ("x", "o") :
            for combos in self._winningCombos :
                if self._squares[combos[0]] == player and self._squares[combos[1]] == player and self._squares[combos[2]] == player :
                    return player
        if None not in self._squares.values() :
            return "tie"
        return None

    def getEnemyPlayer(self, player) :
        if player == "x" :
            return "o"
        return "x"

    def minimax(self, node, player, depth = 0, first = True) :
        if first :
            best = 0
            self._copySquares = deepcopy(self._squares)

        if node.complete() :
            if node.getWinner() == "x" :
                self._squares = self._copySquares
                return -1 - depth
            elif node.getWinner() == "tie" :
                self._squares = self._copySquares
                return 0
            elif node.getWinner() == "o" :
                self._squares = self._copySquares
                return 1 + depth
            best = None
        for move in node.getAvailableMoves() :
            depth += 1
            node.makeMove(move, player)
            print()
            val = self.minimax(node, node.getEnemyPlayer(player), depth, first = False)
            print(val)
            if player == "o" :
                if val > best :
                    best = val
            else :
                if val < best :
                    best = val
            return best
            print()
            print()

    def printCopy(self) :
        print(self._copySquares)

但是，它永远不会打印出所有场景...。请有人帮忙！！

However, it never prints out all the scenarios....someone please help !!! This is due for a project on Monday..

推荐答案

一些问题：

执行从 for 循环中退出，并在处返回第一次迭代：这还为时过早，因为您永远都无法测试任何其他可用的动作。 返回应该在循环后发生。

Execution breaks out of the for loop with a return at the first iteration: this is premature, as you never get to test any of the other available moves. That return should happen after the loop.

在 for 循环的每次迭代中增加深度值。而是将 depth + 1 传递给递归调用，以便当您从中返回时，您将以相同的深度继续。

It is wrong to increment the depth value in each iteration of the for loop. Instead, pass depth+1 to the recursive call, so that when you return from that, you continue at the same depth.

在递归调用之前完成的移动必须在其之后立即执行，否则 for 循环的下一次迭代将不会从同一循环开始

The move done before the recursive call, must be taken back right after it, otherwise the next iteration of the for loop will not start from the same position.

best 的值需要初始化为每不仅在递归树的顶部调用minimax方法。此初始值不应为0，因为当前用户的最佳值可能会比0差。因此，您需要将其初始化为一个极差的值。

The value for best needs to be initialised at every call of the minimax method, not only at the top of the recursion tree. This initial value should not be 0, because the best value for the current user might be worse than 0. So you need to initialise it to an extreme bad value.

minimax方法不会返回最佳移动，而只会返回评估值。由于该方法的全部目的是告诉您应该进行哪个动作，因此您需要同时使用两者。因此，让该方法返回具有两个值的元组：评估值和生成该值的移动。

The minimax method does not return the best move, only the evaluation value. Since the whole purpose of the method is to tell you which move should be played, you need both. So let the method return a tuple with both values: the evaluation value and the move that generated that value.

有些非关键问题：

当您想延迟不可避免的损失或加快强制胜利时，公式为计算玩家获胜时的值时，距离越远，距离0越近，而不是越近。因此，需要对该公式进行更改。

As you want to delay an inevitable loss, or speed-up a forced win, the formula for calculating the value when a player wins should get closer to 0 the further away it is, not the closer it is. So a change is needed in that formula.

由于您应该通过收回动作来恢复电路板，因此无需使用重复的电路板，并且重复的正方形。如果所有代码都编写正确，则在完成minimax方法的顶层调用之后，开发板应与该调用之前处于完全相同的状态。

Since you should restore the board by taking back moves, there is no need to work with a duplicated board, and duplicated squares. If all is coded well, after the top call of the minimax method has completed, the board should be in exactly the same state as it was before that call.

如果您不使用 None 来代替空的正方形，而是使用单个字符（例如。），则面板的打印效果会更好。因此，在您引用空平方值的任何地方，都应放置该字符。

The board prints better when you don't use None for empty squares, but a single character, like a ".". So everywhere where you refer to empty square values, put that character.

您有 print（）在这里和那里为了分开输出。在方法 showBoard 中放入一个，其余代码就可以不用它们。

You have print() here and there in order to separate the output. Put one in the method showBoard, and the rest of your code can do without them.

给出以上几点，您不需要或 first 参数> minimax 方法。

Given some of the above points, you don't need the node nor the first parameter to the minimax method.

此处为带注释的更正版本。我将您的原始行保留在原处，但在需要的地方将其注释掉。

Here is a commented, corrected version. I left your original lines in place, but commented them out where needed.

# *** not needed:
# from copy import deepcopy

class TicTacToeBrain :

    def __init__(self, player = "x") :
        self._squares = {}
        self._copySquares = {}
        self._winningCombos = (
        [0, 1, 2], [3, 4, 5], [6, 7, 8],
        [0, 3, 6], [1, 4, 7], [2, 5, 8],
        [0, 4, 8], [2, 4, 6])

    def createBoard(self) :
        for i in range(9) :
            # *** use a single character, ... easier to print
            self._squares[i] = "."
        print(self._squares)

    def showBoard(self) :
        # *** add empty line here, instead of in minimax
        print ()
        print(self._squares[0], self._squares[1], self._squares[2])
        print(self._squares[3], self._squares[4], self._squares[5])
        print(self._squares[6], self._squares[7], self._squares[8])


    def getAvailableMoves(self) :
        self._availableMoves = []
        for i in range(9) :
            # *** see above
            if self._squares[i] == "." :
                self._availableMoves.append(i)
        return self._availableMoves

    def makeMove(self, position, player) :
        self._squares[position] = player
        self.showBoard()

    def complete(self) :
        # *** see above
        if "." not in self._squares.values() :
            return True
        if self.getWinner() != None :
            return True
        return False

    def getWinner(self) :
        for player in ("x", "o") :
            for combos in self._winningCombos :
                if self._squares[combos[0]] == player and self._squares[combos[1]] == player and self._squares[combos[2]] == player :
                    return player
        # *** see above
        if "." not in self._squares.values() :
            return "tie"
        return None

    def getEnemyPlayer(self, player) :
        if player == "x" :
            return "o"
        return "x"

    # *** no need for `node` argument, nor `first`
    # *** use `self` instead of `node` in all this method
    def minimax(self, player, depth = 0) :
        # *** not needed
        # if first :
            # best = 0
            # *** not needed
            # self._copySquares = deepcopy(self._squares)
        # *** always start with initilisation of `best`, but with worst possible value
        #     for this player
        if player == "o":
            best = -10
        else:
            best = 10
        if self.complete() :
            if self.getWinner() == "x" :
                # *** don't do this, you may still need the position to try other moves
                # self._squares = self._copySquares
                # *** value should be closer to zero for greater depth!
                # *** expect tuple return value
                return -10 + depth, None
            elif self.getWinner() == "tie" :
                # self._squares = self._copySquares
                # *** expect tuple return value
                return 0, None
            elif self.getWinner() == "o" :
                # self._squares = self._copySquares
                # *** value should be closer to zero for greater depth!
                # *** expect tuple return value
                return 10 - depth, None
            # *** Execution can never get here
            # best = None
        for move in self.getAvailableMoves() :
            # *** don't increase depth in each iteration, instead pass depth+1 to
            #    the recursive call
            # depth += 1
            self.makeMove(move, player)
            # *** pass depth+1, no need for passing `node` nor `first`.
            # *** expect tuple return value
            val, _ = self.minimax(self.getEnemyPlayer(player), depth+1)
            print(val)
            # *** undo last move
            self.makeMove(move, ".")
            if player == "o" :
                if val > best :
                    # *** Also keep track of the actual move
                    best, bestMove = val, move
            else :
                if val < best :
                    # *** Also keep track of the actual move
                    best, bestMove = val, move
            # *** don't interrupt the loop here!
            # return best
            # *** this is dead code:
            # print()
            # print()
        # *** Also keep track of the actual move
        return best, bestMove

    def printCopy(self) :
        print(self._copySquares)

以下是有关如何使用该类的示例：

Here is an example on how you would use the class:

game = TicTacToeBrain()
game.createBoard()
game.makeMove(4, "o")
game.makeMove(3, "x")
val, bestMove = game.minimax("o")
print('best move', bestMove) # --> 0 is a winning move.

查看它在 ...等一下。

See it run on eval.in... wait for it.

我将不提供代码，但您可以：

I will not provide the code for this, but you could:

保持轮到谁在 self.player 中。这样一来，您不必将播放器作为minimax的参数传递，这样可以避免错误。它还使构造函数参数有用-当前您不对其进行任何操作。

Keep track whose turn it is in self.player. That way you don't have to pass the player as argument to minimax, which will avoid mistakes. Also it makes the constructor argument useful -- currently you don't do anything with it.

添加方法 bestMove 会简单地调用 minimax ，但只会返回最佳移动，而不是返回值。

Add a method bestMove that will simply call minimax, but will only return the best move, not the value. That will be easier to manage.

使用alpha-beta修剪，这样您就可以停止评估其他举动，当您显然无法提高已达到的价值时

Use alpha-beta pruning, so that you stop evaluating other moves, when it is clear you cannot improve the value already achieved up in the recursion tree.

这篇关于Tic Tac Toe Python的Minimax算法的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！