Welcome, guest | Sign In | My Account | Store | Cart

Computes the strategy oddments for two-player zero-sum games of perfect information. Uses a robust, iterative approximation that can handle dominance, non-square payoff matrices, and games without a saddle-point.

Python, 34 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
''' Approximate the strategy oddments for 2 person zero-sum games of perfect information.

Applies the iterative solution method described by J.D. Williams in his classic
book, The Compleat Strategyst, ISBN 0-486-25101-2.   See chapter 5, page 180 for details. '''

from operator import add, neg

def solve(payoff_matrix, iterations=100):
    'Return the oddments (mixed strategy ratios) for a given payoff matrix'
    transpose = zip(*payoff_matrix)
    numrows = len(payoff_matrix)
    numcols = len(transpose)
    row_cum_payoff = [0] * numrows
    col_cum_payoff = [0] * numcols
    colpos = range(numcols)
    rowpos = map(neg, xrange(numrows))
    colcnt = [0] * numcols
    rowcnt = [0] * numrows
    active = 0
    for i in xrange(iterations):
        rowcnt[active] += 1        
        col_cum_payoff = map(add, payoff_matrix[active], col_cum_payoff)
        active = min(zip(col_cum_payoff, colpos))[1]
        colcnt[active] += 1       
        row_cum_payoff = map(add, transpose[active], row_cum_payoff)
        active = -max(zip(row_cum_payoff, rowpos))[1]
    value_of_game = (max(row_cum_payoff) + min(col_cum_payoff)) / 2.0 / iterations
    return rowcnt, colcnt, value_of_game

###########################################
# Example solutions to two pay-off matrices
      
print solve([[2,3,1,4], [1,2,5,4], [2,3,4,1], [4,2,2,2]])   # Example on page 185
print solve([[4,0,2], [6,7,1]])                             # Exercise 2 number 3

This is a general purpose tool for solving all of the games in the J.D. Williams book.

The approach is to gather strategy selection statistics from a succession of plays where each player makes the best countermove based upon the opponent's cumulative history of plays.

The output is the strategy oddment (the relative ratio that a player should play each strategy). For instance, given the payoff matrix: <pre> A B C _ _ _ D | 4 0 2 E | 6 7 1 </pre>

The [75, 25] return value means that strategy D should be played 75% of the time and that E should be played 25% of the time. Likewise, the opponent's oddment of [0, 13, 87] means that the opponent should never play A and should play B and C in about a 1:7 ratio (approximately). The value of the game is about 1.765.

Given an m-by-n matrix, the running time is O((m+n)*iterations). So, this function performs respectably even with very large payoff matrices.

1 comment

ashish 6 years, 11 months ago  # | flag

on running the code, there is an error in python 3.4 "object of type 'zip' has no len()" can you please check for that regards, Ashish