Discussion:

Player A and player B alternate flipping a coin with bias p. When it is player A's turn, player A receives reward alpha on tails and alpha+beta on heads. Likewise, when it is player B's turn, player B receives reward alpha on tails and alpha+beta on heads. The first player whose accumulated reward exceeds limit wins.

Player A has the advantage, hence the probability that A wins is always greater than 50%. The value p* is the approximate bias of the coin at which the probability of A winning is minimized.
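
One way to make p* concrete is to compute A's winning probability exactly and search over p. The sketch below is not the derivation from the referenced paper and is not necessarily how the script computes p*. It assumes that p is the probability of heads, that A flips first, and that "above limit" means strictly greater. The function names survival, prob_a_wins and approximate_p_star are hypothetical. If N is the number of own flips a player needs to exceed limit, then A wins exactly when N_A <= N_B, which gives P(A wins) = sum over n of P(N = n) * P(N >= n).

```python
from math import comb, floor


def survival(n, p, alpha, beta, limit):
    """P(a player's total is still <= limit after n of their own flips).

    Assumes p is the probability of heads (an assumption, not stated in the text).
    """
    slack = limit - n * alpha
    if slack < 0:
        return 0.0
    # Largest number of heads that keeps n*alpha + k*beta <= limit.
    k_max = min(n, floor(slack / beta))
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


def prob_a_wins(p, alpha, beta, limit):
    """Exact P(A wins), assuming A flips first and the winner must strictly exceed limit."""
    total = 0.0
    prev = 1.0  # survival after n - 1 flips, i.e. P(N >= n)
    n = 1
    while prev > 0.0:
        cur = survival(n, p, alpha, beta, limit)
        total += (prev - cur) * prev  # P(N_A = n) * P(N_B >= n)
        prev = cur
        n += 1
    return total


def approximate_p_star(alpha, beta, limit, grid=1001):
    """Grid search over p in [0, 1] for the bias that minimizes P(A wins)."""
    candidates = [i / (grid - 1) for i in range(grid)]
    return min(candidates, key=lambda p: prob_a_wins(p, alpha, beta, limit))
```

For alpha = beta = limit = 1 the formula reduces to p + (1-p)^2, which is smallest at p = 0.5 with value 0.75, so that case serves as a quick sanity check. The grid search only locates p* to the grid resolution, and the floor in survival can be sensitive to floating-point rounding when limit - n*alpha is an exact multiple of beta.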

The Python script calculates p* and runs a randomized simulation of trials games. It reports the sample probability that A wins, i.e. the fraction of the simulated games won by A, labeled the advantage. Input positive floating-point values for alpha, beta and limit, and a positive integer for trials.
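
The simulation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script itself: p is taken to be the probability of heads, A is taken to flip first, and a player wins once their total is strictly greater than limit. The names play_game and estimate_advantage and the seed parameter are hypothetical.

```python
import random


def play_game(p, alpha, beta, limit, rng):
    """Play one game and return True if player A wins.

    Assumes p is the probability of heads and that A flips first.
    """
    totals = [0.0, 0.0]  # accumulated rewards: index 0 is A, index 1 is B
    turn = 0             # A moves first (assumption)
    while True:
        # Heads pays alpha + beta, tails pays alpha.
        reward = alpha + beta if rng.random() < p else alpha
        totals[turn] += reward
        if totals[turn] > limit:
            return turn == 0
        turn = 1 - turn


def estimate_advantage(p, alpha, beta, limit, trials, seed=None):
    """Fraction of `trials` simulated games that player A wins."""
    rng = random.Random(seed)
    wins = sum(play_game(p, alpha, beta, limit, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print(estimate_advantage(p=0.5, alpha=1.0, beta=1.0, limit=10.0, trials=10_000, seed=1))
```

Because alpha is positive, every game ends after a bounded number of flips. Passing a seed makes a run reproducible; omitting it gives a fresh random sample each time.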

See: How to lose with least probability, Robert Chen, Ayaka Guo, and Alan Zame.

Input parameters:
alpha
beta
limit
trials