Robert Chen and Burton Rosenberg
University of Miami
The player starts with a fortune K and an urn with a known number of P plus marbles and M minus marbles. At each turn, the player can stop and take with him his fortune, or ante one unit of fortune and draw a marble from the urn. A plus marble rewards the player with his wager and one unit of winning. A minus marble and the player loses his wager. The marble is discarded.
The states of the game are given by (m,p,k), the current number of minus marbles, plus marbles, and current fortune. Arational player will stop if the expected value of the game given the current state is less than the current fortune:If the maximum is achieved is k, the player stops, else this formula recursively calculates the value of V.
The player also stops if ruined, V(m,p,0)=0, or there is nothing more to gain, V(m,0,k)=k.
The values of V are calculated using dynamic programming, working backwards along nodes of constant m+p, starting at m+p=0 where V(0,0,K-M+P) = K-M+P, and increasing by one until m+p=M+P.