Cassville Checkers Strategy Analysis

This document presents a reproducible analysis of AI strategies in Cassville Checkers across 2-, 3-, and 4-player game configurations.

Overview

Cassville Checkers is a 2-4 player marble racing game where players compete to move all 5 marbles from home, around a circular ring, and into a goal area. Key mechanics include:

Dice-based movement: Roll determines move distance
Capturing (“zapping”): Landing on opponent marbles sends them back to start
Bonus turns: Rolling a 6 grants another turn
Mercy rule: After 3 failed attempts to deploy, a player may bypass the requirement to roll a 1 or 6
Lap completion: Once a marble completes its lap, it cannot move past its exit point on the ring
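
The deployment and bonus-turn rules above can be sketched in a few lines of Python. This is a minimal illustration, not the game's actual code: the names `can_deploy`, `grants_bonus_turn`, and `MERCY_THRESHOLD` are hypothetical, and the sketch assumes that the "1/6 requirement" means a marble normally needs a roll of 1 or 6 to leave home.

```python
MERCY_THRESHOLD = 3  # failed deploy attempts before the mercy rule applies


def can_deploy(roll: int, failed_attempts: int) -> bool:
    """Return True if a marble may leave home on this roll.

    Assumption: deployment normally needs a 1 or a 6, and the mercy
    rule waives that once MERCY_THRESHOLD attempts have failed.
    """
    if not 1 <= roll <= 6:
        raise ValueError("roll must be between 1 and 6")
    return roll in (1, 6) or failed_attempts >= MERCY_THRESHOLD


def grants_bonus_turn(roll: int) -> bool:
    """Rolling a 6 grants another turn."""
    return roll == 6
```

Under these assumptions, `can_deploy(3, 2)` is false while `can_deploy(3, 3)` is true, because the third failed attempt triggers the mercy rule.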

Strategies Tested

Strategy	Description	Priority Order
heuristic_balanced	Balanced approach	goal > staging > ring > home > mercy > skip
heuristic_advance	Advance before deploying	goal > ring > staging > home > mercy > skip
greedy	Score-based selection	Highest score wins
heuristic_deploy	Aggressive deployment	goal > mercy > home > staging > ring > skip
random	Uniform random	Random
ppo_2p	PPO RL Agent (2-player)	Trained via MaskablePPO
ppo_4p	PPO RL Agent (4-player)	Trained via MaskablePPO
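
One way to read the priority orders in the table is as a ranked list of move categories: the strategy picks whichever legal move falls in the highest-ranked category. The sketch below illustrates that reading. The `(category, marble_index)` move representation, the `choose_move` function, and the tie-breaking by list order are all assumptions made for illustration, not the project's actual implementation.

```python
from typing import Optional

# Priority orders copied from the table above (highest priority first).
PRIORITIES = {
    "heuristic_balanced": ["goal", "staging", "ring", "home", "mercy", "skip"],
    "heuristic_advance": ["goal", "ring", "staging", "home", "mercy", "skip"],
    "heuristic_deploy": ["goal", "mercy", "home", "staging", "ring", "skip"],
}


def choose_move(
    strategy: str, legal_moves: list[tuple[str, int]]
) -> Optional[tuple[str, int]]:
    """Pick the legal move whose category ranks highest for the strategy.

    Each move is a hypothetical (category, marble_index) pair. Ties within
    a category go to the move listed first (an assumption).
    """
    if not legal_moves:
        return None
    rank = {category: i for i, category in enumerate(PRIORITIES[strategy])}
    # min() returns the first move with the lowest (best) rank.
    return min(legal_moves, key=lambda move: rank[move[0]])
```

Given the same legal moves `[("ring", 0), ("home", 1), ("staging", 2)]`, this sketch has heuristic_balanced choose the staging move, heuristic_advance the ring move, and heuristic_deploy the home move.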

Homogeneous Strategy Games

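When all players use the same strategy, game length and capture frequency vary widely with the strategy. In 4-player games, for example, average length ranges from 396 turns (greedy) to 923 turns (heuristic_deploy), and average captures range from 14.6 (greedy) to 138.4 (heuristic_deploy).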

Average Game Length (turns)

Strategy	2P Turns	3P Turns	4P Turns
random	259 ± 48	470 ± 113	828 ± 132
heuristic_advance	173 ± 14	282 ± 45	423 ± 61
heuristic_deploy	213 ± 33	483 ± 127	923 ± 265
heuristic_balanced	183 ± 24	281 ± 37	437 ± 72
greedy	176 ± 20	285 ± 36	396 ± 61

Average Captures Per Game

Strategy	2P Captures	3P Captures	4P Captures
random	11.6	36.2	89.2
heuristic_advance	1.8	6.5	15.8
heuristic_deploy	11.2	54.1	138.4
heuristic_balanced	2.4	6.0	17.2
greedy	2.1	6.8	14.6
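
Because longer games leave more room for captures, it can help to normalize captures by game length. The sketch below combines the two tables above into captures per 100 turns. The function name is hypothetical, and the result is a ratio of the reported averages, which is not necessarily the same as averaging the per-game ratios.

```python
# Mean turns and mean captures per game, copied from the tables above.
# Tuple order: (2 players, 3 players, 4 players).
TURNS = {
    "random": (259, 470, 828),
    "heuristic_advance": (173, 282, 423),
    "heuristic_deploy": (213, 483, 923),
    "heuristic_balanced": (183, 281, 437),
    "greedy": (176, 285, 396),
}
CAPTURES = {
    "random": (11.6, 36.2, 89.2),
    "heuristic_advance": (1.8, 6.5, 15.8),
    "heuristic_deploy": (11.2, 54.1, 138.4),
    "heuristic_balanced": (2.4, 6.0, 17.2),
    "greedy": (2.1, 6.8, 14.6),
}


def captures_per_100_turns(strategy: str, players: int) -> float:
    """Ratio of mean captures to mean turns, scaled to 100 turns."""
    if players not in (2, 3, 4):
        raise ValueError("players must be 2, 3, or 4")
    column = players - 2
    return 100 * CAPTURES[strategy][column] / TURNS[strategy][column]


if __name__ == "__main__":
    for name in TURNS:
        rates = [captures_per_100_turns(name, n) for n in (2, 3, 4)]
        print(name, [round(r, 1) for r in rates])
```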

2-Player Head-to-Head Matchups

Overall win rate of each strategy across its matchups against every other strategy:

Strategy	Win Rate
greedy	66.2%
heuristic_advance	62.5%
heuristic_balanced	62.5%
heuristic_deploy	32.5%
random	26.2%
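
A quick consistency check on this table: if every pair of strategies plays the same number of 2-player games and no game ends in a draw (both are assumptions, since the schedule is not stated here), each game produces exactly one win, so the overall win rates should average 50%. The helper name below is hypothetical.

```python
# Overall win rates (percent), copied from the table above.
WIN_RATES = {
    "greedy": 66.2,
    "heuristic_advance": 62.5,
    "heuristic_balanced": 62.5,
    "heuristic_deploy": 32.5,
    "random": 26.2,
}


def mean_win_rate(rates: dict[str, float]) -> float:
    """Average win rate across strategies, in percent."""
    return sum(rates.values()) / len(rates)


if __name__ == "__main__":
    print(f"mean win rate: {mean_win_rate(WIN_RATES):.2f}%")
```

The reported figures average about 49.98%, and the small gap from 50% is consistent with each entry being rounded to one decimal place.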

PPO Agent Performance

2-Player PPO Agent

Win rates for the PPO agent trained on 2-player games:

Opponent	PPO Win Rate	Wins/Games
random	85.0%	17/20
heuristic_balanced	45.0%	9/20
greedy	50.0%	10/20
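
With only 20 games per opponent, these win rates carry wide uncertainty. The sketch below computes a Wilson score interval for a win count. It is added here as a reading aid and is not part of the original analysis. The function name and the choice of a 95% level (z = 1.96) are assumptions.

```python
import math


def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if games <= 0 or not 0 <= wins <= games:
        raise ValueError("need games > 0 and 0 <= wins <= games")
    p = wins / games
    denom = 1 + z**2 / games
    centre = (p + z**2 / (2 * games)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    )
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Win counts copied from the table above.
    for opponent, wins in [("random", 17), ("heuristic_balanced", 9), ("greedy", 10)]:
        low, high = wilson_interval(wins, 20)
        print(f"{opponent}: {low:.1%} to {high:.1%}")
```

For 9 wins in 20 games the interval runs from roughly 26% to 66%, so the 45% and 50% results against heuristic_balanced and greedy cannot be told apart at this sample size.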

4-Player PPO Agent

Win rates for the PPO agent trained on 4-player games (fair baseline is 25%):

Opponent	PPO Win Rate	Fair Baseline
3x random	80.0%	25%
3x heuristic_balanced	25.0%	25%
3x greedy	5.0%	25%

Detailed Capture Analysis

Offensive vs. defensive performance in 2-player head-to-head matchups. Net is captures made minus captures suffered:

Strategy	Captures Made	Captures Suffered	Net
random	3.0	5.3	-2.3
heuristic_advance	3.2	1.2	+2.0
heuristic_deploy	2.6	6.7	-4.1
heuristic_balanced	3.3	1.4	+2.0
greedy	3.6	1.1	+2.5
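
The Net column can be recomputed from the other two columns as captures made minus captures suffered. The sketch below does this and lists any row where the recomputed value differs from the reported one by more than a tolerance. The function name and tolerance values are assumptions. A difference of about 0.1 is expected whenever Net was computed from unrounded averages and then rounded separately.

```python
# (captures made, captures suffered, reported net), copied from the table above.
ROWS = {
    "random": (3.0, 5.3, -2.3),
    "heuristic_advance": (3.2, 1.2, 2.0),
    "heuristic_deploy": (2.6, 6.7, -4.1),
    "heuristic_balanced": (3.3, 1.4, 2.0),
    "greedy": (3.6, 1.1, 2.5),
}


def net_mismatches(rows: dict[str, tuple[float, float, float]], tol: float) -> list[str]:
    """Strategies whose reported net differs from made - suffered by more than tol."""
    return [
        name
        for name, (made, suffered, reported) in rows.items()
        if abs((made - suffered) - reported) > tol
    ]


if __name__ == "__main__":
    print("beyond rounding error:", net_mismatches(ROWS, tol=0.15))
    print("differs in last digit:", net_mismatches(ROWS, tol=0.05))
```

With the looser tolerance no row is flagged. With the tighter one only heuristic_balanced is flagged (3.3 - 1.4 = 1.9 against a reported +2.0), which is within what independent rounding of the three columns can produce.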