This document provides a reproducible analysis of different AI strategies in Cassville Checkers across various game configurations.
Overview
Cassville Checkers is a 2-4 player marble racing game where players compete to move all 5 marbles from home, around a circular ring, and into a goal area. Key mechanics include:
- Dice-based movement: Roll determines move distance
- Capturing ("zapping"): Landing on opponent marbles sends them back to start
- Bonus turns: Rolling a 6 grants another turn
- Mercy rule: After 3 failed attempts to deploy, bypass the 1/6 requirement
- Lap completion: Once a marble completes its lap, it cannot move past its exit point on the ring
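The deployment and bonus-turn rules above can be sketched in a few lines. This is a minimal illustration, not the game's actual implementation: it assumes the "1/6 requirement" means a marble leaves home only on a roll of 1 or 6, and the names `can_deploy`, `grants_bonus_turn`, `DEPLOY_ROLLS`, and `MERCY_THRESHOLD` are hypothetical.

```python
# Assumption: "1/6 requirement" is read as "roll a 1 or a 6 to deploy".
DEPLOY_ROLLS = {1, 6}
# From the rules above: the mercy rule applies after 3 failed attempts.
MERCY_THRESHOLD = 3
# From the rules above: rolling a 6 grants another turn.
BONUS_ROLL = 6


def can_deploy(roll: int, failed_attempts: int) -> bool:
    """Return True if a marble may leave home on this roll."""
    if failed_attempts >= MERCY_THRESHOLD:
        # Mercy rule: the roll requirement is bypassed.
        return True
    return roll in DEPLOY_ROLLS


def grants_bonus_turn(roll: int) -> bool:
    """Return True if the roll earns the player another turn."""
    return roll == BONUS_ROLL
```

For example, `can_deploy(3, 2)` is false under these assumptions, while `can_deploy(3, 3)` is true because the mercy rule has taken effect.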
Strategies Tested
| Strategy | Description | Priority Order |
|---|---|---|
| heuristic_balanced | Balanced approach | goal > staging > ring > home > mercy > skip |
| heuristic_advance | Advance before deploying | goal > ring > staging > home > mercy > skip |
| greedy | Score-based selection | Highest score wins |
| heuristic_deploy | Aggressive deployment | goal > mercy > home > staging > ring > skip |
| random | Uniform random | Random |
| ppo_2p | PPO RL Agent (2-player) | Trained via MaskablePPO |
| ppo_4p | PPO RL Agent (4-player) | Trained via MaskablePPO |
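The priority orders in the table can be read as "take the legal move whose category appears earliest in the list". A minimal sketch of that selection follows. The move representation (a `(category, marble_index)` pair) and the function name `choose_move` are assumptions made for illustration, and the real heuristics may break ties differently.

```python
from typing import Optional, Sequence

# Priority orders copied from the table above.
PRIORITIES = {
    "heuristic_balanced": ["goal", "staging", "ring", "home", "mercy", "skip"],
    "heuristic_advance": ["goal", "ring", "staging", "home", "mercy", "skip"],
    "heuristic_deploy": ["goal", "mercy", "home", "staging", "ring", "skip"],
}

# Hypothetical move type: (category, marble_index).
Move = tuple[str, int]


def choose_move(strategy: str, legal_moves: Sequence[Move]) -> Optional[Move]:
    """Pick the legal move whose category ranks highest for the strategy."""
    if not legal_moves:
        return None
    rank = {category: i for i, category in enumerate(PRIORITIES[strategy])}
    # min() keeps the first move among equally ranked candidates.
    return min(legal_moves, key=lambda move: rank[move[0]])
```

Given the legal moves `[("ring", 0), ("home", 1)]`, this sketch advances the ring marble for heuristic_balanced and heuristic_advance but deploys a new marble for heuristic_deploy.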
Homogeneous Strategy Games
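When all players use the same strategy, game length and capture frequency differ sharply by strategy: random and heuristic_deploy games run longer and feature far more captures than games played with heuristic_advance, heuristic_balanced, or greedy, and the gap widens as the player count grows.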
Average Game Length (turns)
| Strategy | 2P Turns | 3P Turns | 4P Turns |
|---|---|---|---|
| random | 259 ± 48 | 470 ± 113 | 828 ± 132 |
| heuristic_advance | 173 ± 14 | 282 ± 45 | 423 ± 61 |
| heuristic_deploy | 213 ± 33 | 483 ± 127 | 923 ± 265 |
| heuristic_balanced | 183 ± 24 | 281 ± 37 | 437 ± 72 |
| greedy | 176 ± 20 | 285 ± 36 | 396 ± 61 |
Average Captures Per Game
| Strategy | 2P Captures | 3P Captures | 4P Captures |
|---|---|---|---|
| random | 11.6 | 36.2 | 89.2 |
| heuristic_advance | 1.8 | 6.5 | 15.8 |
| heuristic_deploy | 11.2 | 54.1 | 138.4 |
| heuristic_balanced | 2.4 | 6.0 | 17.2 |
| greedy | 2.1 | 6.8 | 14.6 |
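Because longer games leave more room for captures, it can help to normalize the capture counts by game length. The sketch below divides the mean captures by the mean turns from the two tables above. This is a ratio of averages, not an average of per-game ratios, so treat it as a rough indicator only. The function name `captures_per_100_turns` is hypothetical.

```python
# Mean turns per game (2P, 3P, 4P), copied from the game length table.
TURNS = {
    "random": (259, 470, 828),
    "heuristic_advance": (173, 282, 423),
    "heuristic_deploy": (213, 483, 923),
    "heuristic_balanced": (183, 281, 437),
    "greedy": (176, 285, 396),
}

# Mean captures per game (2P, 3P, 4P), copied from the captures table.
CAPTURES = {
    "random": (11.6, 36.2, 89.2),
    "heuristic_advance": (1.8, 6.5, 15.8),
    "heuristic_deploy": (11.2, 54.1, 138.4),
    "heuristic_balanced": (2.4, 6.0, 17.2),
    "greedy": (2.1, 6.8, 14.6),
}


def captures_per_100_turns(strategy: str) -> list[float]:
    """Approximate capture rate for 2P, 3P, and 4P games."""
    return [
        round(100 * captures / turns, 1)
        for captures, turns in zip(CAPTURES[strategy], TURNS[strategy])
    ]


for name in TURNS:
    print(name, captures_per_100_turns(name))
```

Under this normalization, random and heuristic_deploy still capture several times more often per turn than the other three strategies, so their high capture totals are not explained by game length alone.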
2-Player Head-to-Head Matchups
Overall win rates when each strategy plays against all other strategies:
| Strategy | Win Rate |
|---|---|
| greedy | 66.2% |
| heuristic_advance | 62.5% |
| heuristic_balanced | 62.5% |
| heuristic_deploy | 32.5% |
| random | 26.2% |
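One quick consistency check on a table like this: if every game has exactly one winner and each pairing is played the same number of times, the overall win rates of n strategies should sum to n × 50%. The source does not state either condition, so both are assumptions here, and `win_rate_gap` is a hypothetical helper name.

```python
# Overall win rates (percent), copied from the table above.
WIN_RATES = {
    "greedy": 66.2,
    "heuristic_advance": 62.5,
    "heuristic_balanced": 62.5,
    "heuristic_deploy": 32.5,
    "random": 26.2,
}


def win_rate_gap(win_rates: dict[str, float]) -> float:
    """Return the difference between the observed sum and n * 50%.

    Assumes no draws and an equal number of games per pairing.
    """
    expected = 50.0 * len(win_rates)
    return sum(win_rates.values()) - expected


gap = win_rate_gap(WIN_RATES)
print(f"sum deviates from expectation by {gap:+.1f} percentage points")
```

A deviation of a few tenths of a point is what rounding to one decimal place would produce. A larger gap would suggest draws, unequal game counts, or a bookkeeping error.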
PPO Agent Performance
2-Player PPO Agent
Win rates for the PPO agent trained on 2-player games:
| Opponent | PPO Win Rate | Wins / Games |
|---|---|---|
| random | 85.0% | 17/20 |
| heuristic_balanced | 45.0% | 9/20 |
| greedy | 50.0% | 10/20 |
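With only 20 games per opponent, these win rates carry substantial sampling uncertainty. The sketch below computes a 95% Wilson score interval for each row. This is a standard binomial interval added here for illustration and is not part of the original analysis. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    )
    return center - half_width, center + half_width


# (wins, games) copied from the table above.
RESULTS = {
    "random": (17, 20),
    "heuristic_balanced": (9, 20),
    "greedy": (10, 20),
}

for opponent, (wins, games) in RESULTS.items():
    low, high = wilson_interval(wins, games)
    print(f"{opponent}: {wins}/{games}, 95% interval {low:.0%} to {high:.0%}")
```

The intervals for heuristic_balanced and greedy both straddle 50%, so 20 games are not enough to say whether the PPO agent is better or worse than either opponent.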
4-Player PPO Agent
Win rates for the PPO agent trained on 4-player games (fair baseline is 25%):
| Opponent | PPO Win Rate | Fair Baseline |
|---|---|---|
| 3x random | 80.0% | 25% |
| 3x heuristic_balanced | 25.0% | 25% |
| 3x greedy | 5.0% | 25% |
Detailed Capture Analysis
Offensive vs defensive performance in 2-player head-to-head matchups:
| Strategy | Captures Made | Captures Suffered | Net |
|---|---|---|---|
| random | 3.0 | 5.3 | -2.3 |
| heuristic_advance | 3.2 | 1.2 | +2.0 |
| heuristic_deploy | 2.6 | 6.7 | -4.1 |
| heuristic_balanced | 3.3 | 1.4 | +2.0 |
| greedy | 3.6 | 1.1 | +2.5 |
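The Net column is captures made minus captures suffered. The sketch below recomputes it from the rounded table entries. The helper name `net_captures` is hypothetical. Recomputing from rounded values gives +1.9 for heuristic_balanced where the table shows +2.0, presumably because the table was derived from unrounded averages.

```python
# (captures made, captures suffered), copied from the table above.
CAPTURE_STATS = {
    "random": (3.0, 5.3),
    "heuristic_advance": (3.2, 1.2),
    "heuristic_deploy": (2.6, 6.7),
    "heuristic_balanced": (3.3, 1.4),
    "greedy": (3.6, 1.1),
}


def net_captures(made: float, suffered: float) -> float:
    """Net capture balance, rounded to one decimal like the table."""
    return round(made - suffered, 1)


for name, (made, suffered) in CAPTURE_STATS.items():
    print(f"{name}: {net_captures(made, suffered):+.1f}")
```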