Solitaire Winning Probability Environment

This environment is designed to analyze and predict winning probabilities in various solitaire-style games using both theoretical mathematical analysis and empirical simulation.

Overview

The system combines two approaches to determine game winning probabilities:

Theoretical Analysis: Uses AI to derive mathematical formulas for exact probability calculations
Empirical Simulation: Runs Monte Carlo simulations to verify theoretical predictions

Key Components

GamePredictor Class

The core component that handles:

AI-powered probability analysis
Mathematical formula evaluation
Game simulation
Probability comparison between theoretical and empirical results

Features

AI Analysis: Uses LLM to analyze game mechanics and derive mathematical formulas
Formula Evaluation: Supports complex mathematical expressions including:
- Factorials
- Combinations (C(n,r))
- Permutations (P(n,r))
- Standard mathematical operations
Simulation Engine: Runs multiple game simulations to verify theoretical predictions
QA Dataset Generation: Creates training data for AI models by generating question-answer pairs

Reward Function

The environment implements a sophisticated reward function that evaluates the quality of probability predictions:

Base Reward Calculation:
- Compares the predicted probability with the ground truth probability
- Calculates the relative error: 1 - min(abs(gt - predicted) / gt, 2)
- Adds a small bonus of 0.2 for valid predictions
- Clips the final reward between -1 and 1
Length Penalty:
- Applies a length-based penalty for responses that exceed 50% of the maximum token length
- No penalty for responses under the threshold
- Linear scaling of penalty based on response length
- Helps encourage concise and efficient solutions
Validation Checks:
- Verifies proper formula formatting and syntax
- Ensures responses contain valid mathematical expressions
- Handles edge cases and invalid responses gracefully
Quality Metrics:
- Tracks percentage of correct predictions
- Monitors response lengths and quality
- Provides feedback for model improvement

Usage

# Initialize the predictor
predictor = GamePredictor(openai_api_key, openai_api_base)

# Define games to analyze
games = {
    'game_name': game_function,
    # ... more games
}

# Get predictions for all games
results = await predictor.predict_games(games)

# Generate QA dataset
await predictor.generate_qa_csv(games, n_simulations, "output.csv")

Output Format

The system provides comprehensive analysis for each game:

AI's mathematical reasoning
Derived probability formula
Calculated theoretical probability
Simulated empirical probability
Comparison assessment between theory and simulation

Supported Games

The environment includes several example games:

Easy games (1-4)
Card matching games (2-4 cards)
Odd card game

Requirements

Python 3.x
OpenAI API access
Required packages:
- openai
- asteval
- asyncio

Purpose

This environment serves multiple purposes:

Educational: Demonstrates probability theory in practical game scenarios
Research: Provides a framework for analyzing game mechanics
AI Training: Generates datasets for training AI models in probability analysis
Verification: Validates theoretical probability calculations through simulation

Contributing

New games can be added by implementing game functions that return a boolean indicating win/loss. The system will automatically analyze and provide probability predictions for any valid game implementation.