Usage Guide

This guide covers installation, basic usage, and how to run the training algorithms.

Important

The Agent and GridWorldEnv classes are designed to be used as-is. You should not modify these core classes. Instead, configure them through their parameters and use the provided training scripts.


Installation

Clone the Repository

git clone https://github.com/ProValarous/Predator-Prey-Archetype-Gridworld-Environment
cd Predator-Prey-Archetype-Gridworld-Environment

Install Dependencies

pip install -r requirements.txt

Training Algorithms

The repository provides two main training algorithms:

| Algorithm | Description | Location |
|-----------|-------------|----------|
| IQL | Independent Q-Learning | IQL |
| CQL | Central Q-Learning | CQL |

Warning

Run from the src directory. All training commands should be run from the src directory:

cd src

Independent Q-Learning (IQL)

IQL trains each agent independently with its own Q-table. Each agent learns without explicit knowledge of other agents' policies.
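
The per-agent update described above can be sketched as follows. This is a minimal illustration of tabular independent Q-learning, not the repository's implementation: the agent names, the integer state encoding, the action count, and the helper functions are all assumptions made for the example.

```python
import random
from collections import defaultdict

N_ACTIONS = 5  # assumed action count (e.g. four moves plus "stay")


def make_q_table():
    # One table per agent: state -> list of action values, zero-initialised.
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def select_action(q, state, eps, rng):
    # Epsilon-greedy over this agent's own table only.
    if rng.random() < eps:
        return rng.randrange(N_ACTIONS)
    values = q[state]
    return values.index(max(values))


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.9):
    # One-step Q-learning target; other agents are treated as part of
    # the environment, so no joint information enters the update.
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical agent names; each agent owns an independent table.
q_tables = {name: make_q_table() for name in ("predator_0", "prey_0")}
rng = random.Random(0)

state, next_state = 3, 4  # placeholder integer-encoded observations
for name, q in q_tables.items():
    action = select_action(q, state, eps=1.0, rng=rng)
    q_update(q, state, action, reward=1.0, next_state=next_state, done=False)
```

Because each table is updated only from its owner's experience, memory grows linearly with the number of agents, at the cost of every agent seeing a non-stationary environment while the others are still learning.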

Training

cd src
python -m baselines.IQL.train_iql [OPTIONS]

IQL Training Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| --episodes | int | 15000 | Number of training episodes |
| --size | int | 2 | Grid size (size × size) |
| --alpha | float | 0.1 | Learning rate (Q-value update step size) |
| --gamma | float | 0.9 | Discount factor (importance of future rewards) |
| --eps-start | float | 1.0 | Initial exploration rate |
| --eps-end | float | 0.05 | Final exploration rate |
| --eps-decay | float | 0.9995 | Exploration decay rate per episode |
| --seed | int | 0 | Random seed for reproducibility |
| --save-path | str | baselines/IQL/iql_qs.npz | Path to save trained Q-tables |
| --log-dir | str | baselines/IQL/logs/ | TensorBoard log directory |

IQL Training Examples

# Basic training (small grid, few episodes)
python -m baselines.IQL.train_iql --episodes 5000 --size 5

# Full training with custom parameters
python -m baselines.IQL.train_iql \
    --episodes 20000 \
    --size 8 \
    --alpha 0.1 \
    --gamma 0.99 \
    --eps-start 1.0 \
    --eps-end 0.01 \
    --seed 42

# Quick test run
python -m baselines.IQL.train_iql --episodes 1000 --size 3 --seed 123
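
When several seeds need to be trained, the same command can be scripted. The sketch below only uses flags listed in the IQL parameter table; the helper names and the per-seed file naming scheme are hypothetical, and it defaults to a dry run that prints the commands instead of launching them.

```python
import subprocess
import sys


def build_iql_command(episodes, size, seed):
    # Only flags documented in the IQL training parameter table are used.
    # The per-seed output file name is a hypothetical naming scheme.
    return [
        sys.executable, "-m", "baselines.IQL.train_iql",
        "--episodes", str(episodes),
        "--size", str(size),
        "--seed", str(seed),
        "--save-path", f"baselines/IQL/iql_qs_seed{seed}.npz",
    ]


def run_sweep(seeds, episodes=5000, size=5, dry_run=True):
    commands = [build_iql_command(episodes, size, s) for s in seeds]
    for cmd in commands:
        if dry_run:
            # Show what would be executed without starting a training run.
            print(" ".join(cmd))
        else:
            # Must be launched with src as the working directory.
            subprocess.run(cmd, check=True)
    return commands


commands = run_sweep(seeds=[0, 1, 2])
```

Pass `dry_run=False` from inside the src directory to actually start the runs one after another.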

IQL Testing

After training, test the learned policy:

python -m baselines.IQL.test_iql [OPTIONS]

IQL Testing Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| --file | str | baselines/IQL/iql_qs.npz | Path to trained Q-tables |
| --size | int | 15 | Grid size (should match training) |
| --episodes | int | 3 | Number of test episodes |
| --pause | float | 0.05 | Pause between steps (seconds) |
| --max-steps | int | 100 | Maximum steps per episode |
| --seed | int | None | Random seed |

IQL Testing Examples

# Basic testing with visualization
python -m baselines.IQL.test_iql --size 5 --episodes 3

# Test specific Q-table file
python -m baselines.IQL.test_iql \
    --file baselines/IQL/iql_qs.npz \
    --size 8 \
    --episodes 5 \
    --pause 0.1

# Slower visualization for observation
python -m baselines.IQL.test_iql --size 5 --episodes 1 --pause 0.5

Central Q-Learning (CQL)

CQL uses a single centralized Q-table indexed by the joint state of all agents. This allows coordinated behavior, but the table size grows exponentially with the number of agents.
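
One common way to index a joint table is to pack the per-agent values into a single integer. The sketch below shows a mixed-radix encoding under the assumption that each agent contributes one cell index in `[0, size * size)`; the function names and the position values are illustrative and are not taken from the repository.

```python
def encode_joint(values, base):
    # Mixed-radix encoding: treat each agent's value as one digit in `base`.
    index = 0
    for v in values:
        if not 0 <= v < base:
            raise ValueError(f"value {v} out of range for base {base}")
        index = index * base + v
    return index


def decode_joint(index, base, n):
    # Inverse of encode_joint: peel off digits from least significant to most.
    values = []
    for _ in range(n):
        index, v = divmod(index, base)
        values.append(v)
    return values[::-1]


size = 7
n_cells = size * size
positions = [0, 10, 48, 5]  # one cell index per agent (placeholder values)

joint_state = encode_joint(positions, n_cells)
restored = decode_joint(joint_state, n_cells, len(positions))
```

With `n` agents the largest index is `n_cells ** n - 1`, which is where the exponential growth in the number of agents comes from.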

Training

cd src
python -m baselines.CQL.cql_train [OPTIONS]

CQL Training Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| --episodes | int | 40000 | Number of training episodes |
| --size | int | 7 | Grid size (size × size) |
| --alpha | float | 0.25 | Learning rate |
| --gamma | float | 0.95 | Discount factor |
| --eps-start | float | 1.0 | Initial exploration rate |
| --eps-end | float | 0.05 | Final exploration rate |
| --eps-decay | float | 0.99995 | Exploration decay rate |
| --seed | int | 0 | Random seed |
| --predators | int | 2 | Number of predator agents |
| --preys | int | 2 | Number of prey agents |
| --save-path | str | CQL | Directory to save Q-tables |
| --log-dir | str | baselines/CQL/logs/ | TensorBoard log directory |
| --max-table-gb | float | 16.0 | Maximum Q-table memory (GB) |
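
To get a feel for the `--max-table-gb` limit, the sketch below estimates the size of a joint Q-table before training. It assumes the state is the tuple of all agents' cell positions, that each agent has 5 actions, that entries are 8-byte floats, and that one GB means 1024³ bytes. None of these are confirmed by this guide, so the repository's actual table layout and limit check may differ; the function names are hypothetical.

```python
def joint_q_table_bytes(size, n_agents, n_actions=5, bytes_per_entry=8):
    # Assumed layout: one entry per (joint position, joint action) pair.
    n_cells = size * size
    n_states = n_cells ** n_agents
    n_joint_actions = n_actions ** n_agents
    return n_states * n_joint_actions * bytes_per_entry


def fits_budget(size, n_agents, max_table_gb=16.0, **kwargs):
    # Assumes 1 GB = 1024**3 bytes; the training script may count differently.
    return joint_q_table_bytes(size, n_agents, **kwargs) <= max_table_gb * 1024 ** 3


for size in (3, 5, 7):
    for n_agents in (2, 3, 4):
        gib = joint_q_table_bytes(size, n_agents) / 1024 ** 3
        status = "ok" if fits_budget(size, n_agents) else "over budget"
        print(f"size={size} agents={n_agents}: {gib:.4f} GiB ({status})")
```

Under these assumptions, adding one agent multiplies the table by `size * size * n_actions`, so the agent count dominates the memory cost well before the grid size does.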

CQL Training Examples

# Basic training (small grid, few episodes)
python -m baselines.CQL.cql_train --episodes 5000 --size 5

# Full training with custom parameters
python -m baselines.CQL.cql_train \
    --episodes 40000 \
    --size 7 \
    --predators 2 \
    --preys 2 \
    --alpha 0.25 \
    --gamma 0.95 \
    --seed 42

# Fewer agents with a lower Q-table memory limit
python -m baselines.CQL.cql_train --size 5 --predators 1 --preys 1 --max-table-gb 4

CQL Testing

python -m baselines.CQL.test_cql [OPTIONS]

Parameter Guidelines

Choosing Grid Size

| Size | Cells | Recommended For |
|------|-------|-----------------|
| 3-5 | 9-25 | Quick testing, debugging |
| 6-8 | 36-64 | Standard experiments |
| 10+ | 100+ | Large-scale experiments |

Warning

Memory usage: CQL's Q-table size grows rapidly with grid size and exponentially with the number of agents. For large grids, use IQL or reduce the number of agents.

Choosing Learning Rate (Alpha)

| Value | Effect |
|-------|--------|
| 0.01 - 0.05 | Slow, stable learning |
| 0.1 - 0.2 | Balanced (recommended) |
| 0.3+ | Fast but potentially unstable |
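
The speed side of this trade-off can be seen in a toy setting. The sketch below repeatedly applies the update `q += alpha * (target - q)` toward a fixed target and counts the steps needed to get within 5% of it. The fixed target and the helper name are simplifying assumptions for illustration; real Q-learning targets are noisy and move during training.

```python
def steps_to_converge(alpha, target=1.0, tolerance=0.05, max_steps=10_000):
    # Count updates until q is within `tolerance` (relative) of the target.
    q = 0.0
    for step in range(1, max_steps + 1):
        q += alpha * (target - q)
        if abs(target - q) <= tolerance * abs(target):
            return step
    return None  # did not converge within max_steps


for alpha in (0.01, 0.1, 0.3):
    print(f"alpha={alpha}: {steps_to_converge(alpha)} updates")
```

The remaining error after `n` updates is `(1 - alpha) ** n`, so a larger alpha closes the gap faster. With noisy rewards the same larger step also passes more of the noise into the estimate, which is why high values can be unstable.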

Choosing Discount Factor (Gamma)
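
The parameter tables describe gamma as the importance of future rewards. As a rough aid for picking a value, the sketch below computes two standard quantities for a given gamma: the effective horizon `1 / (1 - gamma)` and the number of steps after which a reward's weight `gamma ** k` drops below 1%. The helper names are illustrative, and the values tried are the ones that appear elsewhere in this guide (0.9, 0.95, 0.99); this is not a recommendation from the repository.

```python
def effective_horizon(gamma):
    # The weights gamma**k for k = 0, 1, 2, ... sum to 1 / (1 - gamma)
    # when 0 <= gamma < 1.
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    return 1.0 / (1.0 - gamma)


def steps_until_weight_below(gamma, threshold=0.01):
    # Smallest k such that gamma**k < threshold.
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    weight, k = 1.0, 0
    while weight >= threshold:
        weight *= gamma
        k += 1
    return k


for gamma in (0.9, 0.95, 0.99):
    print(
        f"gamma={gamma}: horizon ~{effective_horizon(gamma):.0f} steps, "
        f"weight < 1% after {steps_until_weight_below(gamma)} steps"
    )
```

Comparing these numbers with the typical episode length (for example the `--max-steps` value used in testing) gives a sense of whether rewards at the end of an episode still influence early decisions.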

Choosing Exploration Parameters

| Parameter | Recommended | Effect |
|-----------|-------------|--------|
| eps-start | 1.0 | Full exploration initially |
| eps-end | 0.01 - 0.1 | Minimal exploration at end |
| eps-decay | 0.999 - 0.9999 | Slower decay = more exploration |
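
How these three values interact is easier to see with a concrete schedule. The sketch below assumes epsilon is multiplied by `eps-decay` once per episode and clipped at `eps-end`; the exact schedule used by the training scripts is not stated in this guide, so treat the formula and the helper names as assumptions.

```python
import math


def epsilon_at(episode, eps_start=1.0, eps_end=0.05, eps_decay=0.9995):
    # Assumed schedule: multiplicative decay per episode, floored at eps_end.
    return max(eps_end, eps_start * eps_decay ** episode)


def episodes_to_floor(eps_start=1.0, eps_end=0.05, eps_decay=0.9995):
    # First episode index at which the assumed schedule reaches eps_end.
    if eps_start <= eps_end:
        return 0
    return math.ceil(math.log(eps_end / eps_start) / math.log(eps_decay))


# Compare the two default decay rates from the parameter tables.
for decay in (0.9995, 0.99995):
    n = episodes_to_floor(eps_decay=decay)
    print(f"eps-decay={decay}: floor reached after about {n} episodes")
```

If the number of episodes needed to reach the floor exceeds `--episodes`, the agent never settles at `eps-end` under this schedule, so it is worth checking the two values together.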

Saved Files

IQL Output Files

| File | Description |
|------|-------------|
| baselines/IQL/iql_qs.npz | Trained Q-tables |
| baselines/IQL/logs/ | TensorBoard logs |

CQL Output Files

| File | Description |
|------|-------------|
| baselines/CQL/*.npz | Trained Q-tables |
| baselines/CQL/logs/ | TensorBoard logs |