# Grid World Environment
This directory contains the implementation of a simple 5x5 Grid World environment, designed to serve two primary purposes within the OpenEnv ecosystem:
- A basic Reinforcement Learning (RL) testbed: Providing a straightforward, deterministic environment for quick prototyping and testing of RL agents.
- A detailed "How-To" guide for building new OpenEnv environments: Demonstrating the architectural patterns, best practices, and core components required to integrate a custom environment into the OpenEnv framework.
## Environment Overview
The Grid World environment features:
- Grid Size: A 5x5 square grid.
- Agent: Starts at position `(0,0)` (top-left).
- Goal: Fixed at `(4,4)` (bottom-right).
- Actions: `UP`, `DOWN`, `LEFT`, `RIGHT`.
- Dynamics: Deterministic. An action always moves the agent one step in the chosen direction, unless it would move off the grid, in which case the agent stays in its current cell.
- Reward Function (Sparse):
  - `-0.1` for every step taken (a "living cost" or "step penalty").
  - `+1.0` for reaching the goal at `(4,4)`. This also terminates the episode.
- Episode Termination: The episode ends when the agent reaches the goal.
### Example Gameplay
Imagine the agent trying to find the goal:
- Reset: Agent at `(0,0)` → `Obs(x=0, y=0, reward=0.0, done=False)`
- Step DOWN: Agent moves to `(1,0)` → `Obs(x=1, y=0, reward=-0.1, done=False)`
- Step RIGHT: Agent moves to `(1,1)` → `Obs(x=1, y=1, reward=-0.1, done=False)`
- ...
- Step RIGHT (from 4,3): Agent moves to `(4,4)` → `Obs(x=4, y=4, reward=1.0, done=True)`
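
The reward rules make the best achievable return easy to work out. The sketch below is a standalone illustration, not code from this repository; the helper names `min_steps` and `episode_return` are hypothetical. It assumes the reward scheme listed above: `-0.1` for each non-terminal step and `1.0` for the step that reaches the goal.

```python
def min_steps(start, goal):
    """Manhattan distance: the fewest moves between two cells on an open grid."""
    return abs(goal[0] - start[0]) + abs(goal[1] - start[1])


def episode_return(num_steps, step_penalty=-0.1, goal_reward=1.0):
    """Undiscounted return of an episode that reaches the goal on its final step."""
    if num_steps < 1:
        raise ValueError("an episode that reaches the goal needs at least one step")
    # Every step except the last one pays the penalty; the last one pays the goal reward.
    return (num_steps - 1) * step_penalty + goal_reward


shortest = min_steps((0, 0), (4, 4))  # 4 moves down plus 4 moves right
print(shortest, round(episode_return(shortest), 2))  # 8 0.3
```

Any detour adds `-0.1` per extra step, so an agent that has learned the task should converge towards a return of about `0.3`.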
## How to Build an OpenEnv Environment: A Detailed Guide
This section explains the structure and key design choices of the Grid World environment.
### 1. Scaffolding and Configuration
This environment supports multi-mode deployment. It uses pyproject.toml for modern local development (via uv) and a Dockerfile for containerized deployment.
#### Directory Structure
envs/grid_world_env
├── server/
│   ├── __init__.py                 # Package initializer for the server side
│   ├── app.py                      # The FastAPI application entry point
│   ├── Dockerfile                  # Container definition (uses requirements.txt)
│   ├── grid_world_environment.py   # The core environment logic
│   └── requirements.txt            # Dependencies for the Docker build
├── __init__.py                     # Package initializer for the client side
├── client.py                       # Python client for interacting with the env server
├── models.py                       # Pydantic data structures (Action, Observation)
├── openenv.yaml                    # OpenEnv metadata
├── pyproject.toml                  # Project configuration for local dev (uv)
├── uv.lock                         # Exact dependency versions (generated by uv)
├── README.md
└── test_grid_world.sh              # Integration test script (Docker based)
# Core Components Explained
This section dives into the specific code files that power the **Grid World**, explaining how the **OpenEnv** framework connects the data, logic, and server layers.
---
## 1. `models.py`: *The Data Contract*
This file defines the strict "language" used for communication between the **Client (RL Agent)** and the **Server**. It relies on **Pydantic** to enforce type safety.
### Key Components
- **`MoveAction(str, Enum)`**
Defines the allowed vocabulary for movement: `UP`, `DOWN`, `LEFT`, `RIGHT`.
Using an `Enum` prevents *magic string* errors (e.g., sending `"up"` instead of `"UP"`).
- **`GridWorldAction(Action)`**
Wraps the movement enum in a standardized **OpenEnv** action structure.
When the server receives a request, **FastAPI** automatically validates that the incoming JSON payload matches this schema.
- **`GridWorldObservation(Observation)`**
Defines exactly what the agent observes from the environment:
- `x`, `y`: Integer coordinates representing the agent's position
- `reward`: Floating-point value (e.g., `-0.1`, `1.0`)
- `done`: Boolean flag indicating episode termination
> **Note:**
> By inheriting from `pydantic.BaseModel` (via `Observation`), these classes automatically handle JSON serialization and deserialization.
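
To make the contract concrete, here is a minimal standard-library sketch of the same shape. It is an illustration only: the real classes inherit from OpenEnv's `Action` and `Observation` base classes and are validated by Pydantic, whereas this sketch uses plain dataclasses. The field name `move` and the helper `parse_action` are assumptions, not names taken from `models.py`.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class MoveAction(str, Enum):
    """Allowed movement vocabulary; anything else is rejected."""
    UP = "UP"
    DOWN = "DOWN"
    LEFT = "LEFT"
    RIGHT = "RIGHT"


@dataclass(frozen=True)
class GridWorldAction:
    move: MoveAction  # field name is an assumption for this sketch


@dataclass(frozen=True)
class GridWorldObservation:
    x: int
    y: int
    reward: float
    done: bool


def parse_action(payload: str) -> GridWorldAction:
    """Decode a JSON payload, raising ValueError for moves outside the enum."""
    data = json.loads(payload)
    return GridWorldAction(move=MoveAction(data["move"]))


action = parse_action('{"move": "UP"}')
obs = GridWorldObservation(x=0, y=0, reward=0.0, done=False)
print(action.move.value, json.dumps(asdict(obs)))
```

Passing `'{"move": "up"}'` to `parse_action` raises a `ValueError`, which is the kind of magic-string mistake the enum is meant to catch before it reaches the environment logic.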
---
## 2. `server/grid_world_environment.py`: *The Logic*
This file contains the "physics engine" and rules of the environment. It translates abstract actions into concrete state transitions.
### Core Responsibilities
- **Inheritance**
`GridWorldEnvironment` inherits from `openenv.core.env_server.Environment`, providing the standardized interface required by the OpenEnv server.
- **`__init__` Method**
- Sets static configuration:
- Grid size: `5 × 5`
- Goal location: `[4, 4]`
- Initializes the persistent state container.
- **State Persistence (`self._state`)**
- HTTP requests are stateless, so the environment instance must remember the agent's position between calls.
- `self._state` (an instance of `openenv...State`) tracks:
- `step_count`
- `episode_id`
- `agent_x`, `agent_y`
- **`step()` Logic**
- **Input:** Receives a validated `GridWorldAction`
- **Dynamics:** Applies movement rules and clamps coordinates using
`max(0, min(..., grid_size - 1))` to prevent the agent from leaving the grid
- **Feedback:** Computes a sparse reward:
- `1.0` if `(x, y) == goal`
- `-0.1` otherwise
- Returns a `GridWorldObservation`
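
The responsibilities above can be sketched without any OpenEnv dependency. This is a simplified stand-in, not the repository's `GridWorldEnvironment`: the class name `TinyGridWorld`, the `MOVES` table, and the tuple return value are hypothetical, and the coordinate convention (`DOWN` increases `x`, `RIGHT` increases `y`) is inferred from the example gameplay trace.

```python
# Hypothetical move table; the axis convention is inferred from the gameplay example.
MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}


class TinyGridWorld:
    def __init__(self, grid_size=5, goal=(4, 4)):
        # Static configuration, set once.
        self.grid_size = grid_size
        self.goal = goal
        self.reset()

    def reset(self):
        # State that must persist between calls to step().
        self.agent_x, self.agent_y = 0, 0
        self.step_count = 0
        return (self.agent_x, self.agent_y, 0.0, False)

    def step(self, move):
        dx, dy = MOVES[move]
        limit = self.grid_size - 1
        # Clamp each coordinate so the agent cannot leave the grid.
        self.agent_x = max(0, min(self.agent_x + dx, limit))
        self.agent_y = max(0, min(self.agent_y + dy, limit))
        self.step_count += 1
        done = (self.agent_x, self.agent_y) == self.goal
        reward = 1.0 if done else -0.1
        return (self.agent_x, self.agent_y, reward, done)


env = TinyGridWorld()
print(env.step("UP"))    # blocked by the top edge: (0, 0, -0.1, False)
print(env.step("DOWN"))  # (1, 0, -0.1, False)
```

Note that a blocked move still costs `-0.1` and still increments `step_count`, which follows from applying the clamp before computing the reward.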
---
## 3. `server/app.py`: *The API*
This file is the "glue" that turns the environment logic into a running web service.
### Key Elements
- **`create_app` Utility**
Instead of manually defining FastAPI routes, this file uses
`openenv.core.env_server.create_app`.
It:
- Binds the environment logic (`GridWorldEnvironment`)
- Connects the data models (`GridWorldAction`, `GridWorldObservation`)
- Automatically generates standard endpoints:
- `/reset`
- `/step`
- `/state`
- `/health`
- **`main()` Entry Point**
Defines a `main()` function that calls `uvicorn.run`.
This is what enables the `server = "..."` script in `pyproject.toml` to start the server.
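
Once the server is running, the generated `/health` endpoint offers a quick way to confirm it is reachable. The following is a minimal standard-library sketch; the helper names `endpoint_url` and `is_healthy` are hypothetical, and it assumes that `/health` answers a plain `GET` with HTTP 200 on port `8000`, which this README does not state explicitly.

```python
import urllib.error
import urllib.request


def endpoint_url(base_url, name):
    """Join a base URL and an endpoint name with exactly one slash between them."""
    return f"{base_url.rstrip('/')}/{name.lstrip('/')}"


def is_healthy(base_url="http://localhost:8000", timeout=2.0):
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    url = endpoint_url(base_url, "health")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


if __name__ == "__main__":
    print("server healthy:", is_healthy())
```

A check like this is useful in scripts that start the server and need to wait until it accepts requests before sending `/reset` or `/step` calls.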
---
## 4. `server/Dockerfile`: *The Container*
This file defines how the environment is packaged for production or remote deployment.
### Container Setup
- **Base Image**
Builds on `envtorch-base`, ensuring compatible system libraries.
- **Dependencies**
Copies and installs `server/requirements.txt`.
This keeps the Docker image lightweight and focused only on server-side requirements.
- **Execution**
- Exposes port `8000`
- Defines the `CMD` to launch `uvicorn`
The container is ready to accept HTTP requests immediately upon startup.
---
## 5. `pyproject.toml`: *Local Development*
This file enables a modern local development workflow using **uv**.
### Key Sections
- **Project Metadata**
- Package name: `grid_world_env`
- Version information
- **Dependencies**
Lists libraries required for local execution:
- `fastapi`
- `uvicorn`
- `gymnasium`
- `numpy`
- **`[project.scripts]`**
Defines a shortcut command:
```toml
server = "grid_world_env.server.app:main"
```
# Getting Started
You can run the environment using **uv** (fastest for development) or **Docker** (best for deployment).
---
## Option 1: Local Development with `uv` (Recommended)
Since this project is configured with `pyproject.toml`, you can run the server instantly.
### Steps
1. **Navigate to the environment folder and start the server**
   ```bash
   cd envs/grid_world_env
   uv run server
   ```
2. **Visit the live Swagger UI in your browser**
   ```text
   http://localhost:8000/docs
   ```
## Option 2: Docker Integration Test
To build the full container and run the integration test suite (simulating a production deployment):
---
### Steps
1. **Navigate to the root OpenEnv directory**
2. **Run the test script**
   ```bash
   ./envs/grid_world_env/test_grid_world.sh
   ```
   The script:
   - Builds the Docker image
   - Starts the container
   - Runs a series of curl requests to verify functionality
   - Cleans up containers and images after completion
## Conclusion
This Grid World environment serves as the reference implementation for building environments in OpenEnv. By following this pattern, custom environments remain:
- Portable across local and containerized setups
- Strictly typed through Pydantic models
- Deployment-ready for development, testing, and production workflows
---