144 changes: 79 additions & 65 deletions tutorial/01-environments.md
@@ -245,8 +245,8 @@ Think of it like this: You don't run your database in the same process as your w
│ │
└─────────────────┬──────────────────────────────────────────┘
HTTP/JSON (Language-Agnostic)
POST /reset, POST /step, GET /state
OpenEnv protocol over WebSocket
(persistent session, language-agnostic)
┌─────────────────▼──────────────────────────────────────────┐
│ DOCKER CONTAINER │
@@ -262,15 +262,17 @@ Think of it like this: You don't run your database in the same process as your w
```

!!! info "Key Insight"
You never see HTTP details - just clean Python methods!
You never see the wire protocol - just clean Python methods!

```python
env.reset() # Under the hood: HTTP POST to /reset
env.step(...) # Under the hood: HTTP POST to /step
env.state() # Under the hood: HTTP GET to /state
await env.reset() # Under the hood: a reset message over WebSocket
await env.step(...) # Under the hood: a step message over WebSocket
await env.state() # Under the hood: a state message over WebSocket
```

The magic? OpenEnv handles all the plumbing. You focus on RL! ✨
`EnvClient` is async. In a notebook you can `await` directly; in a plain script
use `asyncio.run(...)` or the synchronous `.sync()` wrapper. The magic? OpenEnv
handles all the plumbing. You focus on RL! ✨

---

@@ -332,7 +334,7 @@ src/envs/your_env/
│ (Action, Observation, State)
├── 📱 client.py ← What YOU import
│ (HTTPEnvClient implementation)
│ (EnvClient implementation)
└── 🖥️ server/
├── environment.py ← Game/simulation logic
@@ -396,26 +398,32 @@ The client (`envs/openspiel_env/client.py`) inherits from `EnvClient` and declar
```python
class OpenSpielEnv(EnvClient[OpenSpielAction, OpenSpielObservation, OpenSpielState]):
    def _step_payload(self, action: OpenSpielAction) -> dict:
        """Convert typed action to JSON for HTTP."""
        """Convert typed action to JSON for the step message."""
        return {
            "action_id": action.action_id,
            "game_name": action.game_name,
            "game_params": action.game_params,
        }

    def _parse_result(self, payload: dict) -> StepResult:
        """Parse HTTP JSON response into typed observation."""
        """Parse the server response into a typed observation."""
        return StepResult(
            observation=OpenSpielObservation(...),
            reward=payload["reward"],
            done=payload["done"],
        )

    def _parse_state(self, payload: dict) -> OpenSpielState:
        """Parse the server response into a typed state."""
        return OpenSpielState(...)
```

Usage is the same as any OpenEnv environment:
Usage is the same as any OpenEnv environment. `EnvClient` is async, so we use the
`.sync()` wrapper here for a plain-script feel (in a notebook you can `await` the
async client directly instead):

```python
env = OpenSpielEnv(base_url="http://localhost:8000")
env = OpenSpielEnv(base_url="http://localhost:8000").sync()

result = env.reset() # StepResult[OpenSpielObservation]
result = env.step(OpenSpielAction(action_id=2, game_name="catch"))
@@ -433,26 +441,25 @@ from envs.openspiel_env.models import (
    OpenSpielObservation,
    OpenSpielState
)
from dataclasses import fields

print("="*70)
print(" 🎮 OPENSPIEL INTEGRATION - TYPE-SAFE MODELS")
print("="*70)

print("\n📤 OpenSpielAction (what you send):")
print(" " + "─" * 64)
for field in fields(OpenSpielAction):
    print(f" • {field.name:20s} : {field.type}")
for name, field in OpenSpielAction.model_fields.items():
    print(f" • {name:20s} : {field.annotation}")

print("\n📥 OpenSpielObservation (what you receive):")
print(" " + "─" * 64)
for field in fields(OpenSpielObservation):
    print(f" • {field.name:20s} : {field.type}")
for name, field in OpenSpielObservation.model_fields.items():
    print(f" • {name:20s} : {field.annotation}")

print("\n📊 OpenSpielState (episode metadata):")
print(" " + "─" * 64)
for field in fields(OpenSpielState):
    print(f" • {field.name:20s} : {field.type}")
for name, field in OpenSpielState.model_fields.items():
    print(f" • {name:20s} : {field.annotation}")

print("\n" + "="*70)
print("\n💡 Type safety means:")
@@ -507,13 +514,13 @@ print(" ✅ Self-documenting code\n")

### How the Client Works

The client **inherits from HTTPEnvClient** and implements 3 methods:
The client **inherits from `EnvClient`** and implements 3 methods:

1. `_step_payload()` - Convert action → JSON
2. `_parse_result()` - Parse JSON → typed observation
3. `_parse_state()` - Parse JSON → state

That's it! The base class handles all HTTP communication.
That's it! The base class handles all the async WebSocket communication.
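
To make that division of labor concrete, here is a minimal, dependency-free sketch of the three-hook pattern. `SketchClient`, `EchoEnv`, the local `StepResult`, and the `{"type": ..., "data": ...}` message envelope are hypothetical stand-ins, not OpenEnv's real classes or wire format. An in-memory coroutine plays the role of the WebSocket.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass
class StepResult:
    """Hypothetical stand-in for a typed step result."""
    observation: dict
    reward: float
    done: bool


class SketchClient:
    """Toy base class: owns the transport, delegates (de)serialization to hooks."""

    def __init__(self):
        self._steps = 0

    async def _send(self, kind: str, data: dict) -> dict:
        # Stand-in for a WebSocket round trip: encode, "transmit", decode.
        wire = json.dumps({"type": kind, "data": data})
        await asyncio.sleep(0)  # yield to the event loop like real I/O would
        message = json.loads(wire)
        return self._fake_server(message["type"], message["data"])

    def _fake_server(self, kind: str, data: dict) -> dict:
        # Assumed server behavior, for illustration only.
        if kind == "state":
            return {"step_count": self._steps}
        if kind == "reset":
            self._steps = 0
            return {"observation": {"value": 0}, "reward": 0.0, "done": False}
        self._steps += 1
        value = data["action_value"]
        return {"observation": {"value": value}, "reward": float(value), "done": value < 0}

    async def reset(self) -> StepResult:
        return self._parse_result(await self._send("reset", {}))

    async def step(self, action) -> StepResult:
        return self._parse_result(await self._send("step", self._step_payload(action)))

    async def state(self):
        return self._parse_state(await self._send("state", {}))


class EchoEnv(SketchClient):
    """The subclass supplies only the three hooks."""

    def _step_payload(self, action: int) -> dict:
        return {"action_value": action}

    def _parse_result(self, payload: dict) -> StepResult:
        return StepResult(payload["observation"], payload["reward"], payload["done"])

    def _parse_state(self, payload: dict) -> int:
        return payload["step_count"]


async def demo():
    env = EchoEnv()
    await env.reset()
    result = await env.step(3)
    return result, await env.state()


result, steps = asyncio.run(demo())
```

The subclass never touches the transport. Swapping the in-memory `_send` for a real connection would leave `EchoEnv` unchanged, which is the point of the hook design.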

---

@@ -585,28 +592,27 @@ from envs.openspiel_env.models import (
    OpenSpielObservation,
    OpenSpielState
)
from dataclasses import fields

print("🎮 " + "="*64 + " 🎮")
print(" ✅ Importing Real OpenSpiel Environment!")
print("🎮 " + "="*64 + " 🎮\n")

print("📦 What we just imported:")
print(" • OpenSpielEnv - HTTP client for OpenSpiel games")
print(" • OpenSpielEnv - async client (EnvClient) for OpenSpiel games")
print(" • OpenSpielAction - Type-safe actions")
print(" • OpenSpielObservation - Type-safe observations")
print(" • OpenSpielState - Episode metadata\n")

print("📋 OpenSpielObservation fields:")
print(" " + "─" * 60)
for field in fields(OpenSpielObservation):
    print(f" • {field.name:25s} : {field.type}")
for name, field in OpenSpielObservation.model_fields.items():
    print(f" • {name:25s} : {field.annotation}")

print("\n" + "="*70)
print("\n💡 This is REAL OpenEnv code - used in production!")
print(" • Wraps 6 OpenSpiel games (Catch, Tic-Tac-Toe, Poker, etc.)")
print(" • Type-safe actions and observations")
print(" • Works via HTTP (we'll see that next!)\n")
print(" • Talks to the server over WebSocket (we'll see that next!)\n")
```

**Output:**
@@ -616,7 +622,7 @@ print(" • Works via HTTP (we'll see that next!)\n")
🎮 ================================================================ 🎮

📦 What we just imported:
• OpenSpielEnv - HTTP client for OpenSpiel games
• OpenSpielEnv - async client (EnvClient) for OpenSpiel games
• OpenSpielAction - Type-safe actions
• OpenSpielObservation - Type-safe observations
• OpenSpielState - Episode metadata
@@ -637,7 +643,7 @@ print(" • Works via HTTP (we'll see that next!)\n")
💡 This is REAL OpenEnv code - used in production!
• Wraps 6 OpenSpiel games (Catch, Tic-Tac-Toe, Poker, etc.)
• Type-safe actions and observations
• Works via HTTP (we'll see that next!)
• Talks to the server over WebSocket (we'll see that next!)
```

---
@@ -772,7 +778,7 @@ print(" • Work with ANY OpenSpiel game that exposes these!\n")

Let's run **50 episodes** for each policy against **REAL OpenSpiel** and see who wins!

This is production code - every action is an HTTP call to the OpenSpiel server!
This is production code - every action is a WebSocket message to the OpenSpiel server!

```python
def evaluate_policies(env, num_episodes=50):
@@ -821,7 +827,7 @@ def evaluate_policies(env, num_episodes=50):
print(" • Learning (~85%): Improves over time 📈")
print("\n🎓 This is Reinforcement Learning + OpenEnv in action:")
print(" 1. We USED existing OpenSpiel environment (didn't build it)")
print(" 2. Type-safe communication over HTTP")
print(" 2. Type-safe communication over WebSocket")
print(" 3. Same code works for ANY OpenSpiel game")
print(" 4. Production-ready architecture\n")

@@ -842,10 +848,10 @@ In Parts 6-8, we **USED** the existing OpenSpiel Catch environment:
|-------------|--------------|
| **Imported** | OpenSpielEnv client (pre-built) |
| **Started** | OpenSpiel server via uvicorn |
| **Connected** | HTTP client to server |
| **Connected** | Async client over WebSocket |
| **Played** | Real OpenSpiel Catch game |

**🎯 This is production code!** Every action was an HTTP call to a real OpenSpiel environment.
**🎯 This is production code!** Every action was a WebSocket message to a real OpenSpiel environment.

### 🎮 6 Games Available - Same Interface!

@@ -892,7 +898,7 @@ server_process = subprocess.Popen(
)

# Same client works!
client = OpenSpielEnv(base_url="http://localhost:8000")
client = OpenSpielEnv(base_url="http://localhost:8000").sync()
result = client.reset() # Now playing Tic-Tac-Toe!
```

@@ -908,79 +914,87 @@ Want to wrap your own environment in OpenEnv? Here's how:

### Step 1: Define Types (`models.py`)

Actions, observations, and state are **Pydantic models**. The base classes already
provide the common fields — `Observation` has `done`, `reward`, `metadata`, and
`State` has `episode_id`, `step_count` — so you only add what's specific to your env.

```python
from dataclasses import dataclass
from core.env_server import Action, Observation, State
from typing import List
from openenv.core.env_server import Action, Observation, State
from pydantic import Field

@dataclass
class YourAction(Action):
    action_value: int
    # Add your action fields
    action_value: int  # add your action fields

@dataclass
class YourObservation(Observation):
    state_data: List[float]
    done: bool
    reward: float
    # Add your observation fields
    # `done`, `reward`, and `metadata` are inherited from Observation
    state_data: List[float]  # add your observation fields

@dataclass
class YourState(State):
    episode_id: str
    step_count: int
    # Add your state fields
    # `episode_id` and `step_count` are inherited from State
    score: int = 0  # add your state fields
```
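
As a rough illustration of why the subclass no longer redeclares `done` and `reward`, the sketch below uses a stand-in `Observation` base built directly on Pydantic v2. The base class's field defaults here are assumptions for the example, not OpenEnv's actual definitions. Only `model_fields`, `model_dump()`, and `ValidationError` are real Pydantic API.

```python
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field, ValidationError


class Observation(BaseModel):
    """Stand-in base class (assumed shape, not OpenEnv's actual definition)."""
    done: bool = False
    reward: Optional[float] = None
    metadata: Dict[str, Any] = Field(default_factory=dict)


class YourObservation(Observation):
    state_data: List[float]  # the only field this environment adds


obs = YourObservation(state_data=[0.1, 0.2], reward=1.0)

# Inherited and environment-specific fields appear together on the subclass.
field_names = set(YourObservation.model_fields)
payload = obs.model_dump()

# Validation happens at construction time, so a malformed payload fails early.
try:
    YourObservation(state_data="not a list")
    rejected = False
except ValidationError:
    rejected = True
```

Because the common fields live on the base class, every environment's observation serializes with the same `done`/`reward`/`metadata` keys, and only the extra fields differ.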

### Step 2: Implement Environment (`server/environment.py`)

```python
from core.env_server import Environment
from openenv.core.env_server import Environment

class YourEnvironment(Environment):
    def reset(self) -> Observation:
    def reset(self, seed=None, episode_id=None, **kwargs) -> YourObservation:
        # Initialize your game/simulation
        return YourObservation(...)
    def step(self, action: Action) -> Observation:

    def step(self, action, **kwargs) -> YourObservation:
        # Execute action, update state
        return YourObservation(...)

    @property
    def state(self) -> State:
    def state(self) -> YourState:
        return self._state
```
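
For a feel of what goes inside `reset()`, `step()`, and `state`, here is a small self-contained sketch of a number-guessing environment with the same method shapes. `GuessEnvironment`, `GuessObservation`, and `GuessState` are hypothetical, and they are plain dataclasses only to keep the example dependency-free. In a real OpenEnv environment they would subclass the Pydantic base types from Step 1.

```python
import random
import uuid
from dataclasses import dataclass


@dataclass
class GuessObservation:
    """Stand-in observation: a hint plus the common done/reward fields."""
    hint: str
    done: bool = False
    reward: float = 0.0


@dataclass
class GuessState:
    """Stand-in state: episode metadata only."""
    episode_id: str
    step_count: int = 0


class GuessEnvironment:
    """Guess a hidden integer in 1..10 (hypothetical example environment)."""

    def reset(self, seed=None, episode_id=None, **kwargs) -> GuessObservation:
        # A seeded RNG makes episodes reproducible when a seed is supplied.
        self._rng = random.Random(seed)
        self._target = self._rng.randint(1, 10)
        self._state = GuessState(episode_id=episode_id or str(uuid.uuid4()))
        return GuessObservation(hint="start")

    def step(self, action: int, **kwargs) -> GuessObservation:
        self._state.step_count += 1
        if action == self._target:
            return GuessObservation(hint="correct", done=True, reward=1.0)
        hint = "higher" if action < self._target else "lower"
        return GuessObservation(hint=hint)

    @property
    def state(self) -> GuessState:
        return self._state


# Drive one episode with a binary-search policy.
env = GuessEnvironment()
env.reset(seed=0, episode_id="demo")
low, high = 1, 10
while True:
    guess = (low + high) // 2
    obs = env.step(guess)
    if obs.done:
        break
    if obs.hint == "higher":
        low = guess + 1
    else:
        high = guess - 1
```

The environment itself knows nothing about transports or clients. It only maps actions to observations and keeps its own state, which is what lets the server layer wrap it unchanged.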

### Step 3: Create Client (`client.py`)

The client subclasses the **async** `EnvClient` and implements 3 hooks. Callers
`await` `reset()`/`step()`/`state()` (or use the `.sync()` wrapper).

```python
from core.http_env_client import HTTPEnvClient
from core.types import StepResult
from openenv.core.env_client import EnvClient
from openenv.core.client_types import StepResult

class YourEnv(HTTPEnvClient[YourAction, YourObservation]):
class YourEnv(EnvClient[YourAction, YourObservation, YourState]):
    def _step_payload(self, action: YourAction) -> dict:
        """Convert action to JSON"""
        """Convert action to the JSON step message."""
        return {"action_value": action.action_value}

    def _parse_result(self, payload: dict) -> StepResult:
        """Parse JSON to observation"""
        """Parse the server response into a typed observation."""
        return StepResult(
            observation=YourObservation(...),
            reward=payload['reward'],
            done=payload['done']
        )

    def _parse_state(self, payload: dict) -> YourState:
        return YourState(...)
```
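
The relationship between an async client and a synchronous wrapper can be sketched in a few lines. This is one possible way to build such a wrapper, not OpenEnv's actual `.sync()` implementation. `AsyncCounter` and `SyncWrapper` are hypothetical names used only for illustration.

```python
import asyncio
import inspect


class AsyncCounter:
    """Hypothetical async client standing in for an EnvClient subclass."""

    def __init__(self):
        self._count = 0

    async def reset(self) -> int:
        await asyncio.sleep(0)  # placeholder for network I/O
        self._count = 0
        return self._count

    async def step(self, amount: int) -> int:
        await asyncio.sleep(0)  # placeholder for network I/O
        self._count += amount
        return self._count


class SyncWrapper:
    """Runs each coroutine method to completion on a private event loop."""

    def __init__(self, async_client):
        self._client = async_client
        self._loop = asyncio.new_event_loop()

    def __getattr__(self, name):
        # Only reached for names not defined on the wrapper itself.
        attr = getattr(self._client, name)
        if not inspect.iscoroutinefunction(attr):
            return attr

        def blocking(*args, **kwargs):
            return self._loop.run_until_complete(attr(*args, **kwargs))

        return blocking

    def close(self):
        self._loop.close()


env = SyncWrapper(AsyncCounter())
env.reset()
first = env.step(2)   # counter is now 2
second = env.step(3)  # counter is now 5
env.close()
```

A wrapper like this suits plain scripts. Inside a notebook, where an event loop is already running on the same thread, blocking on a second loop generally fails, which is why awaiting the async client directly is the natural choice there.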

### Step 4: Create Server (`server/app.py`)

Pass the environment **class** (a factory) plus the action and observation types.
OpenEnv builds the WebSocket + HTTP endpoints for you.

```python
from core.env_server import create_fastapi_app
from openenv.core.env_server import create_app
from ..models import YourAction, YourObservation
from .environment import YourEnvironment

env = YourEnvironment()
app = create_fastapi_app(env)
app = create_app(
    YourEnvironment,
    YourAction,
    YourObservation,
    env_name="your_env",
)

# That's it! OpenEnv creates all endpoints for you.
```
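
One plausible reason to hand the server a class rather than an instance is per-session isolation: each persistent connection can get its own environment. The sketch below illustrates that idea only. `SessionRegistry` and `CounterEnvironment` are hypothetical, and nothing here reflects how `create_app` is actually implemented.

```python
import itertools


class CounterEnvironment:
    """Hypothetical environment with per-episode mutable state."""

    def __init__(self):
        self.step_count = 0

    def step(self) -> int:
        self.step_count += 1
        return self.step_count


class SessionRegistry:
    """Toy stand-in for a server: one environment instance per session."""

    def __init__(self, env_factory):
        self._env_factory = env_factory  # a class or any zero-argument callable
        self._sessions = {}
        self._ids = itertools.count(1)

    def open_session(self) -> int:
        session_id = next(self._ids)
        self._sessions[session_id] = self._env_factory()  # fresh instance
        return session_id

    def step(self, session_id: int) -> int:
        return self._sessions[session_id].step()


registry = SessionRegistry(CounterEnvironment)
alice = registry.open_session()
bob = registry.open_session()

registry.step(alice)
registry.step(alice)
alice_steps = registry.step(alice)  # third step in Alice's session
bob_steps = registry.step(bob)      # first step in Bob's session
```

With a single shared instance, both sessions would advance the same counter. With a factory, each session's state stays independent.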
@@ -1040,7 +1054,7 @@ OpenEnv includes 3 complete examples:

- Client-server separation
- Type-safe contracts
- HTTP communication layer
- WebSocket communication layer

✅ **Production Patterns**

@@ -1063,7 +1077,7 @@ OpenEnv includes 3 complete examples:

- Define type-safe models
- Implement Environment class
- Create HTTPEnvClient
- Subclass EnvClient

✅ **Testing & Debugging**

2 changes: 1 addition & 1 deletion tutorial/02-deployment.md
@@ -345,7 +345,7 @@ my_env/
│ ├── environment.py # Your environment logic
│ └── Dockerfile
├── models.py # Action/Observation types
├── client.py # HTTP client
├── client.py # Client (EnvClient subclass)
├── openenv.yaml # Manifest
└── pyproject.toml
```