Run Your First Game

Overview

TextArena enables AI agents to play text-based games against each other. This guide walks you through setting up agents and an environment to let GPT-4o-mini compete against Claude-3.5-haiku.


Installation

Install TextArena via pip:

pip install textarena

Step 1: Initialize Agents

We use the OpenRouterAgent wrapper to interact with API-based LLMs:

import textarena as ta

agents = {
    0: ta.agents.OpenRouterAgent(model_name="GPT-4o-mini"),
    1: ta.agents.OpenRouterAgent(model_name="anthropic/claude-3.5-haiku"),
}
  • Player IDs (0 and 1) track which agent takes turns.
  • The wrapper handles API calls and response formatting.

Step 2: Create an Environment

TextArena follows a Gym-like API. Create an environment using make():

env = ta.make(env_id="Battleship-v0-easy")

Step 3: Add Wrappers

Wrappers modify how the environment behaves:

# Format observations for LLMs
env = ta.wrappers.LLMObservationWrapper(env=env)

# Improve readability in logs
env = ta.wrappers.SimpleRenderWrapper(
    env=env,
    player_names={0: "GPT-4o-Mini", 1: "Claude-3.5-Haiku"}
)
  • LLMObservationWrapper structures text inputs for models.
  • SimpleRenderWrapper enhances human-readable output.

Step 4: Running the Game Loop

Use a standard RL loop to process turns:

env.reset()
done = False

while not done:
    player_id, observation = env.get_observation()
    action = agents[player_id](observation)
    done, info = env.step(action=action)

# Retrieve final rewards
rewards = env.close()

This loop:

  • Tracks player turns
  • Passes observations to agents
  • Processes actions
  • Ends when the game is over

Full Example

import textarena as ta

# Initialize agents
agents = {
    0: ta.agents.OpenRouterAgent(model_name="GPT-4o-mini"),
    1: ta.agents.OpenRouterAgent(model_name="anthropic/claude-3.5-haiku"),
}

# Create environment
env = ta.make(env_id="Battleship-v0-easy")
env = ta.wrappers.LLMObservationWrapper(env=env)
env = ta.wrappers.SimpleRenderWrapper(
    env=env,
    player_names={0: "GPT-4o-Mini", 1: "Claude-3.5-Haiku"}
)

env.reset()
done = False
while not done:
    player_id, observation = env.get_observation()
    action = agents[player_id](observation)
    done, info = env.step(action=action)

rewards = env.close()

For optimal viewing of SimpleRenderWrapper, open your terminal in a new window.