RL Environments for training & benchmarking AI agents.
We build simulated environments. Your agent practices on them, gets scored, and improves over time.
Why Theta
Built for real-world agent training.
Scores what actually matters
We measure real task completion, not surface metrics. Did the refund go through? Was the ticket resolved correctly? Your agent gets credit for outcomes, not guesses.
Trains on the edge cases
Edge cases break agents in production. Our environments inject failures, timeouts, and messy data your agent will actually face — so it learns to handle them.
Built for your workflow
Not a generic sandbox. We replicate the exact software your agent operates in — same forms, same states, same quirks. Train on what you'll deploy to.
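To make the first two points concrete, here is a minimal sketch of outcome-based scoring combined with injected failures. It is an illustration only, not Theta's implementation: `FakeRefundBackend`, `TransientTimeout`, `outcome_score`, and `retrying_agent` are hypothetical names invented for this example. The score looks only at the final state (is the refund recorded?), and the backend randomly times out so the agent has to cope with it.

```python
import random
from dataclasses import dataclass, field


class TransientTimeout(Exception):
    """Hypothetical error raised when the simulated backend times out."""


@dataclass
class FakeRefundBackend:
    """Hypothetical stand-in for the software an agent operates in."""
    timeout_rate: float          # probability that a call times out
    rng: random.Random           # seeded RNG so runs are reproducible
    refunded: set = field(default_factory=set)

    def issue_refund(self, order_id: str) -> None:
        # Injected failure: the call may time out before any state changes.
        if self.rng.random() < self.timeout_rate:
            raise TransientTimeout(order_id)
        self.refunded.add(order_id)


def outcome_score(backend: FakeRefundBackend, order_id: str) -> float:
    # Score the end state only: 1.0 if the refund is recorded, else 0.0.
    return 1.0 if order_id in backend.refunded else 0.0


def retrying_agent(backend: FakeRefundBackend, order_id: str, max_attempts: int = 5) -> int:
    # Toy policy: retry on timeouts. Returns the number of attempts used.
    for attempt in range(1, max_attempts + 1):
        try:
            backend.issue_refund(order_id)
            return attempt
        except TransientTimeout:
            continue
    return max_attempts
```

An agent that gives up after the first timeout scores 0.0 here no matter how plausible its actions looked; only the recorded refund earns credit.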
SDK
Start training in minutes.
train.py
import thetabench
env = thetabench.make("shopify-admin", task_id="prod-001")
obs, info = env.reset()
# gymnasium-compatible
Works with Stable Baselines, RLlib, and anything that speaks Gymnasium.
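For readers new to the Gymnasium interface, the sketch below shows the episode loop that such an environment plugs into. `ToyTicketEnv` and `run_episode` are hypothetical stand-ins written for this example so it runs without any dependencies; they are not part of thetabench. The only assumption about a real environment is that it follows the standard Gymnasium signatures, where `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)`.

```python
class ToyTicketEnv:
    """Hypothetical stand-in that follows the Gymnasium reset/step signatures."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self, seed=None, options=None):
        # Start a fresh episode with one open ticket.
        self.steps = 0
        return {"open_tickets": 1}, {}

    def step(self, action):
        self.steps += 1
        resolved = action == "resolve"
        obs = {"open_tickets": 0 if resolved else 1}
        reward = 1.0 if resolved else 0.0
        terminated = resolved  # the task is finished
        truncated = not terminated and self.steps >= self.max_steps  # step budget exhausted
        return obs, reward, terminated, truncated, {}


def run_episode(env, policy, seed=0):
    # Generic Gymnasium-style rollout: call step() until the episode ends.
    obs, info = env.reset(seed=seed)
    total = 0.0
    done = False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total
```

If the environment returned by `thetabench.make(...)` follows the same interface, as the page states, it could be passed to `run_episode` in place of `ToyTicketEnv`.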
Ready to train your agent?
Get in touch and we'll help you get started with the right environment for your use case.



