Pay only for what your agents use. Scale from hobby projects to production deployments without surprises.
Once you use up your monthly token allocation, you pay only for overages. There are no setup fees and no minimum commitment. All tiers get the same infrastructure quality and performance.
| Component | Cost | Included in Production Tier |
|---|---|---|
| Input tokens (LLM inference) | $0.10 per 1M tokens | 25M/month (250K included daily) |
| Output tokens (LLM inference) | $0.30 per 1M tokens | Included with input allocation |
| Tool execution (function calls) | $0.02 per 1K calls | Unlimited |
| Memory storage (agent state) | $0.01 per GB/month | 100GB/month included |
| Webhook invocations | Free | Unlimited |
| Bring-your-own custom model (BYOM) | Free | Unlimited (BYOM credits) |
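
As a rough illustration of how the rates in the table combine, the sketch below prices one month of metered usage at list rates, before any included allocation is subtracted. The function name `metered_cost` and the sample usage figures are hypothetical; only the four rates come from the table.

```python
from decimal import Decimal

# List rates copied from the table above (USD).
INPUT_PER_1M = Decimal("0.10")    # per 1M input tokens
OUTPUT_PER_1M = Decimal("0.30")   # per 1M output tokens
TOOL_PER_1K = Decimal("0.02")     # per 1K tool calls
MEMORY_PER_GB = Decimal("0.01")   # per GB per month


def metered_cost(input_tokens: int, output_tokens: int,
                 tool_calls: int, memory_gb: int) -> Decimal:
    """Price usage at list rates, ignoring any included allocation (assumption)."""
    return (
        INPUT_PER_1M * input_tokens / 1_000_000
        + OUTPUT_PER_1M * output_tokens / 1_000_000
        + TOOL_PER_1K * tool_calls / 1_000
        + MEMORY_PER_GB * memory_gb
    )


# Hypothetical month: 10M input tokens, 2M output tokens,
# 50K tool calls, 20 GB of agent state.
example = metered_cost(10_000_000, 2_000_000, 50_000, 20)
print(f"${example:.2f}")  # prints $2.80
```

`Decimal` is used here so that small per-unit rates add up without floating-point rounding drift.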
See what different agent workloads cost in production. All examples assume your agents run 24/7.
Lightweight conversational agent. 1K requests/day, 500 tokens per request average.
15M tokens/month = included in Production tier
Calls external APIs, refines search queries. 500 requests/day, 2K tokens/request.
30M tokens/month (5M overage at $0.15 per 1M tokens)
20 specialized agents orchestrating complex workflows. 50K requests/month, 4K tokens avg.
200M tokens/month (175M overage at $0.17 per 1M tokens)
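The arithmetic behind these examples can be sketched as follows, using the two per-day workloads (1K requests/day at 500 tokens and 500 requests/day at 2K tokens). The 30-day month, the function names, and the variable names are assumptions made for illustration; the 25M allocation is the Production tier figure from the table.

```python
PRODUCTION_ALLOCATION = 25_000_000  # monthly tokens included in the Production tier
DAYS_PER_MONTH = 30                 # assumption: a 30-day billing month


def monthly_tokens(requests_per_day: int, tokens_per_request: int) -> int:
    """Estimate monthly token volume for a steady 24/7 workload."""
    return requests_per_day * tokens_per_request * DAYS_PER_MONTH


def overage_tokens(total_tokens: int,
                   allocation: int = PRODUCTION_ALLOCATION) -> int:
    """Tokens beyond the included allocation (never negative)."""
    return max(0, total_tokens - allocation)


conversational = monthly_tokens(1_000, 500)   # 15,000,000 tokens
api_calling = monthly_tokens(500, 2_000)      # 30,000,000 tokens

print(overage_tokens(conversational))  # 0 (within the 25M allocation)
print(overage_tokens(api_calling))     # 5000000
```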
All tiers get access to the same core infrastructure, APIs, and monitoring tools. The difference is support level, deployment options, and included token allocation.
No. Failed requests (errors, timeouts, invalid tool calls) do not consume tokens and are not billed. You only pay for successful inference and tool execution.
Yes. Set a spend limit in your account settings and we will soft-stop new requests once you reach it. Existing in-flight requests complete normally. Email us to raise the cap.
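If you want to mirror this behavior on your side, a minimal client-side guard might look like the sketch below. `SpendGuard` and its methods are hypothetical and are not part of any platform API; the sketch only illustrates the soft-stop idea, where new requests are refused at the cap while in-flight requests are still allowed to finish and be recorded.

```python
class SpendGuard:
    """Hypothetical client-side mirror of an account spend limit."""

    def __init__(self, limit_usd: float) -> None:
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def allow_new_request(self) -> bool:
        # Soft stop: refuse new work once spend has reached the cap.
        return self.spent_usd < self.limit_usd

    def record(self, cost_usd: float) -> None:
        # In-flight requests still complete, so recorded spend
        # can end slightly above the cap.
        self.spent_usd += cost_usd


guard = SpendGuard(limit_usd=1.00)
guard.record(0.60)
print(guard.allow_new_request())  # True
guard.record(0.50)                # an in-flight request finishes
print(guard.allow_new_request())  # False
```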
Production tier: 15% off if you commit annually. Enterprise: custom discounts based on volume. Contact sales to discuss multi-year options.
Overage tokens are billed at the per-1M-token rate at the end of the month. We send daily warnings when you cross 80% of your allocation so you can plan ahead.
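A small sketch of the two rules above: the 80% warning threshold and per-1M-token overage billing. The function names are hypothetical, treating exactly 80% as having crossed the threshold is an assumption, and the $0.15 rate in the usage lines is taken from the 30M tokens/month example.

```python
from decimal import Decimal


def should_warn(used_tokens: int, allocation: int) -> bool:
    """True once usage reaches 80% of the allocation (inclusive, by assumption)."""
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    return used_tokens * 100 >= allocation * 80


def overage_cost(overage_tokens: int, rate_per_1m: Decimal) -> Decimal:
    """Month-end charge for tokens beyond the allocation."""
    return rate_per_1m * overage_tokens / 1_000_000


allocation = 25_000_000  # Production tier monthly allocation
print(should_warn(19_999_999, allocation))           # False
print(should_warn(20_000_000, allocation))           # True
print(overage_cost(5_000_000, Decimal("0.15")))      # 0.75
```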
Not currently. We focus on inference for pre-trained models. If you need custom model support, contact Enterprise sales.
Production tier runs in US-East and EU-West by default. Enterprise customers can deploy in custom regions, on-premises, or hybrid setups.
Sign up in under 2 minutes and get a free API key. No credit card required for Starter tier. Production and Enterprise tiers activate after adding billing.
Questions about pricing? Email our sales team or join our community Slack for peer advice.