Pricing - Software for Agents Infrastructure

Starter

Free (up to 10K requests/mo)

100K tokens/month included
1 concurrent agent
Community support
Basic analytics
HTTP API + WebSocket

Start Building

Production

$99/month

25M tokens/month included
Unlimited agents
Email + Slack support
Advanced analytics & monitoring
Multi-region deployment
99.9% uptime SLA
Team management (up to 5 seats)

Get Started

Enterprise

Custom

Unlimited tokens
Unlimited agents & deployments
24/7 phone support + dedicated engineer
Custom integrations & APIs
On-prem or hybrid deployment
99.99% uptime SLA
Unlimited team seats

Talk to Sales

Usage-Based Pricing

Once you use up your monthly token allocation, you pay only for overages. No setup fees, no minimum commitment. All tiers get the same infrastructure quality and performance.

Component	Cost	Included in Production Tier
Input tokens (LLM inference)	$0.10 per 1M tokens	25M/month (about 833K per day)
Output tokens (LLM inference)	$0.30 per 1M tokens	Included with input allocation
Tool execution (function calls)	$0.02 per 1K calls	Unlimited
Memory storage (agent state)	$0.01 per GB/month	100GB/month included
Webhook invocations	Free	Unlimited
Bring-your-own custom model	Free	Unlimited (BYOM credits)
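To make the rates in the table concrete, here is a minimal sketch that prices a month of usage at list rates. The function name `metered_cost` and the sample usage figures are hypothetical, and the sketch deliberately ignores any included allocation, since how input and output tokens share that allocation is not spelled out here.

```python
from decimal import Decimal

# List rates copied from the table above (USD).
INPUT_PER_1M = Decimal("0.10")    # per 1M input tokens
OUTPUT_PER_1M = Decimal("0.30")   # per 1M output tokens
TOOL_PER_1K = Decimal("0.02")     # per 1K tool calls
STORAGE_PER_GB = Decimal("0.01")  # per GB-month of agent state


def metered_cost(input_tokens, output_tokens, tool_calls, storage_gb):
    """Hypothetical helper: list-price cost per component, before any included allocation."""
    return {
        "input": Decimal(input_tokens) / 1_000_000 * INPUT_PER_1M,
        "output": Decimal(output_tokens) / 1_000_000 * OUTPUT_PER_1M,
        "tools": Decimal(tool_calls) / 1_000 * TOOL_PER_1K,
        "storage": Decimal(storage_gb) * STORAGE_PER_GB,
    }


# Illustrative usage figures, not taken from any real account.
costs = metered_cost(
    input_tokens=10_000_000,
    output_tokens=2_000_000,
    tool_calls=50_000,
    storage_gb=20,
)
total = sum(costs.values())
print(costs, total)
```

Webhook invocations and bring-your-own models are free, so they do not appear in the calculation.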

Estimated Monthly Costs

See what different agent workloads cost in production. All examples assume your agents run 24/7.

Simple Chatbot

Lightweight conversational agent. 1K requests/day, 500 tokens per request average.

~$14/month

15M tokens/month, within the Production tier's 25M allocation

Data Researcher

Calls external APIs, refines search queries. 500 requests/day, 2K tokens/request.

~$89/month

30M tokens/month (+5M overage at $0.15 per 1M tokens)
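The token volumes for the Simple Chatbot and Data Researcher examples can be reproduced from their request rates. The sketch below assumes a 30-day month, which is our assumption rather than a stated billing rule, and the helper name `monthly_tokens` is hypothetical.

```python
DAYS_PER_MONTH = 30            # assumption: estimates use a 30-day month
PRODUCTION_INCLUDED = 25_000_000  # tokens/month included in the Production tier


def monthly_tokens(requests_per_day, tokens_per_request, days=DAYS_PER_MONTH):
    """Hypothetical helper: tokens consumed per month at a steady request rate."""
    return requests_per_day * tokens_per_request * days


chatbot = monthly_tokens(1_000, 500)     # Simple Chatbot workload
researcher = monthly_tokens(500, 2_000)  # Data Researcher workload

# Tokens beyond the included allocation count as overage.
researcher_overage = max(0, researcher - PRODUCTION_INCLUDED)
print(chatbot, researcher, researcher_overage)
```

Swap in your own request rate and average token count to see where your workload lands relative to the included allocation.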

Multi-Agent Enterprise

20 specialized agents orchestrating complex workflows. 50K requests/day, 4K tokens avg.

~$580/month

200M tokens/month (+175M overage at $0.17 per 1M tokens)

What's Included

All tiers get access to the same core infrastructure, APIs, and monitoring tools. The differences are support level, deployment options, and included token allocation.

Do you charge for failed requests?

No. Failed requests (errors, timeouts, invalid tool calls) do not consume tokens and are not billed. You only pay for successful inference and tool execution.

Can I cap my monthly spend?

Yes. Set a spend limit in your account settings and we will soft-stop new requests once you reach it. Existing in-flight requests complete normally. Email us to raise the cap.
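As a rough illustration of the soft-stop behavior described above, the sketch below models a cap that refuses new requests once recorded spend reaches the limit while letting requests already in flight finish and be billed. The `SpendCap` class and its methods are hypothetical and exist only for this example; they are not part of the platform's API.

```python
from decimal import Decimal


class SpendCap:
    """Hypothetical model of a soft stop: no new requests at the cap, in-flight ones finish."""

    def __init__(self, limit_usd):
        self.limit = Decimal(str(limit_usd))
        self.spent = Decimal("0")
        self.in_flight = 0

    def try_start(self):
        # New requests are refused once recorded spend has reached the cap.
        if self.spent >= self.limit:
            return False
        self.in_flight += 1
        return True

    def finish(self, cost_usd):
        # In-flight requests always complete and are billed, even past the cap.
        self.in_flight -= 1
        self.spent += Decimal(str(cost_usd))


cap = SpendCap(limit_usd="1.00")
first = cap.try_start()   # admitted: nothing spent yet
second = cap.try_start()  # admitted: spend is recorded only on completion
cap.finish("0.80")
cap.finish("0.40")        # still completes, taking spend past the cap
third = cap.try_start()   # refused: cap reached
print(first, second, third, cap.spent)
```

Because in-flight requests complete normally, the final bill can land slightly above the cap, as the example shows.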

Do you offer annual or multi-year discounts?

Production tier: 15% off if you commit annually. Enterprise: custom discounts based on volume. Contact sales to discuss multi-year options.
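For reference, here is what the 15% annual discount works out to on the Production tier's $99/month list price. This sketch assumes the discount applies to the base subscription only and not to overages, which is our assumption.

```python
from decimal import Decimal

MONTHLY_PRICE = Decimal("99")      # Production tier list price, USD
ANNUAL_DISCOUNT = Decimal("0.15")  # 15% off with an annual commitment

annual_list = MONTHLY_PRICE * 12
# Assumption: the discount applies to the subscription fee, not to overages.
annual_discounted = annual_list * (1 - ANNUAL_DISCOUNT)
effective_monthly = annual_discounted / 12
print(annual_list, annual_discounted, effective_monthly)
```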

What happens if I exceed my monthly allocation?

Overage tokens are billed at the per-1M-token rate at the end of the month. We send daily warnings when you cross 80% of your allocation so you can plan ahead.
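The warning threshold and overage charge can be sketched as follows. The helper name `usage_status` is hypothetical, the $0.15 rate is borrowed from the Data Researcher example purely for illustration, and treating usage of exactly 80% as crossing the threshold is an assumption.

```python
from decimal import Decimal

WARN_FRACTION = Decimal("0.8")  # warnings start at 80% of the allocation


def usage_status(used_tokens, included_tokens, rate_per_1m_usd):
    """Hypothetical helper: warning flag plus overage tokens and cost for the month."""
    used = Decimal(used_tokens)
    included = Decimal(included_tokens)
    overage_tokens = max(Decimal(0), used - included)
    return {
        # Assumption: reaching exactly 80% counts as crossing the threshold.
        "warn": used >= WARN_FRACTION * included,
        "overage_tokens": int(overage_tokens),
        "overage_usd": overage_tokens / 1_000_000 * Decimal(str(rate_per_1m_usd)),
    }


below = usage_status(19_000_000, 25_000_000, "0.15")   # under the threshold
warned = usage_status(21_000_000, 25_000_000, "0.15")  # warned, no overage yet
over = usage_status(30_000_000, 25_000_000, "0.15")    # 5M tokens of overage
print(below, warned, over)
```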

Do you support custom model training or fine-tuning?

Not currently. We focus on inference for pre-trained models. If you need custom model support, contact Enterprise sales.

Are there any data residency options?

Production tier runs in US-East and EU-West by default. Enterprise customers can deploy in custom regions, on-premises, or hybrid setups.

How to Get Started

Sign up in under 2 minutes and get a free API key. No credit card is required for the Starter tier. The Production and Enterprise tiers activate once you add billing details.

Create a Free Account

Questions about pricing? Email our sales team or join our community Slack for peer advice.