Pricing

No cloud API bills. No vendor lock-in. Run, tune, and benchmark local LLMs on your own hardware. Choose the plan that fits your needs.

Hobby

Free

Forever. No credit card required.

Unlimited local benchmarks
Hardware profiling (CPU/GPU/RAM)
Inference latency & throughput reports
Model comparison (up to 3 simultaneous)
Export results (CSV, JSON)
Community model library

Get Started Free

Pro

$19/month

Billed monthly. Cancel anytime.

Everything in Hobby, plus:
Unlimited model comparisons
Saved hardware profiles (100+)
Tuning recommendations engine
Batch benchmark scheduling
Priority email support
Inference optimization suggestions

Start Free Trial

Team

$99/month

Includes 5 team members. Annual billing available.

Everything in Pro, plus:
Team workspace & permissions
Shared benchmark library
REST API (1M calls/month)
Custom model registry
Dedicated Slack support
Custom GGUF optimization

Contact Sales

Feature Comparison

Feature	Hobby	Pro	Team
Unlimited benchmarks	Yes	Yes	Yes
Model comparisons	3 at once	Unlimited	Unlimited
Hardware profiles saved	Current session	100+	Unlimited
Tuning recommendations	No	Yes	Yes
Team workspace	No	No	Yes (5 users)
API access	No	No	1M calls/month
Support	Docs only	Email (24h)	Slack (2h)

Pricing FAQ

Do I need a GPU?

No. The tool runs on CPU alone (Intel, AMD). If you have a GPU (NVIDIA CUDA, AMD ROCm, Apple Metal), we auto-detect it and optimize for it. GPU benchmarks typically run 10-50x faster than CPU benchmarks, depending on model size.

What does it cost to run benchmarks?

The Hobby plan is free forever and includes unlimited benchmarks on your hardware. Pro ($19/month) adds features, not usage charges. No per-benchmark fees, no cloud inference charges, no surprise costs. You own the compute.

Can I switch plans or cancel anytime?

Yes. Monthly billing with no lock-in. Downgrade, upgrade, or cancel anytime. All your benchmark results, profiles, and exports are yours to keep as CSV/JSON.

What models do you support?

Any GGUF, PyTorch, or ONNX model. We've pre-configured Llama, Mistral, Qwen, Phi, DeepSeek, TinyLlama, and 100+ others. Add custom models by file or URL.

Is my data private?

Completely. Benchmarks run locally on your machine. Nothing leaves your disk unless you explicitly export results. Zero telemetry, zero tracking, zero phone-home calls. What happens on your hardware stays on your hardware.

Can I use Pro on multiple machines?

Yes. One Pro subscription covers all your personal machines. Log in with the same account on each device. The Team plan is for shared workspaces with permission management.

Do you offer discounts for annual billing?

Yes. The Team plan offers 15% off for annual prepayment. Contact sales for enterprise pricing or special agreements.

How does this compare to cloud LLM APIs?

Cloud APIs charge per token or per request, which gets expensive at scale. Local inference is a one-time hardware cost; after that, there are no per-token fees. For research, iteration, or high-volume production workflows, running locally can save thousands of dollars a month.
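
To make that comparison concrete, here is a rough break-even sketch in Python. Every number in it is a hypothetical placeholder, not a quote from any provider: the hardware price, the cloud price per million tokens, the monthly token volume, and the optional monthly running cost (for example electricity or a subscription) are all assumptions you should replace with your own figures.

```python
def break_even_months(hardware_cost, cloud_price_per_million_tokens,
                      tokens_per_month, local_running_cost_per_month=0.0):
    """Months until a one-time hardware purchase is paid back by avoided cloud fees.

    All inputs are illustrative assumptions supplied by the caller.
    Returns None when local running costs meet or exceed the cloud bill,
    because in that case the hardware never pays for itself.
    """
    # What the same token volume would cost per month on a per-token cloud API.
    cloud_monthly = cloud_price_per_million_tokens * tokens_per_month / 1_000_000
    # Net monthly saving from running locally instead.
    monthly_saving = cloud_monthly - local_running_cost_per_month
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


# Hypothetical scenario: a 3000 USD workstation, 2 USD per million tokens,
# 300 million tokens per month, 100 USD per month in running costs.
print(break_even_months(3000, 2.0, 300_000_000, 100))  # prints 6.0
```

At low volumes the result can be many years or None, so the savings claim depends heavily on how many tokens you actually process.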
