No cloud API bills. No vendor lock-in. Run, tune, and benchmark local LLMs on your own hardware. Choose the plan that fits your needs.
Pricing FAQ
Do I need a GPU?
No. The tool runs on CPU alone (Intel or AMD). If you have a GPU (NVIDIA CUDA, AMD ROCm, Apple Metal), we auto-detect it and optimize for it. Benchmarks typically run 10-50x faster on a GPU than on a CPU, depending on model size.
What does it cost to run benchmarks?
The Hobby plan is free forever. Pro ($19/month) includes unlimited benchmarks on your own hardware. No per-benchmark fees, no cloud inference charges, no surprise costs. You own the compute.
Can I switch plans or cancel anytime?
Yes. Monthly billing with no lock-in. Downgrade, upgrade, or cancel anytime. All your benchmark results, profiles, and exports are yours to keep as CSV/JSON.
What models do you support?
Any GGUF, PyTorch, or ONNX model. We've pre-configured Llama, Mistral, Qwen, Phi, DeepSeek, TinyLlama, and 100+ others. Add custom models from a file or URL.
Is my data private?
Completely. Benchmarks run locally on your machine. Nothing leaves your disk unless you explicitly export results. Zero telemetry, zero tracking, zero phone-home calls. What happens on your hardware stays on your hardware.
Can I use Pro on multiple machines?
Yes. One Pro subscription covers all your personal machines. Log in with the same account on each device. The Team plan is for shared workspaces with permission management.
Do you offer discounts for annual billing?
Yes. The Team plan is 15% off with annual prepayment. Contact sales for enterprise pricing or special agreements.
How does this compare to cloud LLM APIs?
Cloud APIs charge per token or per request, which gets expensive at scale. Local inference is a one-time hardware cost up front. After that, there are no per-token fees; your main running cost is electricity. For research, iteration, or high-volume production workflows, running locally can save thousands of dollars a month.
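To see how the comparison plays out for your own workload, here is a rough break-even sketch. The function names and every number in the example (token volume, cloud price, hardware cost, electricity) are illustrative assumptions, not measured figures or real vendor prices. Only the $19/month Pro price comes from this page.

```python
import math


def cloud_cost_per_month(tokens_per_month, price_per_million_tokens):
    """Recurring cloud spend for a given monthly token volume."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def break_even_months(hardware_cost, monthly_cloud_cost, monthly_local_cost=0.0):
    """Whole months until a one-time hardware purchase beats cloud spend.

    Returns None when local running costs meet or exceed the cloud bill,
    meaning the hardware never pays for itself under these inputs.
    """
    monthly_saving = monthly_cloud_cost - monthly_local_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical workload: 500M tokens/month at an assumed $10 per million tokens.
cloud = cloud_cost_per_month(500_000_000, 10)

# Assumed local running cost: Pro plan ($19) plus a guessed $50 of electricity.
local = 19 + 50

# Assumed one-time hardware cost of $3,000.
months = break_even_months(3000, cloud, local)
print(f"Cloud: ${cloud:,.0f}/month, local: ${local}/month, break-even: {months} month(s)")
```

Plug in your own volume and prices. At low volume the saving can be zero or negative, in which case the function returns None rather than a misleading number.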