Pay only for what you use. No hidden fees, no minimum commitments. Start free and scale to billions of tokens.
Free: for experimentation and prototyping
Pro: for individual developers and startups
Team: for growing teams and companies
Enterprise: for large-scale production deployments
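Prices are shown per 1 million tokens unless noted; image models are priced per image. Output tokens typically cost 3-4x as much as input tokens, and every text model listed below is priced at 3x.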
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Throughput |
|---|---|---|---|---|
| Llama 4 Maverick 400B | $0.80 | $2.40 | 1M tokens | 120 tok/s |
| Llama 4 Scout 109B | $0.20 | $0.60 | 512K tokens | 340 tok/s |
| DeepSeek V3 685B | $1.00 | $3.00 | 128K tokens | 95 tok/s |
| Mixtral 8x22B v0.3 | $0.12 | $0.36 | 64K tokens | 480 tok/s |
| Qwen 3 72B Instruct | $0.15 | $0.45 | 128K tokens | 410 tok/s |
| Gemma 3 27B | $0.07 | $0.21 | 32K tokens | 620 tok/s |
| Llama 4 Scout 17B | $0.04 | $0.12 | 128K tokens | 890 tok/s |
| Stable Diffusion 4 | $0.003 per image (1024x1024) | — | — | 1.2 s/image |
| FLUX.2 Pro | $0.005 per image (1024x1024) | — | — | 0.8 s/image |
| BGE-M3 (Embeddings) | $0.01 | — | 8K tokens | 10K tok/s |
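
As a rough illustration of how per-token pricing turns into a bill, the sketch below estimates the cost of a workload from the per-1M-token rates in the table above. The function name `estimate_cost` and the `PRICES` dictionary are hypothetical helpers written for this example, not part of any SDK, and only two models are copied in for brevity.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the pricing table above.
# Only two rows are included here; add others as needed.
PRICES = {
    "Llama 4 Maverick 400B": (Decimal("0.80"), Decimal("2.40")),
    "Gemma 3 27B": (Decimal("0.07"), Decimal("0.21")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for the given token counts."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens on Llama 4 Maverick 400B.
    print(estimate_cost("Llama 4 Maverick 400B", 2_000_000, 500_000))
```

For that example workload the input side costs $1.60 and the output side $1.20, for a total of $2.80. `Decimal` is used instead of floats so that small per-token amounts do not accumulate rounding error.
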
For workloads that require guaranteed capacity and isolation. Billed hourly with monthly commitment discounts.
| Configuration | GPUs | Memory | On-Demand ($/hr) | Reserved 1yr ($/hr) |
|---|---|---|---|---|
| Small | 8x H100 | 640 GB | $24.00 | $16.80 (30% off) |
| Medium | 32x H100 | 2.5 TB | $92.00 | $64.40 (30% off) |
| Large | 64x H100 | 5 TB | $180.00 | $126.00 (30% off) |
| XL | 128x H100 | 10 TB | $350.00 | $245.00 (30% off) |
| H200 Small | 8x H200 | 1.1 TB | $36.00 | $25.20 (30% off) |
| H200 Large | 64x H200 | 9 TB | $272.00 | $190.40 (30% off) |
All dedicated clusters include: managed infrastructure, automatic failover, 24/7 monitoring, and dedicated support. Custom configurations available for Enterprise customers.
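
To compare on-demand and reserved pricing over a longer period, the sketch below converts the hourly rates from the table above into a monthly figure. It assumes a cluster running around the clock for an average month of 730 hours (8,760 hours / 12); that figure, the `CLUSTERS` dictionary, and the function names are assumptions made for this example. Actual billing periods and the terms of the 1-year commitment may differ.

```python
from decimal import Decimal

HOURS_PER_MONTH = 730  # assumption: average month, running 24/7

# USD per hour as (on-demand, reserved 1yr), copied from the table above.
CLUSTERS = {
    "Small": (Decimal("24.00"), Decimal("16.80")),
    "Medium": (Decimal("92.00"), Decimal("64.40")),
    "H200 Small": (Decimal("36.00"), Decimal("25.20")),
}


def monthly_cost(config: str, reserved: bool = False,
                 hours: int = HOURS_PER_MONTH) -> Decimal:
    """Return the cost in USD of running `config` for `hours` hours."""
    on_demand_rate, reserved_rate = CLUSTERS[config]
    rate = reserved_rate if reserved else on_demand_rate
    return rate * hours


def reserved_discount(config: str) -> Decimal:
    """Return the reserved discount as a fraction of the on-demand rate."""
    on_demand_rate, reserved_rate = CLUSTERS[config]
    return 1 - reserved_rate / on_demand_rate


if __name__ == "__main__":
    on_demand = monthly_cost("Small")
    reserved = monthly_cost("Small", reserved=True)
    print(on_demand, reserved, on_demand - reserved)
```

Under these assumptions a Small cluster comes to $17,520 per month on demand and $12,264 reserved, a difference of $5,256, which matches the 30% discount listed in the table.
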
See exactly what's included in each plan.
| Feature | Free | Pro | Team | Enterprise |
|---|---|---|---|---|
| Models available | 50+ | 200+ | 200+ | 200+ & custom |
| Rate limit | 10 req/s | 1,000 req/s | 5,000 req/s | Unlimited |
| Daily request limit | 1,000 | Unlimited | Unlimited | Unlimited |
| Fine-tuning | — | ✓ | ✓ | ✓ (priority) |
| Batch processing | — | ✓ (50% off) | ✓ (50% off) | ✓ (custom) |
| AI Agents | Basic | ✓ | ✓ | ✓ (advanced) |
| Function calling | ✓ | ✓ | ✓ | ✓ |
| Structured output | ✓ | ✓ | ✓ | ✓ |
| Team management | — | — | ✓ | ✓ |
| SSO / SAML | — | — | ✓ | ✓ |
| VPC peering | — | — | — | ✓ |
| Dedicated clusters | — | — | — | ✓ |
| SLA | — | 99.5% | 99.9% | 99.99% |
| Support | Community | Email (24h) | Priority (4h) | 24/7 dedicated |
| SOC 2 report | — | — | ✓ | ✓ |
| HIPAA BAA | — | — | — | ✓ |
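
One way to read the comparison table above is as a lookup: given a required request rate and a set of required features, find the lowest plan that covers both. The sketch below encodes a subset of the table for that purpose. The feature keys (`"sso"`, `"hipaa_baa"`, and so on) and the `lowest_plan` function are hypothetical names chosen for this example, and "Unlimited" is modeled as `math.inf`.

```python
import math

# Subset of the feature table above, ordered from lowest to highest tier.
# Feature keys are illustrative labels, not official identifiers.
PLANS = [
    {"name": "Free", "req_per_s": 10, "features": set()},
    {"name": "Pro", "req_per_s": 1_000,
     "features": {"fine_tuning", "batch"}},
    {"name": "Team", "req_per_s": 5_000,
     "features": {"fine_tuning", "batch", "team_management", "sso", "soc2"}},
    {"name": "Enterprise", "req_per_s": math.inf,
     "features": {"fine_tuning", "batch", "team_management", "sso", "soc2",
                  "vpc_peering", "dedicated_clusters", "hipaa_baa"}},
]


def lowest_plan(req_per_s, features=()):
    """Return the name of the first plan meeting the rate and feature needs.

    Returns None if a requested feature is not offered by any plan.
    """
    needed = set(features)
    for plan in PLANS:
        if plan["req_per_s"] >= req_per_s and needed <= plan["features"]:
            return plan["name"]
    return None


if __name__ == "__main__":
    print(lowest_plan(5))                    # small prototype
    print(lowest_plan(500, ["sso"]))         # SSO forces a higher tier
    print(lowest_plan(100, ["hipaa_baa"]))   # HIPAA BAA is Enterprise only
```

The rate limit alone does not determine the plan: a workload at 500 req/s fits within Pro's limit, but requiring SSO moves it to Team.
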
Get 1,000 free requests per day. No credit card required. Upgrade when you're ready.