All Models

Popular models

Llama 4 Maverick 400B (New)

Meta's flagship 400B-parameter model with a mixture-of-experts architecture. Best-in-class reasoning and instruction following.

400B params · 1M context · 120 tok/s
Llama 4 Scout 109B (Popular)

Excellent balance of quality and speed: 109B parameters with a 512K-token context window. Ideal for production workloads.

109B params · 512K context · 340 tok/s
DeepSeek V3 685B (MoE)

DeepSeek's largest model, with 685B total parameters. Exceptional at math, coding, and complex reasoning tasks.

685B params · 128K context · 95 tok/s
Mixtral 8x22B v0.3 (Fast)

Mistral's mixture-of-experts model: 8 experts with 22B parameters each. Excellent throughput for general tasks.

176B total · 64K context · 480 tok/s
Qwen 3 72B Instruct (Multilingual)

Alibaba's 72B model with excellent multilingual capabilities. Strong performance across 30+ languages.

72B params · 128K context · 410 tok/s
Gemma 3 27B (Efficient)

Google's efficient 27B model. Punches above its weight class, with excellent instruction following and reasoning.

27B params · 32K context · 620 tok/s
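Hosted inference platforms commonly expose these models behind an OpenAI-compatible chat completions endpoint. The sketch below builds and sends such a request; note that the base URL, the model slug, and the `INFERGROVE_API_KEY` environment variable are all illustrative assumptions, not documented values — check your dashboard for the real ones.

```python
import json
import os
import urllib.request

# Hypothetical base URL and model slug -- assumptions for illustration,
# not documented endpoints. Replace with the values from your dashboard.
BASE_URL = "https://api.infergrove.example/v1"
MODEL = "llama-4-scout-109b"

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(payload: dict) -> dict:
    """POST the payload; requires a valid key in INFERGROVE_API_KEY."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['INFERGROVE_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("Summarize mixture-of-experts in one sentence.")
# send(payload) would perform the actual network call.
```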

Code generation models

DeepSeek Coder V3 (Code)

State-of-the-art code generation. Supports 100+ programming languages with fill-in-the-middle capability.

33B params · 128K context · 520 tok/s
StarCoder 3 15B (Code)

BigCode's latest model trained on The Stack v3. Excellent for code completion and generation tasks.

15B params · 64K context · 780 tok/s
CodeLlama 2 70B (Code)

Meta's code-specialized Llama variant. Excellent at code review, debugging, and complex refactoring tasks.

70B params · 100K context · 310 tok/s
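Fill-in-the-middle prompting works by wrapping the code before and after the hole in sentinel tokens, so the model generates the missing middle. As a sketch, the helper below uses the StarCoder-family sentinel convention; the exact tokens vary by model, so treat them as placeholders and check the target model's tokenizer config.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt in prefix-suffix-middle order.

    Sentinel tokens follow the StarCoder convention; other models use
    different sentinels, so check the tokenizer config before relying
    on these literals.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    prefix="def mean(xs):\n    return ",
    suffix=" / len(xs)\n",
)
# The model is expected to complete the hole, e.g. with "sum(xs)".
```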

Image generation models

Stable Diffusion 4 (Image)

Stability AI's latest diffusion model. Photorealistic image generation with excellent prompt adherence.

1024x1024 · 1.2s/image · $0.003/img
FLUX.2 Pro (Image)

Black Forest Labs' flagship model. Exceptional text rendering and compositional understanding.

Up to 2048x2048 · 0.8s/image · $0.005/img
SDXL Turbo (Fast)

Real-time image generation in a single step. Perfect for interactive applications and rapid prototyping.

512x512 · 0.1s/image · $0.001/img

Audio & speech models

Whisper v4 (Audio)

OpenAI's latest speech recognition model. Supports 100+ languages with near-human accuracy.

100+ languages · Real-time · $0.006/min
Bark v2 (TTS)

High-quality text-to-speech with emotion control. Generate natural-sounding speech in multiple voices.

50+ voices · Real-time · $0.015/min
MusicGen Pro (Music)

Generate music from text descriptions. Create background music, jingles, and soundscapes programmatically.

Up to 30s · Multiple genres · $0.05/gen
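Since the audio models bill per minute of audio, job cost is a simple duration calculation. The helper below uses the per-minute prices listed above; the model slugs in the dict are illustrative keys, not documented identifiers.

```python
# Per-minute prices as listed in the catalog above. The slugs are
# illustrative keys, not documented model identifiers.
PRICE_PER_MIN = {
    "whisper-v4": 0.006,   # speech-to-text
    "bark-v2": 0.015,      # text-to-speech
}

def audio_cost(model: str, seconds: float) -> float:
    """Estimated cost in dollars for a job of the given audio duration."""
    return round(PRICE_PER_MIN[model] * seconds / 60.0, 6)

# Transcribing a 90-minute podcast:
audio_cost("whisper-v4", 90 * 60)   # -> 0.54
```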

Embedding models

BGE-M3 (Embed)

Multilingual, multi-granularity embedding model. Supports dense, sparse, and multi-vector retrieval.

1024 dims · 8K context · 10K tok/s
E5-Mistral-7B (Embed)

Instruction-tuned embedding model based on Mistral 7B. Excellent for semantic search and RAG applications.

4096 dims · 32K context · 5K tok/s
Nomic Embed v2 (Embed)

Lightweight, high-quality embeddings with Matryoshka representation learning. Flexible dimensionality.

768 dims · 8K context · 15K tok/s
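Two operations come up constantly with these models: comparing embeddings by cosine similarity, and (for Matryoshka-trained models like Nomic Embed v2) truncating a vector to fewer dimensions and re-normalizing to trade accuracy for storage. A minimal stdlib-only sketch of both:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def truncate(vec, dims):
    """Matryoshka-style truncation: keep the first `dims` components
    and re-normalize the result to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy 4-dim "embeddings" standing in for real model output:
a = [0.1, 0.3, -0.2, 0.9]
b = [0.2, 0.1, -0.1, 0.8]
full = cosine(a, b)                             # full dimensionality
small = cosine(truncate(a, 2), truncate(b, 2))  # after truncating to 2 dims
```

Truncation only preserves ranking quality when the model was trained with Matryoshka representation learning; for fixed-dimension models like E5-Mistral-7B, use the full vector.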

Optimized for maximum throughput

Every model on InferGrove is optimized with custom CUDA kernels, quantization, and speculative decoding for best-in-class performance.
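Speculative decoding, one of the optimizations mentioned above, has a cheap draft model propose several tokens ahead and the large target model verify the whole run at once, falling back to the target's own token at the first disagreement. The toy sketch below illustrates only the accept/reject loop: both "models" are stand-in functions over integer token lists, not real networks.

```python
# Toy sketch of the speculative-decoding accept/reject loop. The two
# "models" are stand-in functions over integer tokens, not real networks.

def draft_model(context):
    # Stand-in draft: always proposes the next integer.
    return context[-1] + 1

def target_model(context):
    # Stand-in target: agrees with the draft except on multiples of 4.
    nxt = context[-1] + 1
    return nxt if nxt % 4 != 0 else nxt + 1

def speculative_step(context, k=4):
    """Propose k draft tokens, keep the prefix the target agrees with,
    then append one corrected token from the target model."""
    proposed, ctx = [], list(context)
    for _ in range(k):
        tok = draft_model(ctx)
        proposed.append(tok)
        ctx.append(tok)
    accepted, ctx = [], list(context)
    for tok in proposed:
        if target_model(ctx) == tok:   # target verifies the draft token
            accepted.append(tok)
            ctx.append(tok)
        else:
            break                      # first disagreement: stop accepting
    accepted.append(target_model(ctx)) # one guaranteed target token per step
    return context + accepted

speculative_step([1])  # -> [1, 2, 3, 5]: two drafts accepted, one corrected
```

The output is identical to decoding with the target model alone; the speedup comes from verifying a whole draft run in one target-model pass instead of one pass per token.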

Tokens per Second — Output Generation

Llama 4 Scout 17B: 890 tok/s
Gemma 3 27B: 620 tok/s
Mixtral 8x22B: 480 tok/s
Qwen 3 72B: 410 tok/s
Llama 4 Scout 109B: 340 tok/s
Llama 4 Maverick 400B: 120 tok/s
DeepSeek V3 685B: 95 tok/s

Benchmarks measured on InferGrove's production infrastructure with standard prompts. Actual performance may vary based on prompt length and concurrency.
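The published throughput numbers translate directly into a rough lower bound on generation time for a response of a given length. The sketch below uses figures from the chart above; it ignores time-to-first-token and queueing, and the model slugs are illustrative keys rather than documented identifiers.

```python
# Throughput figures from the chart above; slugs are illustrative keys.
THROUGHPUT_TOK_S = {
    "gemma-3-27b": 620,
    "llama-4-scout-109b": 340,
    "llama-4-maverick-400b": 120,
    "deepseek-v3-685b": 95,
}

def generation_seconds(model: str, output_tokens: int) -> float:
    """Lower-bound wall time to generate `output_tokens` tokens,
    ignoring time-to-first-token and queueing."""
    return output_tokens / THROUGHPUT_TOK_S[model]

# A 1,000-token answer:
generation_seconds("gemma-3-27b", 1000)            # ~1.6 s
generation_seconds("llama-4-maverick-400b", 1000)  # ~8.3 s
```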

Can't find your model?

We add new models every week. Request a model or bring your own custom model to deploy on our infrastructure.