Deploy machine learning models at scale. Pay per second of compute. No infrastructure to manage.
import infergrove

# Run a model with one line
model = infergrove.models.get("stability/sdxl")
output = model.predict(
    prompt="a serene mountain lake at sunset",
    width=1024,
    height=1024
)
# output → "https://inference.infergrove.com/out/abc123.png"
Thousands of open-source models ready to run. Browse by category or search for what you need.
Use our Python SDK, Node.js client, or call the HTTP API directly from any language.
# Python SDK
import infergrove

# Initialize the client
client = infergrove.Client(api_token="r8_your_token")

# Run a prediction
output = client.run(
    "stability/sdxl:latest",
    input={
        "prompt": "a cyberpunk cityscape at night, neon lights",
        "negative_prompt": "blurry, low quality",
        "width": 1024,
        "height": 1024,
        "num_inference_steps": 30,
        "guidance_scale": 7.5
    }
)

# Stream results as they arrive
for event in output:
    print(event)
    # → "https://inference.infergrove.com/out/img_001.png"
Join 200,000+ developers running AI models on InferGrove
Three ways to run AI on InferGrove. From quick experiments to production deployments.
Run open-source models with a single API call. No setup, no GPUs to manage. Choose from thousands of community models.
import infergrove

output = infergrove.run(
    "stability/sdxl",
    input={
        "prompt": "an astronaut riding a horse",
        "num_outputs": 4,
        "guidance_scale": 7.5,
        "num_inference_steps": 30
    }
)

# Returns a list of image URLs
for url in output:
    print(url)
Train models on your own data. Create custom versions optimized for your specific use case and brand.
import infergrove

training = infergrove.trainings.create(
    model="stability/sdxl",
    input={
        "input_images": "https://my-data.zip",
        "token_string": "TOK",
        "max_train_steps": 1000,
        "learning_rate": 1e-6
    },
    destination="yourname/custom-sdxl"
)

# Monitor training progress
training.reload()
print(training.status)  # "processing"
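Once the status reaches "succeeded", the fine-tuned model can be run from its destination just like any other model. A quick sketch, reusing the destination and token string from the training call above (the exact prompt convention for the trained token is an assumption):

# Run the fine-tuned model from its destination after training succeeds
output = infergrove.run(
    "yourname/custom-sdxl",
    input={"prompt": "a product photo of TOK on a marble table"}  # TOK = token_string above
)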
Package any model with Cog and deploy it to InferGrove's auto-scaling infrastructure.
# Define your model
from cog import BasePredictor, Input, Path

class Predictor(BasePredictor):
    def setup(self):
        # Load weights once when the container starts
        self.model = load_model()

    def predict(
        self,
        image: Path = Input(description="Input image"),
    ) -> Path:
        return self.model(image)

# Deploy: $ infergrove push my-model
Get real-time updates as predictions run. Perfect for LLMs, video generation, and long-running tasks.
import infergrove

# Stream tokens from an LLM
for token in infergrove.stream(
    "meta/llama-3.1-70b",
    input={"prompt": "Explain quantum computing"}
):
    print(token, end="")

# Or use webhooks for async predictions
prediction = infergrove.predictions.create(
    model="stability/sdxl",
    input={"prompt": "..."},
    webhook="https://your-app.com/webhook"
)
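On the receiving side, the webhook is a plain HTTP POST to your endpoint. A minimal handler sketch using Flask; the payload fields shown are assumptions based on typical prediction webhooks, not a documented schema:

from flask import Flask, request

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def handle_prediction():
    payload = request.get_json()
    # "status" and "output" field names are assumed for illustration
    if payload.get("status") == "succeeded":
        save_results(payload.get("output"))  # your own handler, hypothetical
    return "", 200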
Built by developers, for developers. Every API decision optimized for simplicity and power.
Type-safe client with async support, streaming, and automatic retries.
pip install infergrove
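As a rough sketch of what async usage might look like (the async_run method name here is an assumption for illustration, not a documented API):

import asyncio
import infergrove

async def main():
    client = infergrove.Client(api_token="r8_your_token")
    # Hypothetical async entry point; name and signature are assumptions
    output = await client.async_run(
        "stability/sdxl",
        input={"prompt": "a watercolor fox"}
    )
    print(output)

asyncio.run(main())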
Full TypeScript support with ESM and CommonJS compatibility.
npm install infergrove
Simple HTTP endpoints. Works from any language or platform.
api.infergrove.com/v1
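Calling the HTTP API directly from Python might look like the sketch below; the /predictions path, payload shape, and response fields are assumptions modeled on the SDK examples above, so check the API reference for the exact schema:

import requests

resp = requests.post(
    "https://api.infergrove.com/v1/predictions",  # endpoint path assumed
    headers={"Authorization": "Bearer r8_your_token"},
    json={
        "model": "stability/sdxl",
        "input": {"prompt": "a lighthouse in a storm"}
    },
)
resp.raise_for_status()
print(resp.json())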
From zero to your first AI prediction in three simple steps.
$ pip install infergrove
$ export INFERGROVE_API_TOKEN=r8_your_token_here
import infergrove
output = infergrove.run("stability/sdxl", input={"prompt": "hello world"})
print(output)
No upfront costs. No minimum commitments. Scale to zero when you're not running predictions.
| GPU | VRAM | Price / second | Price / hour | Best for |
|---|---|---|---|---|
| NVIDIA T4 | 16 GB | $0.000225 | $0.81 | Lightweight inference, testing |
| NVIDIA L4 | 24 GB | $0.000350 | $1.26 | Efficient inference, small models |
| NVIDIA L40S | 48 GB | $0.000725 | $2.61 | Image generation, medium models |
| NVIDIA A40 | 48 GB | $0.000575 | $2.07 | Balanced performance, training |
| NVIDIA A100 | 80 GB | $0.001150 | $4.14 | Large models, fine-tuning |
| NVIDIA H100 | 80 GB | $0.001850 | $6.66 | LLMs, video generation |
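For example, a 30-second image generation on an L40S costs 30 × $0.000725 ≈ $0.02, and ten minutes of fine-tuning on an A100 comes to 600 × $0.001150 = $0.69; models that are scaled to zero cost nothing.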
Get started with limited free predictions every day. No credit card required.
Spend over $1,000/month? Contact us for custom pricing and committed use discounts.
Dedicated clusters, custom SLAs, and priority support for large-scale deployments.
Your models scale up automatically when traffic spikes and scale back to zero when idle. You only pay for what you use.
From open-source models to custom fine-tunes, go from code to production in minutes.
Teams of all sizes trust InferGrove to power their AI features.
"InferGrove cut our inference costs by 60% and eliminated the need for a dedicated ML ops team. We went from managing GPU clusters to a single API call."
"The auto-scaling is incredible. We handle 10x traffic spikes during product launches without any manual intervention. It just works."
"We fine-tuned SDXL on our brand assets in 20 minutes. Now our design team generates on-brand visuals instantly. Game changer."
From creative tools to production pipelines, developers are building incredible things with InferGrove.
Whether you're building a startup or scaling enterprise AI, InferGrove adapts to your needs.
Build image editors, design assistants, and content creation platforms powered by state-of-the-art generative models.
Deploy medical imaging models, clinical NLP, and diagnostic assistants with HIPAA-compliant infrastructure.
Product image generation, virtual try-on, personalized recommendations, and automated product descriptions.
Generate game assets, NPC dialogue, procedural content, and real-time voice synthesis for immersive experiences.
Add AI features to iOS and Android apps without bundling large models. Low-latency API calls from anywhere.
Run experiments at scale, iterate on model architectures, and share reproducible results with the community.
Deploy AI at scale with enterprise-grade security, compliance, and support.
InferGrove is built on open-source principles. Our model packaging tool, Cog, is fully open source and used by thousands of developers worldwide.
Package your model as a standard Docker container with a simple configuration file. No vendor lock-in — your models run anywhere.
# cog.yaml
build:
  python_version: "3.11"
  python_packages:
    - "torch==2.1.0"
    - "transformers==4.36.0"
predict: "predict.py:Predictor"
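Before pushing, the container can be built and exercised locally with Cog's CLI, for example against the image predictor defined above (photo.jpg is a placeholder input file):
$ cog build
$ cog predict -i image=@photo.jpg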
Get started in minutes. No credit card required for the free tier.