Deploy and scale AI inference with sub-50ms latency. Access 200+ open-source models through a single API, or bring your own. Built on custom silicon optimized for transformer workloads.
Trusted by 50,000+ developers at companies like NVIDIA, Stripe, Shopify, and Notion
Powering AI at the world's most innovative companies
From prototyping to production at scale. InferGrove handles the infrastructure so you can focus on building.
Custom CUDA kernels and speculative decoding deliver 2-5x faster inference than alternatives. Sub-50ms time-to-first-token on all models.
Pay only for tokens consumed. Our optimized infrastructure means lower costs per token than running your own GPUs. Batch processing at 50% off.
OpenAI-compatible API for seamless migration. SDKs for Python, TypeScript, Go, Rust, and Java. Comprehensive docs and examples.
12 regions worldwide with intelligent routing. Your requests are served from the nearest cluster for minimal latency. Auto-failover included.
SOC 2 Type II, HIPAA, and GDPR compliant. Zero data retention — your prompts are never stored. VPC peering for dedicated clusters.
Real-time dashboards, request tracing, cost analytics, and alerting. OpenTelemetry export for custom observability stacks.
Watch how InferGrove makes AI inference effortless — deploy models, scale instantly, and monitor everything.
Our infrastructure processes millions of requests per second across 12 global regions.
Six core products that cover the entire AI development lifecycle — from experimentation to production at scale.
No infrastructure to manage. Send a request, get a response. Our serverless platform auto-scales from zero to thousands of GPUs in seconds, so you only pay for what you use. Compatible with OpenAI's API format for seamless migration.
from infergrove import InferGrove

client = InferGrove(api_key="ig-...")

# Run inference on any model
response = client.chat.completions.create(
    model="meta-llama/Llama-4-Maverick-400B",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ],
    max_tokens=512,
    temperature=0.7,
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content, end="")
For production workloads that demand guaranteed capacity and isolation. Deploy models on dedicated NVIDIA H100 or H200 clusters with custom configurations, private networking, and enterprise-grade SLAs.
┌─────────────────────────────────────────────┐
│ Dedicated Cluster: prod-llama-4             │
├─────────────────────────────────────────────┤
│ Status: ● Running                           │
│ GPUs: 64x H100 80GB                         │
│ Model: Llama-4-Maverick-400B                │
│ Throughput: 48,200 tok/s                    │
│ Latency: 23ms p50 / 41ms p99                │
│ Uptime: 99.997% (30d)                       │
│ Region: us-east-1                           │
│                                             │
│ GPU Utilization ████████████░░ 87%          │
│ Memory Usage    █████████████░ 92%          │
│ Request Queue   ██░░░░░░░░░░░░ 12           │
└─────────────────────────────────────────────┘
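A minimal sketch of how a dedicated cluster might be reached through the same SDK, assuming each cluster exposes its own OpenAI-compatible endpoint. The per-cluster base_url shown is an illustrative assumption, not a documented value.

# Illustrative only: the per-cluster base_url is an assumption, not a documented endpoint.
from infergrove import InferGrove

client = InferGrove(
    api_key="ig-...",
    base_url="https://prod-llama-4.clusters.infergrove.ai/v1",  # assumed per-cluster URL
)

response = client.chat.completions.create(
    model="meta-llama/Llama-4-Maverick-400B",
    messages=[{"role": "user", "content": "Summarize this incident report: ..."}],
)
print(response.choices[0].message.content)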
Fine-tune any open-source model on your data with just a few lines of code. Our platform handles distributed training, hyperparameter optimization, and automatic evaluation — deploy your custom model instantly after training.
from infergrove import InferGrove

client = InferGrove(api_key="ig-...")

# Start a fine-tuning job
job = client.fine_tuning.create(
    model="meta-llama/Llama-4-Scout-109B",
    training_file="file-abc123",
    method="lora",
    hyperparameters={
        "learning_rate": 2e-5,
        "epochs": 3,
        "lora_rank": 64,
        "batch_size": 32,
    },
    evaluation={
        "benchmarks": ["mmlu", "humaneval"],
        "eval_steps": 100
    }
)

print(f"Job started: {job.id}")
# Status: training... 67% complete
Access the most comprehensive library of optimized open-source models. From Llama 4 to Mixtral, DeepSeek to Stable Diffusion — every model is pre-optimized with custom CUDA kernels for maximum throughput.
Popular Models                       Latency   Throughput
meta-llama/Llama-4-Maverick-400B     38ms      120 tok/s
meta-llama/Llama-4-Scout-109B        18ms      340 tok/s
deepseek-ai/DeepSeek-V3-685B         45ms      95 tok/s
mistralai/Mixtral-8x22B-v0.3         12ms      480 tok/s
Qwen/Qwen3-72B-Instruct              15ms      410 tok/s
google/gemma-3-27b-it                9ms       620 tok/s
stabilityai/stable-diffusion-4       1.2s      —
black-forest/FLUX.2-pro              0.8s      —

Showing 8 of 214 models →
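Because the API is OpenAI-compatible, the catalog can presumably also be browsed programmatically. The sketch below assumes a standard /v1/models listing endpoint exposed through the SDK; the method name and response shape are assumptions.

# Assumes an OpenAI-style /v1/models listing endpoint on the SDK.
from infergrove import InferGrove

client = InferGrove(api_key="ig-...")

models = client.models.list()
for m in models.data:
    if "Llama" in m.id:   # filter the catalog client-side
        print(m.id)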
Process millions of requests asynchronously with our batch API. Perfect for data labeling, content generation, embeddings at scale, and offline evaluation. Get 50% cost savings compared to real-time inference.
# Submit a batch job for async processing
batch = client.batches.create(
    model="meta-llama/Llama-4-Scout-109B",
    input_file="file-batch-10M.jsonl",
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={
        "project": "content-classification",
        "priority": "high"
    }
)

# Check status
status = client.batches.retrieve(batch.id)
print(f"Progress: {status.completed}/{status.total}")
# Progress: 7,234,891/10,000,000
# ETA: 2h 14m remaining
# Cost: $142.67 (50% savings applied)
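The input_file above is referenced by ID. A rough sketch of preparing and uploading that file, assuming the OpenAI-style JSONL request format and a files.create upload endpoint; both are assumptions based on the API's stated compatibility.

# Sketch of building and uploading the JSONL input file.
# Assumes the OpenAI-style batch request format and a files.create endpoint.
import json
from infergrove import InferGrove

client = InferGrove(api_key="ig-...")

documents = ["first document text", "second document text"]  # placeholder data
with open("requests.jsonl", "w") as f:
    for i, doc in enumerate(documents):
        request = {
            "custom_id": f"req-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "meta-llama/Llama-4-Scout-109B",
                "messages": [{"role": "user", "content": f"Classify: {doc}"}],
            },
        }
        f.write(json.dumps(request) + "\n")

# Upload, then pass the returned ID as input_file in batches.create
uploaded = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
print(uploaded.id)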
Create sophisticated AI agents that can reason, plan, and execute multi-step tasks. Our agent framework provides tool calling, memory management, and orchestration primitives — all running on our optimized inference stack.
from infergrove.agents import Agent, Tool

# Define an autonomous research agent
agent = Agent(
    model="meta-llama/Llama-4-Maverick-400B",
    name="Research Assistant",
    instructions="""You are a research agent. Analyze papers,
    summarize findings, and provide citations.""",
    tools=[
        Tool.web_search(),
        Tool.code_interpreter(),
        Tool.file_reader(),
        Tool.vector_store("papers-db"),
    ],
    memory=True,
    max_steps=20
)

result = agent.run(
    "Find recent papers on KV-cache optimization for long-context LLMs"
)
# Agent: Searching... Found 12 papers
# Agent: Analyzing methodologies...
# Agent: Summary ready with citations
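The example above uses built-in tools. A hypothetical sketch of wrapping your own function as a tool; the Tool.from_function constructor is assumed for illustration and is not a documented API.

# Hypothetical sketch: Tool.from_function is an assumed constructor for illustration.
from infergrove.agents import Agent, Tool

def arxiv_link(arxiv_id: str) -> str:
    """Return a link for an arXiv ID (stub used purely for illustration)."""
    return f"https://arxiv.org/abs/{arxiv_id}"

agent = Agent(
    model="meta-llama/Llama-4-Scout-109B",
    name="Citation Helper",
    instructions="Answer questions and include arXiv links when relevant.",
    tools=[Tool.from_function(arxiv_link)],
    max_steps=5
)

result = agent.run("Link the FlashAttention paper")
print(result)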
Get started in minutes, not weeks. Our platform handles all the complexity of GPU orchestration, model optimization, and scaling.
Sign up for free and get your API key in seconds. No credit card required. $25 in free credits to start.
Browse 200+ optimized models or bring your own. Every model is pre-optimized with custom CUDA kernels for maximum throughput.
Make your first API call. Our OpenAI-compatible API means you can migrate existing code in minutes. Scale to millions of requests.
import { InferGrove } from '@infergrove/sdk';

const client = new InferGrove({ apiKey: 'ig-...' });

const response = await client.chat.completions.create({
  model: 'meta-llama/Llama-4-Scout-109B',
  messages: [{ role: 'user', content: 'Hello, world!' }]
});

console.log(response.choices[0].message.content);
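Because the API follows OpenAI's format, existing code can often be migrated by swapping credentials and base URL. A Python sketch using the official openai client; the InferGrove base URL shown is an assumed value, not confirmed documentation.

# Uses the official openai Python client; the base URL is an assumed value.
from openai import OpenAI

client = OpenAI(
    api_key="ig-...",                          # your InferGrove key
    base_url="https://api.infergrove.ai/v1",   # assumed InferGrove endpoint
)

response = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-109B",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(response.choices[0].message.content)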
From chatbots to code generation, content creation to data analysis — InferGrove powers it all.
Build conversational AI with streaming responses and function calling. Sub-50ms latency for real-time interactions.
Power IDE extensions, code review tools, and automated refactoring with specialized code models.
Generate marketing copy, blog posts, product descriptions, and creative content at scale.
Build retrieval-augmented generation systems with our embedding models and structured output.
Classify, tag, and annotate millions of data points with batch processing at 50% lower cost.
Create product images, marketing assets, and creative visuals with Stable Diffusion and FLUX models.
Translate content across 100+ languages with multilingual models. Preserve tone and context.
Extract structured data from documents, emails, and web pages with guaranteed JSON output.
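For the structured-extraction use case above, a sketch assuming an OpenAI-style response_format parameter carrying a JSON schema; the exact parameter shape is an assumption, not confirmed documentation.

# Assumes an OpenAI-style response_format with a JSON schema.
from infergrove import InferGrove

client = InferGrove(api_key="ig-...")

response = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-109B",
    messages=[{
        "role": "user",
        "content": "Extract the invoice number and total from: Invoice #4821, total due $1,240.00"
    }],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "invoice",
            "schema": {
                "type": "object",
                "properties": {
                    "invoice_number": {"type": "string"},
                    "total": {"type": "number"}
                },
                "required": ["invoice_number", "total"]
            }
        }
    }
)
print(response.choices[0].message.content)  # e.g. {"invoice_number": "4821", "total": 1240.0}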
Comprehensive tooling that makes building with AI a joy, from SDKs and CLI tools to an interactive playground.
Full-featured Python SDK with async support, streaming, type hints, and automatic retries (async usage sketched below the tooling list). pip install infergrove.
First-class TypeScript support with full type safety, streaming helpers, and Edge Runtime compatibility.
High-performance Rust client for latency-critical applications. Zero-copy deserialization and async/await.
Manage models, deployments, and fine-tuning jobs from the command line. Scriptable and CI/CD friendly.
Interactive web playground to test models, compare outputs, and experiment with parameters before writing code.
Real-time monitoring, usage analytics, cost tracking, and team management in a beautiful web interface.
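The Python SDK advertises async support; a minimal async streaming sketch, assuming an AsyncInferGrove client class that mirrors the synchronous one. The class name is an assumption, not a documented identifier.

# AsyncInferGrove is an assumed class name based on the SDK's advertised async support.
import asyncio
from infergrove import AsyncInferGrove

async def main():
    client = AsyncInferGrove(api_key="ig-...")
    stream = await client.chat.completions.create(
        model="meta-llama/Llama-4-Scout-109B",
        messages=[{"role": "user", "content": "Write a haiku about GPUs"}],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")

asyncio.run(main())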
Independent benchmarks show InferGrove delivers 2-5x lower latency than major providers on equivalent models.
Lower is better. Measured at P50 with 1000-token prompts.
Higher is better. Measured with 512-token generation.
See why thousands of companies choose InferGrove for their AI infrastructure.
Our research team publishes cutting-edge work on model optimization, serving systems, and AI efficiency.
A novel approach to KV-cache management that enables 1M+ token contexts with minimal quality degradation on consumer hardware.
Our production system for speculative decoding that dynamically selects draft models based on prompt characteristics, achieving 2.8x speedup.
A calibration-free quantization method that achieves FP16-equivalent quality at INT4 precision across 50+ model architectures.
InferGrove integrates seamlessly with the most popular AI frameworks and development tools.
Native provider
Full integration
Streaming support
Agent framework
Model hub
Observability
Experiment tracking
Infrastructure as code
Pay only for what you use. No hidden fees, no minimum commitments. Start free and scale to millions of requests.
For experimentation
For production workloads
For large-scale deployments
Join 50,000+ developers using InferGrove to power their AI applications. Start free, scale infinitely.
No credit card required · $25 free credits · OpenAI-compatible API
🔒 SOC 2 Type II Certified
🏥 HIPAA Compliant
🇪🇺 GDPR Compliant
🛡️ Zero Data Retention