Blog

Product updates, engineering deep dives, and community highlights from the InferGrove team.

March 15, 2026

Introducing H100 support: 3x faster LLM inference

We've added NVIDIA H100 GPUs to our fleet, delivering up to 3x faster inference for large language models.

March 8, 2026

How we reduced cold start times by 80%

A deep dive into the new model caching architecture that cuts time-to-first-prediction by up to 80%.

February 28, 2026

Fine-tuning SDXL: a practical guide

Learn how to create custom image generation models trained on your own data in under 30 minutes.

February 20, 2026

Community spotlight: 10 creative projects built on InferGrove

Showcasing innovative applications from our community of developers and creators.

February 12, 2026

Streaming predictions with webhooks and SSE

How to build real-time AI features using our streaming prediction API.

February 5, 2026

Scaling to 1 million predictions per day

Architecture patterns and best practices for high-throughput AI applications.