Ultra-fast LLM inference — Llama and Mixtral at 500+ tokens/second via an OpenAI-compatible API
Groq provides the fastest LLM inference available, running Llama 3, Mixtral, and Gemma models at 500+ tokens per second on custom LPU hardware. Drop-in OpenAI-compatible API with a free tier.