Llm Async — Async multi-provider LLM client for Python
High-performance, async-first LLM client for OpenAI, Claude, Google Gemini, and OpenRouter. Built on top of aiosonic for fast, low-latency HTTP and true asyncio streaming across providers.
The project is hosted on GitHub: https://github.com/sonic182/llm-async
Features
Feature |
OpenAI |
Claude |
Google Gemini |
OpenRouter |
|---|---|---|---|---|
Chat Completions |
✅ |
✅ |
✅ |
✅ |
Tool Calling |
✅ |
✅ |
✅ |
✅ |
Streaming |
✅ |
✅ |
✅ |
✅ |
Structured Outputs |
✅ |
❌ |
✅ |
✅ |
Async-first: Built with asyncio for high-performance, non-blocking operations.
Unified interface: Same message/tool/streaming patterns across all providers.
Tool Calling: Unified tool definitions with execution helpers.
Structured Outputs: JSON schema validation on responses (OpenAI, Google, OpenRouter).
Extensible: Add new providers by inheriting from
BaseProvider.Tested: Comprehensive test suite with high coverage.
Performance
Built on aiosonic for fast, low-overhead async HTTP.
True asyncio end-to-end: concurrent requests across providers with minimal overhead.
Designed for fast tool-call round-trips and low-latency streaming.
Why llm-async?
Async-first performance (aiosonic-based) vs. sync or heavier HTTP stacks.
Unified provider interface: same message/tool/streaming patterns across OpenAI, Claude, Gemini, OpenRouter.
Structured outputs (OpenAI, Google, OpenRouter) with JSON schema validation.
Tool-call round-trip helpers for consistent multi-turn execution.
Minimal surface area: easy to extend with new providers via
BaseProvider.
Getting Started
Usage
API Reference
Project