Llm Async — Async multi-provider LLM client for Python

High-performance, async-first LLM client for OpenAI, Claude, Google Gemini, and OpenRouter. Built on top of aiosonic for fast, low-latency HTTP and true asyncio streaming across providers.

The project is hosted on GitHub: https://github.com/sonic182/llm-async

Features

Feature

OpenAI

Claude

Google Gemini

OpenRouter

Chat Completions

Tool Calling

Streaming

Structured Outputs

  • Async-first: Built with asyncio for high-performance, non-blocking operations.

  • Unified interface: Same message/tool/streaming patterns across all providers.

  • Tool Calling: Unified tool definitions with execution helpers.

  • Structured Outputs: JSON schema validation on responses (OpenAI, Google, OpenRouter).

  • Extensible: Add new providers by inheriting from BaseProvider.

  • Tested: Comprehensive test suite with high coverage.

Performance

  • Built on aiosonic for fast, low-overhead async HTTP.

  • True asyncio end-to-end: concurrent requests across providers with minimal overhead.

  • Designed for fast tool-call round-trips and low-latency streaming.

Why llm-async?

  • Async-first performance (aiosonic-based) vs. sync or heavier HTTP stacks.

  • Unified provider interface: same message/tool/streaming patterns across OpenAI, Claude, Gemini, OpenRouter.

  • Structured outputs (OpenAI, Google, OpenRouter) with JSON schema validation.

  • Tool-call round-trip helpers for consistent multi-turn execution.

  • Minimal surface area: easy to extend with new providers via BaseProvider.

Getting Started

Indices and tables