Streaming

All providers support streaming: pass stream=True to acomplete. The awaited call then returns an async iterator that yields StreamChunk objects as tokens arrive, and each chunk's delta attribute carries the newly received text.

import asyncio
from llm_async import OpenAIProvider
from llm_async.models.message import Message

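async def main():
    provider = OpenAIProvider(api_key="your-openai-api-key")
    # With stream=True, awaiting acomplete returns an async iterator of StreamChunk objects.
    async for chunk in await provider.acomplete(
        model="gpt-4o-mini",
        messages=[Message("user", "Give me a recipe for tortilla española.")],
        stream=True,
    ):
        # Print each piece of text as soon as it arrives, without adding a newline.
        print(chunk.delta, end="", flush=True)
    print()  # end the line once the stream is exhausted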

asyncio.run(main())
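
If the complete text is needed once streaming finishes, the deltas can be accumulated while they are printed. The following is a minimal sketch that does not call any provider: FakeChunk and fake_stream are hypothetical stand-ins for StreamChunk and for the iterator returned by acomplete, and the only thing carried over from the example above is the assumption that each chunk exposes a string delta attribute.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class FakeChunk:
    # Hypothetical stand-in for StreamChunk; only `delta` is modeled.
    delta: str


async def fake_stream() -> AsyncIterator[FakeChunk]:
    # Hypothetical stand-in for the iterator returned by acomplete(stream=True).
    for piece in ["Peel ", "and ", "slice ", "potatoes."]:
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield FakeChunk(piece)


async def collect(stream: AsyncIterator[FakeChunk]) -> str:
    # Print each delta as it arrives and keep a copy for later use.
    parts = []
    async for chunk in stream:
        print(chunk.delta, end="", flush=True)
        parts.append(chunk.delta)
    print()  # finish the line after the last chunk
    return "".join(parts)


full_text = asyncio.run(collect(fake_stream()))
```

With a real provider, the argument to collect would presumably be the result of await provider.acomplete(..., stream=True) rather than fake_stream().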

The same pattern works for all providers: ClaudeProvider, GoogleProvider, OpenRouterProvider, and OpenAIResponsesProvider.

Example streaming output from the OpenAI and Claude providers:

--- OpenAI streaming response ---
Peel and slice potatoes.
Par-cook potatoes briefly.
Whisk eggs with salt and pepper.
Sauté onions until translucent (optional).
Combine potatoes and eggs in a pan and cook until set.
Fold and serve.
--- Claude streaming response ---
Prepare potatoes by peeling and slicing.
Fry or boil until tender.
Beat eggs and season.
Mix potatoes with eggs and cook gently.
Serve warm.
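
Output in this shape can be produced by looping over providers and printing a header before each stream. The sketch below illustrates the loop only: StubProvider and StubChunk are hypothetical stand-ins that mimic the acomplete call shape shown earlier, the model name is a placeholder, and nothing here reflects how examples/stream_all_providers.py is actually written.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Dict, List, Tuple


@dataclass
class StubChunk:
    # Hypothetical stand-in for StreamChunk; only `delta` is modeled.
    delta: str


class StubProvider:
    # Hypothetical provider that replays canned lines instead of calling an API.
    def __init__(self, lines: List[str]) -> None:
        self._lines = lines

    async def _generate(self) -> AsyncIterator[StubChunk]:
        for line in self._lines:
            await asyncio.sleep(0)  # hand control back to the event loop
            yield StubChunk(line + "\n")

    async def acomplete(self, model: str, messages: list, stream: bool = False):
        # The stub ignores model and messages; awaiting it returns an async iterator.
        return self._generate()


async def stream_all(
    providers: Dict[str, Tuple[StubProvider, str]], prompt: str
) -> Dict[str, str]:
    # Stream from each provider in turn, printing a header and then the deltas.
    results = {}
    for name, (provider, model) in providers.items():
        print(f"--- {name} streaming response ---")
        parts = []
        async for chunk in await provider.acomplete(
            model=model, messages=[prompt], stream=True
        ):
            print(chunk.delta, end="", flush=True)
            parts.append(chunk.delta)
        results[name] = "".join(parts)
    return results


providers = {
    "OpenAI": (StubProvider(["Peel and slice potatoes.", "Fold and serve."]), "stub-model"),
    "Claude": (StubProvider(["Beat eggs and season.", "Serve warm."]), "stub-model"),
}
results = asyncio.run(stream_all(providers, "Give me a recipe for tortilla española."))
```

The providers are streamed one after another here so that their output does not interleave on the terminal; running them concurrently would require buffering each response before printing it.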

See examples/stream_all_providers.py for a runnable multi-provider streaming demo.