Streaming¶
Understand current text‑only streaming and the preview for sequence‑of‑objects (~4 minutes). Requires Python 3.10+ and pip install alloy-ai
.
Current (stable)¶
- Text‑only streaming.
- Commands that use tools or non‑string outputs do not stream via Alloy.
APIs
from alloy import ask, command
@command
def brainstorm(topic: str) -> str:
return f"Write a short riff about: {topic}"
for chunk in ask.stream("Write a one-liner about cats"):
print(chunk, end="")
for chunk in brainstorm.stream("Alloy examples"):
print(chunk, end="")
Provider guidance: consult the Providers matrix for which models support text streaming.
Preview (vNext)¶
Sequence streaming of typed objects
Design principle: no token/delta event model. When a command returns a sequence (e.g., list[T]
), Alloy yields whole, validated objects of type T
as soon as they are complete.
from dataclasses import dataclass
from alloy import command
@dataclass
class Person:
name: str
email: str
@command(output=list[Person], tools=[...])
def find_people(query: str) -> list[Person]: ...
for person in find_people.stream("kaggle meetup in LA"):
print(person)
# Async variant
async for person in find_people.astream("kaggle meetup in LA"):
print(person)
Semantics
- Supported outputs: Sequence[T]
where T
is a dataclass
or TypedDict
(strict mode applies).
- Yield contract: each item is a fully formed, schema‑validated T
. No partial/delta objects.
- Ordering: preserves the model’s natural order.
- Tools: tool calls run inside the loop as usual; streaming pauses/resumes as needed.
- Errors: by default, validation errors raise and stop iteration.
Provider mapping: OpenAI/Anthropic/Gemini via streamed JSON array assembly; Ollama may fall back to non‑streaming for structured outputs.
Status: behind feature flag (ALLOY_EXPERIMENTAL_STREAMING=1
). See Providers for support.