AI glossary
Streaming
Returning the model's response token by token as it is generated, instead of waiting for the full reply. Critical for chat UX: the interface feels responsive because the first words appear almost immediately, even when the full response takes seconds to complete.
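The idea can be sketched in a few lines of Python. This is an illustration only, not any particular provider's API: `stream_tokens` is a hypothetical stand-in for a model, and it splits on whitespace rather than using a real tokenizer. The consumer prints each chunk as soon as it arrives instead of waiting for the whole reply.

```python
import re
import time
from typing import Iterable, Iterator


def stream_tokens(reply: str, delay_s: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a model: yield the reply one chunk at a time.

    Chunks here are whitespace-delimited words (an assumption for illustration);
    real models emit subword tokens.
    """
    for chunk in re.findall(r"\S+\s*", reply):
        time.sleep(delay_s)  # simulate per-token generation latency
        yield chunk


def render_streaming(chunks: Iterable[str]) -> str:
    """Display each chunk as it arrives and return the assembled reply."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # flush so the user sees it immediately
        parts.append(chunk)
    print()
    return "".join(parts)


if __name__ == "__main__":
    # The first word appears after one delay, not after the whole reply is ready.
    render_streaming(stream_tokens("Streaming feels fast to the reader", delay_s=0.2))
```

Without streaming, the caller would see nothing until every chunk had been generated; with it, the time to the first visible word is a single chunk's latency.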