Bug description
When using the Spring AI chatClient.prompt().stream() API with OpenAI (tested on version 1.1.0-M3), the reported token usage is always zero.
Even with the streamUsage(true) option enabled in OpenAiChatOptions and the final chunk collected via .stream().chatResponse(), the Usage object remains zero. This makes it impossible to track token consumption either in real time or after the stream completes.
This seems related to GitHub issue #814, but the problem persists even after following the approaches suggested there.
Environment
Spring AI version: 1.1.0-M3
Java version: 17
Model: OpenAI GPT-3.5 / GPT-4
OS: Linux / macOS
Vector store: N/A
Steps to reproduce
Configure a Spring AI OpenAI chat client.
Use the streaming API with chatClient.prompt().stream() or .stream().chatResponse().
Enable token usage streaming:
```java
OpenAiChatOptions.builder().streamUsage(true).build()
```
Send a prompt that generates multiple chunks.
Observe the Usage object in ChatResponse.metadata.usage() or via SimpleLoggerAdvisor (see the sketch after the example code).
Example code:
```java
Flux<ChatResponse> chatResponseFlux = chatClient.prompt()
        .options(OpenAiChatOptions.builder().streamUsage(true).build())
        .user(prompt) // a prompt long enough to produce multiple chunks
        .stream()
        .chatResponse();

chatResponseFlux.collectList().block().forEach(response -> {
    Usage usage = response.getMetadata().getUsage();
    // Always zero
    System.out.println("Total tokens used: " + usage.getTotalTokens());
});
```
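Step 6 mentions SimpleLoggerAdvisor; for reference, a minimal sketch of attaching it to the same call (the chatClient and prompt variables are assumed from the example above; the advisor logs at DEBUG level, so that level must be enabled for it to print anything):

```java
// Attach SimpleLoggerAdvisor so request/response details (including usage
// metadata) are logged; requires DEBUG logging for the advisor's logger.
Flux<ChatResponse> logged = chatClient.prompt()
        .advisors(new SimpleLoggerAdvisor())
        .options(OpenAiChatOptions.builder().streamUsage(true).build())
        .user(prompt)
        .stream()
        .chatResponse();
```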
Expected behavior
Each chunk in the streaming response should include incremental token usage.
At minimum, the final chunk should reliably provide the total usage (see the sketch after this list).
Usage data should be correctly reflected in SimpleLoggerAdvisor and ChatResponse.metadata.usage().
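On the final-chunk expectation: with OpenAI's stream_options.include_usage, usage is populated only on the last chunk of the stream, so a minimal post-stream accounting sketch (reusing chatResponseFlux from the example above) looks like this:

```java
// Read usage from the final streamed chunk only; assumes usage is populated
// solely on the last ChatResponse, per OpenAI's stream_options.include_usage
// semantics.
Usage finalUsage = chatResponseFlux
        .last()
        .map(response -> response.getMetadata().getUsage())
        .block();

System.out.println("Total tokens used: " + finalUsage.getTotalTokens());
```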
Actual behavior
All intermediate chunks report zero usage.
Even the final chunk after .collectList().block() reports zero.
Tested on Spring AI 1.1.0-M3 — issue persists.
Additional notes / attempts
Using .stream().chatResponse() and collecting the last chunk does not solve the problem.
Usage is zero in all tests, making real-time or post-stream token accounting impossible.
This appears to be a limitation in Spring AI’s deserialization/handling of OpenAiUsage in streaming scenarios; one way to test that is to inspect the raw SSE stream directly (see the sketch below).
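To separate a Spring AI deserialization bug from an upstream response that simply lacks the field, here is a diagnostic sketch (not from the original report) that calls the chat-completions endpoint directly with plain Java 17 and prints any SSE line containing "usage"; the endpoint URL can be swapped for an intermediary provider's base URL to compare:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class UsageProbe {

    public static void main(String[] args) throws Exception {
        // stream_options.include_usage asks OpenAI to append a final chunk
        // carrying the token usage for the whole stream.
        String body = """
                {
                  "model": "gpt-3.5-turbo",
                  "stream": true,
                  "stream_options": {"include_usage": true},
                  "messages": [{"role": "user", "content": "Say hello"}]
                }
                """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Authorization", "Bearer " + System.getenv("OPENAI_API_KEY"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // Print only the SSE lines that mention usage. If a populated usage
        // object appears here but Spring AI reports zero, the problem is in
        // deserialization; if it is missing here too, the upstream (or an
        // intermediary) is dropping it.
        HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofLines())
                .body()
                .filter(line -> line.contains("usage"))
                .forEach(System.out::println);
    }
}
```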
Comment From: mrfnir
Has anyone managed to find a way to solve this issue?
Comment From: mrfnir
To anyone facing the same problem as me: fortunately, just when I was close to despair, I realized an important point. If you use an intermediate service provider to access the OpenAI services, some of them disrupt the responses. By changing providers, the problem was fortunately solved, and I now receive the token consumption details. I hope this helps you :)
Comment From: ilayaperumalg
@mrfnir Thank you for updating your feedback and helping the community!