🐛 Bug description
When using streaming mode (`Flux<ChatResponse>`), the `toolCalls` property of the `AssistantMessage` built from the model response is not populated, so the message saved to chat memory loses the tool call information even though the model actually requested tool calls.

This behavior becomes particularly problematic when `internalToolExecutionEnabled=false`, where tool execution is intended to be controlled manually by the user. In such workflows, it is necessary to retrieve the tool call information from the last assistant message in memory, but that data is missing because of the issue above.

Note: this issue is not caused by setting `internalToolExecutionEnabled=false`. It is only exacerbated by it, since downstream components rely on consistent `toolCalls` data across both streaming and non-streaming modes.
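For concreteness, here is a minimal sketch of the manual-execution step that this breaks, assuming the `chatMemory` bean from the MCRE below; the method name `pendingToolCalls` is illustrative, not framework API:

```java
import java.util.List;

import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.Message;

// Read the tool calls the model requested from the last assistant message
// that MessageChatMemoryAdvisor persisted for this conversation.
List<AssistantMessage.ToolCall> pendingToolCalls(ChatMemory chatMemory, String conversationId) {
    List<Message> history = chatMemory.get(conversationId);
    Message last = history.get(history.size() - 1);
    if (last instanceof AssistantMessage assistant) {
        // Expected: the tool calls from the model response.
        // Actual in streaming mode: an empty list, so manual execution never starts.
        return assistant.getToolCalls();
    }
    return List.of();
}
```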
💻 Environment

- Spring AI Version: 1.0.0
- Java Version: 17
- Model: Qwen2.5-72B-Instruct
- Usage Mode: Streaming (`Flux<ChatResponse>`)
- Tool Execution Mode: `internalToolExecutionEnabled=false`
- Vector store: Not involved
🪜 Steps to reproduce

1. Configure a Spring AI chat client with streaming mode enabled.
2. Ensure that the response from the model includes tool calls.
3. Use `org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor`.
4. Inspect the `AssistantMessage` → `toolCalls` is missing.
5. Attempt to retrieve the tool calls from memory (e.g., `ChatMemory#get`) → it fails; see the sketch after this list.
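A sketch of steps 4 and 5, assuming the `chatMemory` bean and the `callWithMemory` method from the MCRE below; the user text is arbitrary:

```java
// Drain the stream so that MessageChatMemoryAdvisor persists the assistant reply.
callWithMemory(conversationId, "What's the weather in Berlin?").blockLast();

// Locate the most recent AssistantMessage the advisor stored.
AssistantMessage stored = chatMemory.get(conversationId).stream()
        .filter(AssistantMessage.class::isInstance)
        .map(AssistantMessage.class::cast)
        .reduce((first, second) -> second)
        .orElseThrow();

// Streaming mode prints [] even though the raw model response contained tool
// calls; the same exchange via .call() prints the expected ToolCall entries.
System.out.println(stored.getToolCalls());
```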
✅ Expected behavior

The `toolCalls` property from the `GenerationMetadata` should be correctly propagated to the resulting `AssistantMessage`, regardless of whether streaming or non-streaming mode is used. This ensures consistent memory behavior and supports downstream workflows such as manual tool execution confirmation.
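Stated as a test-style sketch (`assertEquals` is JUnit; `lastAssistantMessage` is a hypothetical helper that returns the newest `AssistantMessage` in memory, along the lines of the `pendingToolCalls` sketch above):

```java
// The same user turn, sent once via .call() and once via .stream(), should
// leave chat memory equivalent with respect to tool calls.
List<AssistantMessage.ToolCall> blockingCalls =
        lastAssistantMessage(chatMemory, blockingConversationId).getToolCalls();
List<AssistantMessage.ToolCall> streamingCalls =
        lastAssistantMessage(chatMemory, streamingConversationId).getToolCalls();

// Today this fails: streamingCalls is empty while blockingCalls is populated.
assertEquals(blockingCalls, streamingCalls);
```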
🧪 Minimal Complete Reproducible example
My AI config:

```java
@Bean
public ChatMemoryRepository chatMemoryRepository() {
    return new InMemoryChatMemoryRepository();
}

@Bean
public ChatMemory chatMemory(ChatMemoryRepository chatMemoryRepository) {
    return MessageWindowChatMemory.builder()
        .maxMessages(10)
        .chatMemoryRepository(chatMemoryRepository)
        .build();
}

@Bean
public OpenAiChatModel chatModel(OpenAiApi openAiApi, ToolCallingManager toolCallingManager,
        List<AgentToolsProvider> agentToolsProviders) {
    AgentToolsProvider[] providers = agentToolsProviders.toArray(new AgentToolsProvider[0]);
    ToolCallback[] toolCallbacks = ToolCallbacks.from((Object[]) providers);
    OpenAiChatOptions chatOptions = OpenAiChatOptions.builder()
        .temperature(0.6)
        .model("qwen2.5-72b-instruct")
        // The application, not the framework, executes the tools.
        .internalToolExecutionEnabled(false)
        .toolCallbacks(toolCallbacks)
        .build();
    return OpenAiChatModel.builder()
        .defaultOptions(chatOptions)
        .toolCallingManager(toolCallingManager)
        .openAiApi(openAiApi)
        .build();
}

@Bean
ChatClient chatClient(OpenAiChatModel chatModel, ChatMemory chatMemory) {
    return ChatClient.builder(chatModel)
        .defaultAdvisors(
            new SimpleLoggerAdvisor(),
            MessageChatMemoryAdvisor.builder(chatMemory).build()
        )
        // systemResource is the system-prompt Resource injected elsewhere in this config class.
        .defaultSystem(systemResource)
        .build();
}
```
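For reference, the manual execution step this configuration is built for could look roughly like this (a sketch: `chatResponse` stands for the final aggregated response of the exchange, and `ToolExecutionResult` / `executeToolCalls` are the Spring AI tool-calling APIs):

```java
// With internalToolExecutionEnabled(false), the framework hands the tool
// calls back to the application; we execute them and send the results back.
if (chatResponse.hasToolCalls()) {
    ToolExecutionResult toolExecutionResult = toolCallingManager.executeToolCalls(prompt, chatResponse);
    Prompt followUp = new Prompt(toolExecutionResult.conversationHistory(), chatOptions);
    // ... call the model again with followUp
}
```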
Just chat with the AI:

```java
private Flux<ChatResponse> callWithMemory(String conversationId, String userText) {
    // Seed the prompt with the conversation history, then stream the reply.
    Prompt promptWithMemory = new Prompt(chatMemory.get(conversationId), chatModel.getDefaultOptions());
    return client.prompt(promptWithMemory)
        .user(userText)
        .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, conversationId))
        .stream()
        .chatResponse();
}
```
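Draining this Flux is what triggers MessageChatMemoryAdvisor to persist the aggregated AssistantMessage; in streaming mode, that persisted message is the one missing its toolCalls.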
Comment From: WOONBE
I'd like to contribute to this issue!
Comment From: ilayaperumalg
@EkkoWH Thanks for the detailed report. @WOONBE Thanks for the PR fixing it.
Comment From: EkkoWH

> I'd like to contribute to this issue!

@WOONBE Thank you very much for your PR!
Comment From: EkkoWH

> @EkkoWH Thanks for the detailed report. @WOONBE Thanks for the PR fixing it.

@ilayaperumalg Thanks for your reply!