Bug description
When using the Anthropic Chat Model in #stream()
mode along with tools, it outputs that it "needs to use XYZ tool" before actually calling the tool and giving the final result. The streamed result therefore contains the responses to two prompts: the first one before the tool results are passed, and the second one with the tool results.
Environment
- Spring AI 1.1.0-SNAPSHOT
- JDK 17
- Anthropic model; only the API key is passed as a property
Steps to reproduce
- Set up a project with Spring AI and Anthropic model
- Add a simple tool
- Call the model in streaming mode with your tool:
ChatClient.create(chatModel).prompt(question)
        .tools(new WeatherTool())
        .stream()
        .content()                    // Flux<String> of streamed text
        .doOnNext(System.out::print)  // ... consume response
        .blockLast();
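For context, here is a minimal sketch of what such a simple tool could look like, using Spring AI's @Tool annotation. The class and method names are illustrative; the actual tool in the repro project may differ:

```java
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.ai.tool.annotation.ToolParam;

// Hypothetical WeatherTool matching the snippet above; the repro project's tool may differ.
public class WeatherTool {

    @Tool(description = "Get the current temperature at the given coordinates")
    public String getTemperature(@ToolParam(description = "latitude") double latitude,
                                 @ToolParam(description = "longitude") double longitude) {
        // Fixed value for demonstration; the repro project returns a random temperature.
        return "14.17°C";
    }
}
```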
The output contains too much info, for example:
I'd be happy to check the temperature in Paris for you. To do this, I need the specific latitude and longitude coordinates for Paris, France.The current temperature in Paris is 14.17°C (approximately 57.5°F). This is a mild temperature, typical of a cool evening in Paris. The weather is neither too hot nor too cold, making it comfortable for outdoor activities with a light jacket.
Expected behavior
The final output in the console should read something like:
The current temperature in Paris is 14.17°C (approximately 57.5°F). This is a mild temperature, typical of a cool evening in Paris. The weather is neither too hot nor too cold, making it comfortable for outdoor activities with a light jacket.
Minimal Complete Reproducible example
Here's a repro project: https://github.com/Kehrlann/spring-ai-streamingtools-repro
- Set your Anthropic key in application.properties or in the ANTHROPIC_API_KEY environment variable (see the property sketch after this list)
- Run the project. It asks "What's the weather like in Paris?" and uses a fake tool that returns a random temperature
- Observe the results in the console
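For example, a minimal application.properties sketch (assuming the standard Spring AI Anthropic starter):

```properties
# Minimal sketch: resolve the key from the ANTHROPIC_API_KEY environment variable
spring.ai.anthropic.api-key=${ANTHROPIC_API_KEY}
```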
Comment From: BrayanMnz
Hi @Kehrlann - Did you try disabling thinking mode?
Comment From: tzolov
@Kehrlann thanks for highlighting this behavior.
By default, when Spring AI's internal tool-calling execution is enabled, for the prompt "Get current weather in Amsterdam"
with a getCurrentDateTime()
tool, Anthropic streams a response like:
MessageStartEvent[type=MESSAGE_START, message=ChatCompletionResponse[id=msg_011jph3xTqsNSWCgMCQqGn2K, type=message, role=ASSISTANT, content=[], model=claude-3-7-sonnet-20250219, stopReason=null, stopSequence=null, usage=Usage[inputTokens=411, outputTokens=3]]]
ContentBlockStartEvent[type=CONTENT_BLOCK_START, index=0, contentBlock=ContentBlockText[type=text, text=]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text=I'll get]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text= the current weather forecast for]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text= Amsterdam for you.]]
ContentBlockStopEvent[type=CONTENT_BLOCK_STOP, index=0]
ContentBlockStartEvent[type=CONTENT_BLOCK_START, index=1, contentBlock=ContentBlockToolUse[type=tool_use, id=toolu_018CYrFUVXsNejgDbR9VNFE6, name=getCurrentDateTime, input={}]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=1, delta=ContentBlockDeltaJson[type=input_json_delta, partialJson=]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=1, delta=ContentBlockDeltaJson[type=input_json_delta, partialJson={"ci]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=1, delta=ContentBlockDeltaJson[type=input_json_delta, partialJson=tyName]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=1, delta=ContentBlockDeltaJson[type=input_json_delta, partialJson=": "Amsterd]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=1, delta=ContentBlockDeltaJson[type=input_json_delta, partialJson=am]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=1, delta=ContentBlockDeltaJson[type=input_json_delta, partialJson="}]]
ContentBlockStopEvent[type=CONTENT_BLOCK_STOP, index=1]
MessageDeltaEvent[type=MESSAGE_DELTA, delta=MessageDelta[stopReason=tool_use, stopSequence=null], usage=MessageDeltaUsage[outputTokens=66]]
MessageStopEvent[type=MESSAGE_STOP]
MessageStartEvent[type=MESSAGE_START, message=ChatCompletionResponse[id=msg_01KSYk6t1cGtnDkVUV2V6KeY, type=message, role=ASSISTANT, content=[], model=claude-3-7-sonnet-20250219, stopReason=null, stopSequence=null, usage=Usage[inputTokens=492, outputTokens=2]]]
ContentBlockStartEvent[type=CONTENT_BLOCK_START, index=0, contentBlock=ContentBlockText[type=text, text=]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text=Base]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text=d on the current weather in Amsterdam, it]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text='s hot and sunny with]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text= a temperature of 20 degrees Celsius.]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text= It's a nice day in]]
ContentBlockDeltaEvent[type=CONTENT_BLOCK_DELTA, index=0, delta=ContentBlockDeltaText[type=text_delta, text= Amsterdam!]]
ContentBlockStopEvent[type=CONTENT_BLOCK_STOP, index=0]
MessageDeltaEvent[type=MESSAGE_DELTA, delta=MessageDelta[stopReason=end_turn, stopSequence=null], usage=MessageDeltaUsage[outputTokens=36]]
MessageStopEvent[type=MESSAGE_STOP]
The first message's content block streams a "thinking" text like: I'll get -> the current weather forecast for -> Amsterdam for you.
Then the actual tool-call content block is streamed,
and finally the final message with the answer.
While we should filter out our tool-calling messages from the response (unless the internal tool-calling support is disabled), we cannot pre-filter the "thinking"
type of messages.
You can, though, ask Anthropic not to include "thinking" messages in the response, e.g.: Get current weather in Amsterdam. Please don't explain that you will call tools.
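For illustration, a sketch of that workaround at the ChatClient level (the system-prompt wording and the WeatherTool class are just placeholders):

```java
// Sketch: instruct the model not to narrate its tool use; exact wording is up to you.
ChatClient.create(chatModel)
        .prompt()
        .system("Answer directly; do not explain that you are going to call tools.")
        .user("Get current weather in Amsterdam")
        .tools(new WeatherTool())
        .stream()
        .content()
        .doOnNext(System.out::print)
        .blockLast();
```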
P.S. While exploring this, I stumbled on a bug where we didn't filter out the tool-calling response when not requested. #3679 fixes it.
Comment From: ilayaperumalg
Closing as resolved.