Expected Behavior

I would expect it to be easy to set reasoning_effort (or a similar option) on the models that support it.

Current Behavior

This is not covered in the documentation, at least for Claude Sonnet 4 through Bedrock. After a lot of searching (Anthropic, frustratingly, does not use the reasoning_effort nomenclature), I found a ThinkingConfig within https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-anthropic/src/main/java/org/springframework/ai/anthropic/AnthropicChatOptions.java
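For anyone landing here, configuring extended thinking through those options looks roughly like the sketch below. This is a best-effort reading of AnthropicChatOptions.java rather than documented behavior: the thinking(type, budgetTokens) builder overload and the ThinkingType enum are what appear in the source, so treat the exact signatures (and the model id, which differs on Bedrock) as assumptions.

```java
// Sketch only: based on reading AnthropicChatOptions.java, not on documented API.
// The thinking(type, budgetTokens) overload is an assumption from the source.
AnthropicChatOptions options = AnthropicChatOptions.builder()
        .model("claude-sonnet-4-20250514") // direct Anthropic API id; Bedrock uses its own ids
        .maxTokens(8192)
        // budget_tokens plays the role reasoning_effort plays elsewhere:
        // a larger thinking budget roughly corresponds to a higher effort
        .thinking(AnthropicApi.ThinkingType.ENABLED, 2048)
        .build();

ChatResponse response = chatModel.call(new Prompt("Why is the sky blue?", options));
```

If something like this works, it is exactly the kind of mapping ("Anthropic's thinking budget is the analogue of reasoning_effort") that the reference docs should spell out per provider.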

I would expect all of the different options to be well documented, so that I do not have to go into the code.

Context

We are considering and experimenting with Spring AI in our org for our agent efforts, but this requires good and up-to-date documentation.

Comment From: kaichen1007

If I understand correctly, you currently cannot get the LLM's reasoning process out of the framework. I ran into the same problem when using Qwen. Here is what I did:

1. Add the corresponding response field

public record ChatCompletionMessage(
        @JsonProperty("content") Object rawContent,
        @JsonProperty("role") AiCommonApi.ChatCompletionMessage.Role role,
        @JsonProperty("name") String name,
        @JsonProperty("tool_call_id") String toolCallId,
        @JsonProperty("tool_calls") @JsonFormat(with = JsonFormat.Feature.ACCEPT_SINGLE_VALUE_AS_ARRAY) List<AiCommonApi.ChatCompletionMessage.ToolCall> toolCalls,
        @JsonProperty("refusal") String refusal,
        @JsonProperty("audio") AiCommonApi.ChatCompletionMessage.AudioOutput audioOutput,
        @JsonProperty("annotations") List<AiCommonApi.ChatCompletionMessage.Annotation> annotations,
        // Qwen's reasoning_content field
        @JsonProperty("reasoning_content") String reasoningContent) {
}
2. Merge the fields when aggregating streamed chunks
private ChatCompletionMessage merge(ChatCompletionMessage previous, ChatCompletionMessage current) {
    String content = (current.content() != null ? current.content()
            : "" + ((previous.content() != null) ? previous.content() : ""));
    String reasoningContent = (current.reasoningContent() != null ? current.reasoningContent()
            : "" + ((previous.reasoningContent() != null) ? previous.reasoningContent() : ""));
    // ... (remaining fields merged the same way)
    return new ChatCompletionMessage(content, role, name, toolCallId, toolCalls, refusal, audioOutput, annotations, reasoningContent);
}
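The null-coalescing pattern in that merge can be illustrated with a self-contained sketch (a stand-in record, not the real Spring AI types): for each field, the current chunk's value wins, and a null falls back to the previous chunk, defaulting to the empty string.

```java
import java.util.Objects;

public class DeltaMerge {

    // Stand-in for the streamed message chunk; not the real Spring AI record.
    record Chunk(String content, String reasoningContent) { }

    // Same idea as merge(...) above: prefer the current chunk's value,
    // fall back to the previous chunk, default to the empty string.
    static Chunk merge(Chunk previous, Chunk current) {
        String content = current.content() != null ? current.content()
                : Objects.requireNonNullElse(previous.content(), "");
        String reasoning = current.reasoningContent() != null ? current.reasoningContent()
                : Objects.requireNonNullElse(previous.reasoningContent(), "");
        return new Chunk(content, reasoning);
    }

    public static void main(String[] args) {
        Chunk previous = new Chunk(null, "step 1: recall Rayleigh scattering");
        Chunk current = new Chunk("The sky is blue because ...", null);
        Chunk merged = merge(previous, current);
        System.out.println(merged.content());          // the current chunk's content wins
        System.out.println(merged.reasoningContent()); // falls back to the previous chunk
    }
}
```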
3. Add the field to the response metadata
protected Flux<ChatResponse> internalStream(Prompt prompt, ChatResponse previousChatResponse) {
    // ...
    Map<String, Object> metadata = Map.of(
            "id", id,
            "role", roleMap.getOrDefault(id, ""),
            "index", choice.index(),
            "finishReason", choice.finishReason() != null ? choice.finishReason().name() : "",
            "refusal", StringUtils.hasText(choice.message().refusal()) ? choice.message().refusal() : "",
            "reasoningContent", StringUtils.hasText(choice.message().reasoningContent()) ? choice.message().reasoningContent() : "",
            "annotations", choice.message().annotations() != null ? choice.message().annotations() : List.of());
    // ...
}
4. Read the metadata from the response
String reasoningContent = (String) chatResponse.getResult().getOutput()
        .getMetadata().getOrDefault("reasoningContent", "");
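Since that getOrDefault call needs a cast at every call site, a small null-safe helper keeps it tidy. This is plain Java with illustrative names; the only real constraint is that the "reasoningContent" key must match the one used when the metadata map was built.

```java
import java.util.Map;

public class ReasoningMetadata {

    // Illustrative helper: the key must match the one used when the
    // metadata map was populated in internalStream above.
    static String reasoningContent(Map<String, Object> outputMetadata) {
        Object value = outputMetadata.getOrDefault("reasoningContent", "");
        // Defensive: return "" rather than throwing a ClassCastException
        return value instanceof String s ? s : "";
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = Map.of(
                "finishReason", "STOP",
                "reasoningContent", "First consider Rayleigh scattering ...");
        System.out.println(reasoningContent(metadata));
    }
}
```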

Comment From: kim-borgen

@kaichen1007 I was referring to reasoning_effort, i.e. the option that must be enabled on major models to use reasoning, with an "effort" set to Low, Medium, or High. Not the actual reasoning content.

You most definitely can get the reasoning blocks / content from supported models and LLM inference providers. However, I would expect that if reasoning blocks are available on a certain model (e.g. Qwen or Sonnet), this would be baked into the framework and clearly documented, so you do not have to "hack" the framework as you did :p

Comment From: a1446234064

@kaichen1007 I was referring to reasoning_effort, i.e. the option that must be enabled on major models to use reasoning, with an "effort" set to Low, Medium, or High. Not the actual reasoning content.

You most definitely can get the reasoning blocks / content from supported models and LLM inference providers. However, I would expect that if reasoning blocks are available on a certain model (e.g. Qwen or Sonnet), this would be baked into the framework and clearly documented, so you do not have to "hack" the framework as you did :p

If my model is a reasoning model, how should I retrieve reasoningContent from the response returned by ChatClient, and how do I need to set it up?

Comment From: kaichen1007

@kim-borgen Yes, you're right.

Comment From: kaichen1007

@a1446234064 My current solution is to rewrite and abstract the framework's code.

(image attached)

This has two main purposes: (1) seamlessly introducing different LLMs, and (2) making it easy to create an Agent.

@Bean(name = "questionAgent")
public QuestionAgent questionAgent(AiCommonApi qwenApi) {
    return QuestionAgent.builder()
            .agentName("questionAgent")
            .agentIntroduction("I am a question setter with rich experience, proficient in the question-setting techniques of every subject")
            .chatModel(
                    QwenChatModel.builder()
                            .aiCommonApi(qwenApi)
                            .options(
                                    QwenChatOptions.builder()
                                            .model(ChatModelEnums.QWEN_3_30B)
                                            .temperature(0.5)
                                            .maxTokens(8000)
                                            .enableThinking(false)
                                            .build()
                            )
                            .build()
            )
            .prompt(AgentPromptConstants.QUESTION_AGENT_SYSTEM_TEMPLATE)
            .build();
}

The most important thing is that it carries forward the framework's modular design idea.