When using Flux<ChatClientResponse>, the full context is returned with every streamed response, which produces a large data volume. This could be optimized by including the context only in the final response, i.e. when chatResponse.result.metadata.finishReason == "stop".
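A minimal sketch of that gating idea, using simplified stand-in types rather than the actual Spring AI classes:

```java
import java.util.Map;

// Sketch only: gate the (potentially large) context so it is attached to the
// final streamed chunk instead of every chunk. The finishReason string and the
// context map are simplified stand-ins for the real Spring AI types.
public class ContextGatingSketch {

    // True when the model reports normal completion of the stream.
    static boolean isFinalChunk(String finishReason) {
        return "stop".equalsIgnoreCase(finishReason);
    }

    // Full context on the final chunk, an empty map on intermediate ones.
    static Map<String, Object> contextFor(String finishReason,
                                          Map<String, Object> fullContext) {
        return isFinalChunk(finishReason) ? fullContext : Map.of();
    }

    public static void main(String[] args) {
        Map<String, Object> ctx = Map.of("conversationId", "abc");
        System.out.println(contextFor(null, ctx));   // intermediate chunk: {}
        System.out.println(contextFor("stop", ctx)); // final chunk: full context
    }
}
```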


Comment From: 192902649

public Flux<ChatClientResponse> chatV3(@PathVariable("chatId") String chatId,
        @RequestParam("message") String message) {
    // Stream the response and expose the full ChatClientResponse, including context.
    return ollamaChatClient.prompt()
            .user(message)
            .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, chatId))
            .stream()
            .chatClientResponse();
}

Comment From: 192902649


Comment From: vinupreethi

@ilayaperumalg I would like to work on this.

Comment From: ilayaperumalg

@vinupreethi Sure, please go ahead. Thank you for your interest!

Comment From: vinupreethi

@192902649 One doubt: do I need to modify it like this?


.map(chatResponse -> {
    boolean isStop = "stop".equals(
            chatResponse.getResult().getMetadata().getFinishReason());
    return ChatClientResponse.builder()
            .chatResponse(chatResponse)
            .context(isStop ? Map.copyOf(chatClientRequest.context()) : Map.of())
            .build();
})
.publishOn(Schedulers.boundedElastic());

Can you help me with how to test the change?

Comment From: vinupreethi

https://github.com/spring-projects/spring-ai/pull/3861/files @ilayaperumalg @192902649 Let me know in case of any issues. I was trying to mock the test cases and check.



Comment From: 192902649

@vinupreethi
Since different AI model providers (e.g., OpenAI, Ollama, ZhiPuAi, HuggingFace, DeepSeek) have varying implementations and definitions of the finishReason field, the context property should be moved from ChatClientResponse to ChatResponse. In addition, a new method, boolean isStop(ChatResponse chatResponse), should be added to the ChatModel interface so that each model provider can implement its own logic for determining whether a response indicates a stop condition.


default boolean isStop(ChatResponse chatResponse) {
    // Null-safe: String.equals already handles a null finishReason.
    return chatResponse.getResult() != null
            && chatResponse.getResult().getMetadata() != null
            && "stop".equals(chatResponse.getResult().getMetadata().getFinishReason());
}


public Flux<ChatResponse> stream(ChatClientRequest chatClientRequest) {
    // Before moving any further, build the final request Prompt,
    // merging runtime and default options.
    Prompt prompt = chatClientRequest.prompt();
    Prompt requestPrompt = buildRequestPrompt(prompt);

    Flux<ChatResponse> responseFlux = this.internalStream(requestPrompt, null);

    // Attach the request context only to the terminal response chunk.
    return responseFlux.map(chatResponse -> isStop(chatResponse)
            ? chatResponse.setContext(chatClientRequest.context())
            : chatResponse);
}


@Override
public boolean isStop(ChatResponse chatResponse) {
    // Example override for a provider that reports a different finish reason.
    return chatResponse.getResult() != null
            && chatResponse.getResult().getMetadata() != null
            && "stop12".equals(chatResponse.getResult().getMetadata().getFinishReason());
}
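Put together as a self-contained sketch (the types below are simplified stand-ins for the real Spring AI interfaces, not the actual API):

```java
import java.util.Map;

// Self-contained sketch of the proposal: a default isStop on the model
// abstraction plus a provider-specific override. All types are simplified
// stand-ins for the real Spring AI interfaces.
public class IsStopSketch {

    // Stand-in for ChatResponse: just the finish reason and a context map.
    record Response(String finishReason, Map<String, Object> context) {}

    interface Model {
        // Default: most providers signal normal completion with "stop".
        default boolean isStop(Response response) {
            return response != null && "stop".equalsIgnoreCase(response.finishReason());
        }
    }

    // Hypothetical provider whose API uses a different terminal finish reason.
    static class OtherProviderModel implements Model {
        @Override
        public boolean isStop(Response response) {
            return response != null && "done".equalsIgnoreCase(response.finishReason());
        }
    }

    public static void main(String[] args) {
        Model defaultModel = new Model() {};
        Model other = new OtherProviderModel();
        System.out.println(defaultModel.isStop(new Response("stop", Map.of()))); // true
        System.out.println(other.isStop(new Response("stop", Map.of())));        // false
        System.out.println(other.isStop(new Response("done", Map.of())));        // true
    }
}
```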

Comment From: vinupreethi

@192902649 Hey, thanks! I will check the code. Can you help with how to test this once the changes are done? I'm a newbie, so how can we test it? I haven't run the application; I'm just running it via test cases.

Comment From: Abhishek-Pal-04

@vinupreethi You may want to look into the contribution doc available here: https://github.com/spring-projects/spring-ai/blob/main/CONTRIBUTING.adoc