When streaming with Flux<ChatClientResponse>, the full context is attached to every emitted response, which makes the streamed payload much larger than it needs to be. This could be optimized: for example, include the context only in the response where chatResponse.result.metadata.finishReason == "stop".
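To make the proposal concrete, here is a minimal sketch of the per-chunk rule using only the JDK. The `Chunk` record is a hypothetical stand-in for a streamed `ChatClientResponse` (its field names are assumptions for illustration, not the Spring AI API), and a plain `List` stands in for the `Flux`. In a reactive pipeline, the same per-chunk function could presumably be applied with `Flux.map`.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ContextTrimSketch {

    // Hypothetical stand-in for one streamed ChatClientResponse chunk.
    // finishReason is null for intermediate chunks in this sketch.
    record Chunk(String text, String finishReason, Map<String, Object> context) {}

    // Keep the context only on the chunk whose finish reason is "stop";
    // every other chunk is copied with an empty context.
    static Chunk trim(Chunk chunk) {
        if ("stop".equals(chunk.finishReason())) {
            return chunk;
        }
        return new Chunk(chunk.text(), chunk.finishReason(), Map.of());
    }

    // Apply the rule to a whole (non-reactive) sequence of chunks.
    static List<Chunk> trimAll(List<Chunk> chunks) {
        return chunks.stream()
                .map(ContextTrimSketch::trim)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Object> context = Map.of("conversationId", "42");
        List<Chunk> stream = List.of(
                new Chunk("Hel", null, context),
                new Chunk("lo", null, context),
                new Chunk("", "stop", context));

        // Print how many context entries each chunk carries after trimming.
        for (Chunk c : trimAll(stream)) {
            System.out.println("'" + c.text() + "' context entries: " + c.context().size());
        }
    }
}
```

With this rule a client still receives the context exactly once, on the final chunk, while the intermediate chunks carry only the generated text.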
Comment From: 192902649
public Flux<ChatClientResponse> chatV3(@PathVariable("chatId") String chatId, @RequestParam("message") String message) {
    Flux<ChatClientResponse> flux = ollamaChatClient.prompt()
            .user(message)
            .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, chatId))
            .stream()
            .chatClientResponse();
    return flux;
}
Comment From: vinupreethi
@ilayaperumalg I would like to work on this.
Comment From: ilayaperumalg
@vinupreethi Sure, please. Thank you for your interest!