Bug description
I ran the LM Studio Local LLM API Server (OpenAI compatibility mode). The ChatClient.call() API works perfectly, but ChatClient.stream() does not work!
I enabled debug logging, added the SimpleLoggerAdvisor, and saw the following log output for the streaming request:
DEBUG 54629 --- [streaming] [nio-8080-exec-6] o.s.web.servlet.DispatcherServlet : POST "/stream", parameters={}
DEBUG 54629 --- [streaming] [nio-8080-exec-6] s.w.s.m.m.a.RequestMappingHandlerMapping : Mapped to dev.danvega.streaming.ChatController#chatWithStream(String)
DEBUG 54629 --- [streaming] [nio-8080-exec-6] m.m.a.RequestResponseBodyMethodProcessor : Read "application/json;charset=UTF-8" to ["{"message": "What is the largest country on Earth?"}"]
DEBUG 54629 --- [streaming] [nio-8080-exec-6] o.s.a.c.c.advisor.SimpleLoggerAdvisor : request: AdvisedRequest[chatModel=OpenAiChatModel [defaultOptions=OpenAiChatOptions: {"streamUsage":false,"model":"meta-llama-3.1-8b-instruct","temperature":0.7}], userText={"message": "What is the largest country on Earth?"}, systemText=, chatOptions=OpenAiChatOptions: {"streamUsage":false,"model":"meta-llama-3.1-8b-instruct","temperature":0.7}, media=[], functionNames=[], functionCallbacks=[], messages=[], userParams={}, systemParams={}, advisors=[org.springframework.ai.chat.client.advisor.observation.ObservableRequestResponseAdvisor@6e5f4d0], advisorParams={}]
DEBUG 54629 --- [streaming] [nio-8080-exec-6] o.s.w.r.f.client.ExchangeFunctions : [8220a7c] HTTP POST http://localhost:1234/v1/chat/completions
DEBUG 54629 --- [streaming] [nio-8080-exec-6] o.s.http.codec.json.Jackson2JsonEncoder : [8220a7c] Encoding [ChatCompletionRequest[messages=[ChatCompletionMessage[rawContent={"message": "What is the largest co (truncated)...]
DEBUG 54629 --- [streaming] [nio-8080-exec-6] o.s.w.c.request.async.WebAsyncManager : Started async request for "/stream"
DEBUG 54629 --- [streaming] [nio-8080-exec-6] o.s.web.servlet.DispatcherServlet : Exiting but response remains open for further handling
But I could not see anything in the LM Studio server log!
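For context, this is roughly how the SimpleLoggerAdvisor was registered (a minimal sketch; the ChatClient bean shown here is illustrative, and the builder itself comes from the Spring AI auto-configuration):

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class ChatClientConfig {

    // ChatClient.Builder is auto-configured by spring-ai-openai-spring-boot-starter
    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder
                // logs requests/responses when DEBUG is enabled for
                // org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
    }
}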
Environment
Spring AI version: 1.0.0-M6
Java version: 21
OS: macOS and Windows 11
Steps to reproduce
I used the spring-ai-openai-spring-boot-starter library with this configuration:
ai:
  openai:
    api-key: lm-studio
    base-url: http://localhost:1234
    chat:
      options:
        model: meta-llama-3.1-8b-instruct
    embedding:
      options:
        model: text-embedding-nomic-embed-text-v1.5-embedding
Expected behavior
ChatClient.stream() should stream the response, just as streaming works when I call the LM Studio Local LLM API Server (OpenAI compatibility mode) directly with curl (stream enabled), and just as the ChatClient.call() API already works with the above configuration.
Minimal Complete Reproducible example
You can run an LM Studio Local LLM API Server (OpenAI compatibility mode), clone this example from @danvega, and apply my configs. You will see that ChatClient.call() works, but stream() does not!
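For reference, the controller in that example looks roughly like this (a sketch, not the exact code from the repo; only the chatWithStream handler and the /stream path are taken from the logs above, the rest is illustrative):

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
class ChatController {

    private final ChatClient chatClient;

    ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // Blocking variant: works fine against LM Studio
    @PostMapping("/chat")
    String chat(@RequestBody String message) {
        return chatClient.prompt(message).call().content();
    }

    // Streaming variant: hangs, and nothing shows up in the LM Studio server log
    @PostMapping("/stream")
    Flux<String> chatWithStream(@RequestBody String message) {
        return chatClient.prompt(message).stream().content();
    }
}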
Comment From: dev-jonghoonpark
LM Studio Local LLM API Server (OpenAI compatibility mode)
The link seems to be invalid, returning a 404 error.
Comment From: aperepel
It is still an issue as of today's latest version, it seems. With the regular call version, everything works. The streaming version doesn't even trigger the model load in LM Studio.
See the code below to reproduce (replace with any small model).
OpenAiApi openAiApi = OpenAiApi.builder()
.baseUrl("http://localhost:1234")
.apiKey(new NoopApiKey())
.restClientBuilder(RestClient.builder()
// Force HTTP/1.1 for both streaming and non-streaming
.requestFactory(new JdkClientHttpRequestFactory(HttpClient.newBuilder()
.version(HttpClient.Version.HTTP_1_1)
.connectTimeout(Duration.ofSeconds(30))
.build())))
.build();
OpenAiChatModel chatModel = OpenAiChatModel.builder()
.openAiApi(openAiApi)
.defaultOptions(OpenAiChatOptions.builder()
.model("medgemma-27b-text-it@8bit").build())
.build();
// Non-streaming call works
// String response = ChatClient.builder(chatModel)
// .build()
// .prompt("Hello")
// .call()
// .content();
//
// System.out.println(response);
System.out.println("Starting streaming chat response...\n");
// streaming version hangs
ChatClient.builder(chatModel)
.build()
.prompt("Hello")
.stream()
.content()
.doOnNext(chunk -> {
System.out.print(chunk);
System.out.flush(); // Ensure immediate output
})
.doOnComplete(() -> System.out.println("\n\nStreaming completed."))
.doOnError(error -> System.err.println("Error during streaming: " + error.getMessage()))
.blockLast(); // Block to wait for the stream to complete
Comment From: jspw
I've successfully resolved this streaming issue with LM Studio after an incredible debugging journey that led to a surprising discovery.
The Initial Problem
After extensive testing, I initially thought the issue was about HTTP/1.1 vs HTTP/2, or that Spring AI's default HttpClient configuration doesn't work properly with LM Studio's streaming implementation.
My Investigation Journey
I went through multiple theories and tests:
- Without custom config: Streaming fails completely ❌
- With custom Reactor Netty HttpClient: Streaming works perfectly ✅
- With Apache HttpClient 5: Streaming fails ❌
- Protocol doesn't matter: Whether I set HTTP/1.1, HTTP/2, or nothing at all - it works as long as I provide a custom Reactor Netty HttpClient
Even when I explicitly configured HTTP/2, the logs showed it gracefully falls back to HTTP/1.1 when connecting to LM Studio, and streaming works fine.
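For reference, the custom Reactor Netty HttpClient configuration I was testing looked roughly like this (a sketch only; it assumes the OpenAiApi builder's webClientBuilder(...) hook, and exposing the api object as a bean is purely illustrative — wire it into your ChatModel however your app builds it):

import java.time.Duration;

import org.springframework.ai.model.NoopApiKey;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;

import reactor.netty.http.client.HttpClient;

@Configuration
class LmStudioClientConfig {

    @Bean
    OpenAiApi openAiApi() {
        // Explicit Reactor Netty HttpClient backing the WebClient used by ChatClient.stream()
        HttpClient nettyHttpClient = HttpClient.create()
                .responseTimeout(Duration.ofSeconds(60));

        return OpenAiApi.builder()
                .baseUrl("http://localhost:1234")
                .apiKey(new NoopApiKey())
                .webClientBuilder(WebClient.builder()
                        .clientConnector(new ReactorClientHttpConnector(nettyHttpClient)))
                .build();
    }
}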
The Plot Twist 🤯
Here's where it gets CRAZY: I commented out ALL my custom configuration code to test something, and streaming still worked!
This led me to the real discovery...
The ACTUAL Solution (Mind-Blowingly Simple)
The entire issue was simply a missing dependency! Just add this to your pom.xml:
<dependency>
    <groupId>io.projectreactor.netty</groupId>
    <artifactId>reactor-netty-http</artifactId>
</dependency>
That's it! No custom configuration needed at all!
What Really Happened
- Spring AI requires Reactor Netty for streaming functionality
- Transitive dependency resolution sometimes doesn't include the correct version
- Missing/incompatible reactor-netty-http → Streaming fails with LM Studio
- Explicit dependency declaration → Correct version loaded → Everything works!
The Real Root Cause
The problem was never about:
- ❌ HTTP protocol versions (HTTP/1.1 vs HTTP/2)
- ❌ Custom HttpClient configuration
- ❌ LM Studio compatibility issues
- ❌ Blocking vs reactive clients
It was simply: Missing reactor-netty-http dependency ✅
Comment From: jspw
The fact is, Spring AI uses WebClient (spring-webflux), and WebClient relies on reactor-netty-http's HttpClient for streaming. But reactor-netty-http is an optional dependency of spring-webflux, so we have to add it explicitly.
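If you want to double-check your classpath, a quick probe like this (purely illustrative, not part of Spring AI) shows whether Reactor Netty's HttpClient is actually available:

import org.springframework.util.ClassUtils;

public class ReactorNettyCheck {

    public static void main(String[] args) {
        // Per the discussion above, ChatClient.stream() against LM Studio only worked
        // once reactor-netty-http (and thus Reactor Netty's HttpClient) was on the classpath.
        boolean present = ClassUtils.isPresent(
                "reactor.netty.http.client.HttpClient",
                ReactorNettyCheck.class.getClassLoader());
        System.out.println("reactor-netty-http on classpath: " + present);
    }
}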