Bug description: With Xinference 1.4.1 serving the LLM qwen2-instruct, requests from the Python openai client work fine, but chat requests made through spring-ai-openai-spring-boot-starter fail with an error.

Xinference logs

    2025-04-06 15:48:51 xinference |     return await dependant.call(**values)
    2025-04-06 15:48:51 xinference |   File "/usr/local/lib/python3.10/dist-packages/xinference/api/restful_api.py", line 1945, in create_chat_completion
    2025-04-06 15:48:51 xinference |     raw_body = await request.json()
    2025-04-06 15:48:51 xinference |   File "/usr/local/lib/python3.10/dist-packages/starlette/requests.py", line 252, in json
    2025-04-06 15:48:51 xinference |     self._json = json.loads(body)
    2025-04-06 15:48:51 xinference |   File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    2025-04-06 15:48:51 xinference |     return _default_decoder.decode(s)
    2025-04-06 15:48:51 xinference |   File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    2025-04-06 15:48:51 xinference |     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    2025-04-06 15:48:51 xinference |   File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    2025-04-06 15:48:51 xinference |     raise JSONDecodeError("Expecting value", s, err.value) from None
    2025-04-06 15:48:51 xinference | json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Environment: Spring AI version 1.0.0-M6, Java version 17

Java demo

private static final String DEFAULT_PROMPT = "你好,介绍下你自己吧。"; // "Hello, please introduce yourself."
private static final String JSON_OUTPUT_PROMPT = "how can I solve 8x + 7 = -23";

private final ChatModel openAiChatModel;

public OpenAiChatModelController(ChatModel chatModel) {
    this.openAiChatModel = chatModel;
}

/**
 * The simplest usage, with no LLM parameters injected.
 *
 * @return String types.
 */
@GetMapping("/simple/chat")
public String simpleChat() {
    return openAiChatModel.call(new Prompt(DEFAULT_PROMPT)).getResult().getOutput().getText();
}

yml

spring:
  application:
    name: spring-ai-alibaba-openai-chat-model-example
  ai:
    openai:
      api-key: "not empty"
      base-url: http://192.168.3.100:9997/
      # Example configuration compatible with other OpenAI-format LLMs
      chat:
        options:
          # Model ID; replace with the actual endpoint ID
          model: qwen2-instruct

Java error logs

    org.springframework.ai.retry.TransientAiException: 500 - Internal Server Error: Expecting value: line 1 column 1 (char 0)
        at org.springframework.ai.autoconfigure.retry.SpringAiRetryAutoConfiguration$2.handleError(SpringAiRetryAutoConfiguration.java:108) ~[spring-ai-spring-boot-autoconfigure-1.0.0-M6.jar:1.0.0-M6]
        at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.StatusHandler.lambda$fromErrorHandler$1(StatusHandler.java:71) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.StatusHandler.handle(StatusHandler.java:146) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.applyStatusHandlers(DefaultRestClient.java:823) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.lambda$readBody$4(DefaultRestClient.java:812) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient.readWithMessageConverters(DefaultRestClient.java:215) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.readBody(DefaultRestClient.java:811) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.lambda$toEntityInternal$2(DefaultRestClient.java:767) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:571) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchange(DefaultRestClient.java:532) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.RestClient$RequestHeadersSpec.exchange(RestClient.java:677) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.executeAndExtract(DefaultRestClient.java:806) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.toEntityInternal(DefaultRestClient.java:766) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.toEntity(DefaultRestClient.java:755) ~[spring-web-6.2.0.jar:6.2.0]
        at org.springframework.ai.openai.api.OpenAiApi.chatCompletionEntity(OpenAiApi.java:257) ~[spring-ai-openai-1.0.0-M6.jar:1.0.0-M6]

Comment From: dev-jonghoonpark

Could you provide the HTTP request bodies for both the Python OpenAI client and Spring AI?

Comment From: TalkIsCheapGiveMeMoney

Python openai client request (captured):

    Hypertext Transfer Protocol
        POST /v1/chat/completions HTTP/1.1\r\n
            Request Method: POST
            Request URI: /v1/chat/completions
            Request Version: HTTP/1.1
        Host: localhost:9997\r\n
        Accept-Encoding: gzip, deflate\r\n
        Connection: keep-alive\r\n
        Accept: application/json\r\n
        Content-Type: application/json\r\n
        User-Agent: OpenAI/Python 1.70.0\r\n
        X-Stainless-Lang: python\r\n
        X-Stainless-Package-Version: 1.70.0\r\n
        X-Stainless-OS: Windows\r\n
        X-Stainless-Arch: other:amd64\r\n
        X-Stainless-Runtime: CPython\r\n
        X-Stainless-Runtime-Version: 3.11.9\r\n
        Authorization: Bearer not empty\r\n
        X-Stainless-Async: false\r\n
        x-stainless-retry-count: 0\r\n
        x-stainless-read-timeout: 600\r\n
        Content-Length: 95\r\n
        \r\n
        [Response in frame: 61]
        [Full request URI: http://localhost:9997/v1/chat/completions]
        File Data: 95 bytes
    JavaScript Object Notation: application/json

JSON raw form:
    {
        "messages": [
            {
                "content": "你是谁",
                "role": "user"
            }
        ],
        "model": "qwen2-instruct",
        "max_tokens": 1024
    }

Comment From: TalkIsCheapGiveMeMoney

Spring AI request (captured):

    Hypertext Transfer Protocol, has 2 chunks (including last chunk)
        POST /v1/chat/completions HTTP/1.1\r\n
            Request Method: POST
            Request URI: /v1/chat/completions
            Request Version: HTTP/1.1
        Connection: Upgrade, HTTP2-Settings\r\n
        Host: 192.168.3.100:9997\r\n
        HTTP2-Settings: AAEAAEAAAAIAAAAAAAMAAAAAAAQBAAAAAAUAAEAAAAYABgAA\r\n
            Settings - Header table size : 16384
                Settings Identifier: Header table size (1)
                Header table size: 16384
            Settings - Enable PUSH : 0
                Settings Identifier: Enable PUSH (2)
                Enable PUSH: 0
            Settings - Max concurrent streams : 0
                Settings Identifier: Max concurrent streams (3)
                Max concurrent streams: 0
            Settings - Initial Windows size : 16777216
                Settings Identifier: Initial Windows size (4)
                Initial Window Size: 16777216
            Settings - Max frame size : 16384
                Settings Identifier: Max frame size (5)
                Max frame size: 16384
            Settings - Max header list size : 393216
                Settings Identifier: Max header list size (6)
                Max header list size: 393216
        Transfer-encoding: chunked\r\n
        Upgrade: h2c\r\n
        User-Agent: Java-http-client/17.0.14\r\n
        Authorization: Bearer not empty\r\n
        Content-Type: application/json\r\n
        \r\n
        [Full request URI: http://192.168.3.100:9997/v1/chat/completions]
        HTTP chunked response
        File Data: 143 bytes
    JavaScript Object Notation: application/json

JSON raw form:
    {
        "messages": [
            {
                "content": "你好,介绍下你自己!",
                "role": "user"
            }
        ],
        "model": "qwen2-instruct",
        "stream": false,
        "temperature": 0.7,
        "top_p": 0.7
    }

Comment From: dev-jonghoonpark

Does the same issue occur even after excluding "stream", "temperature", and "top_p"?
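
One way to check that independently of Spring AI would be to replay the minimal Python-style body with the plain JDK HttpClient. This is only a rough sketch, not the Spring AI code path itself; the endpoint, key placeholder, and body are taken from the captures in this thread:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class MinimalChatRequest {
    public static void main(String[] args) throws Exception {
        // Same minimal body as the working Python request: no stream/temperature/top_p.
        String body = """
                {"messages":[{"content":"你是谁","role":"user"}],"model":"qwen2-instruct","max_tokens":1024}""";

        // Note: this client also defaults to HTTP/2 (h2c upgrade over plain http),
        // matching the Java-http-client request seen in the Spring AI capture above.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://192.168.3.100:9997/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer not empty")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}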

Comment From: TalkIsCheapGiveMeMoney

The Xinference 1.4.1 API does not support HTTP/2, but Spring AI sends its requests to the Xinference API as HTTP/2 (an h2c upgrade over plain HTTP).

Comment From: TalkIsCheapGiveMeMoney

Spring AI uses jdk.internal.net.http.HttpClientImpl (the JDK's built-in java.net.http.HttpClient) under the hood.
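
For context, that JDK client defaults to HTTP/2 and, for plain http:// URLs, sends an HTTP/1.1 request carrying Upgrade: h2c and HTTP2-Settings headers, which is exactly what the Spring AI capture above shows. A small probe (just a sketch; the /v1/models path is only assumed here as a lightweight OpenAI-compatible endpoint) prints which version the server actually ends up on with and without the upgrade:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class HttpVersionProbe {
    public static void main(String[] args) throws Exception {
        URI uri = URI.create("http://192.168.3.100:9997/v1/models"); // assumed OpenAI-compatible endpoint
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();

        // Default client: Version.HTTP_2, so an "Upgrade: h2c" request is sent first.
        probe(HttpClient.newHttpClient(), request, "default (HTTP_2)");

        // Client pinned to HTTP/1.1: no upgrade headers at all.
        probe(HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build(), request, "pinned HTTP_1_1");
    }

    private static void probe(HttpClient client, HttpRequest request, String label) throws Exception {
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(label + " -> negotiated " + response.version() + ", status " + response.statusCode());
    }
}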

Comment From: dev-jonghoonpark

If that is indeed the issue, as you mentioned, I think we can resolve it with the approach below. Would you like to take a look?

import java.net.http.HttpClient;

import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.web.client.RestClient;

// Pin the JDK HttpClient to HTTP/1.1 so no h2c upgrade is attempted.
HttpClient httpClient = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build();
JdkClientHttpRequestFactory jdkClientHttpRequestFactory = new JdkClientHttpRequestFactory(httpClient);
final RestClient.Builder builder = RestClient.builder().requestFactory(jdkClientHttpRequestFactory);

// Build OpenAiApi on top of the HTTP/1.1-only RestClient.Builder.
OpenAiApi openAiApi = OpenAiApi.builder()
        .baseUrl("{YOUR_BASE_URL}")
        .apiKey("{YOUR_API_KEY}")
        .restClientBuilder(builder)
        .build();

I referred to the code in the following GitHub issue: https://github.com/spring-projects/spring-framework/issues/33275
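
For the starter-based setup in this issue (where the chat model is auto-configured from yml rather than built by hand), a RestClientCustomizer bean may achieve the same effect by pinning the auto-configured RestClient.Builder to HTTP/1.1. This is an untested sketch under that assumption; whether the Spring AI auto-configuration honors the customizer can depend on the exact milestone version:

import java.net.http.HttpClient;

import org.springframework.boot.web.client.RestClientCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.JdkClientHttpRequestFactory;

@Configuration
public class Http11ClientConfig {

    // Pin the JDK HttpClient to HTTP/1.1 so no h2c upgrade is attempted
    // against servers (like Xinference 1.4.1) that do not handle it.
    @Bean
    RestClientCustomizer http11RestClientCustomizer() {
        HttpClient httpClient = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        return restClientBuilder -> restClientBuilder
                .requestFactory(new JdkClientHttpRequestFactory(httpClient));
    }
}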

Comment From: TalkIsCheapGiveMeMoney

Yes, it works.

Comment From: dev-jonghoonpark

Related issue : https://github.com/spring-projects/spring-ai/issues/2042

Comment From: dev-jonghoonpark

@TalkIsCheapGiveMeMoney The issue seems to be resolved. Please close it.