Bug description
Unable to use the embedding functionality when integrating Spring AI with VLLM through the OpenAI API format. The issue appears to be related to HTTP protocol version compatibility.
Environment
- Spring AI: 1.0.0-M4
- JDK: 22
- VLLM: 0.6.6.post1
Steps to reproduce
1. Set up the VLLM server:
vllm serve Alibaba-NLP/gte-Qwen2-1.5B-instruct --task embed --tokenizer Alibaba-NLP/gte-Qwen2-1.5B-instruct
2. Configure Spring AI to use the VLLM endpoint.
3. Attempt to use the embedding functionality:
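import java.util.List;
import java.util.Map;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/ai")
public class EmbeddingController {

    private final EmbeddingModel embeddingModel;

    @Autowired
    public EmbeddingController(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    // Embeds the given message and returns the raw embedding response.
    @GetMapping("/embedding")
    public Map<String, EmbeddingResponse> embed(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        EmbeddingResponse embeddingResponse = this.embeddingModel.embedForResponse(List.of(message));
        return Map.of("embedding", embeddingResponse);
    }
}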
Expected behavior
Return embedding results.
Current behavior
When sending requests to VLLM through Spring AI's OpenAI API implementation:
1. VLLM logs:
WARNING: Unsupported upgrade request.
INFO: - "POST /v1/embeddings HTTP/1.1" 400 Bad Request
2. Spring AI throws:
org.springframework.ai.retry.NonTransientAiException: 400 - {
  "object": "error",
  "message": "[{'type': 'missing', 'loc': ('body',), 'msg': 'Field required', 'input': None}]",
  "type": "BadRequestError",
  "param": null,
  "code": 400
}
The request being sent contains:
{
  "method": "POST",
  "headers": {
    "Connection": "Upgrade, HTTP2-Settings",
    "Host": "127.0.0.1:8080",
    "Http2-Settings": "AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA",
    "Transfer-Encoding": "chunked",
    "Upgrade": "h2c",
    "User-Agent": "Java-http-client/22.0.1",
    "Authorization": "Bearer test",
    "Content-Type": "application/json"
  }
}
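The Http2-Settings header in this request is what marks it as an HTTP/2 cleartext (h2c) upgrade attempt: per RFC 7540 section 3.2.1 its value is a base64url-encoded SETTINGS payload, a sequence of 6-byte entries (16-bit identifier, 32-bit value). The following is a minimal sketch for decoding it with the JDK only; the class name Http2SettingsDecoder is hypothetical and not part of Spring AI or VLLM.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: decodes the value of an HTTP2-Settings upgrade header.
final class Http2SettingsDecoder {

    // Setting names for identifiers 0x1..0x6 as defined in RFC 7540, section 6.5.2.
    private static final String[] NAMES = {
        "HEADER_TABLE_SIZE", "ENABLE_PUSH", "MAX_CONCURRENT_STREAMS",
        "INITIAL_WINDOW_SIZE", "MAX_FRAME_SIZE", "MAX_HEADER_LIST_SIZE"
    };

    static Map<String, Long> decode(String headerValue) {
        byte[] payload = Base64.getUrlDecoder().decode(headerValue);
        if (payload.length % 6 != 0) {
            throw new IllegalArgumentException("SETTINGS payload must be a multiple of 6 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(payload); // ByteBuffer is big-endian by default
        Map<String, Long> settings = new LinkedHashMap<>();
        while (buf.remaining() >= 6) {
            int id = buf.getShort() & 0xFFFF;          // unsigned 16-bit identifier
            long value = buf.getInt() & 0xFFFFFFFFL;   // unsigned 32-bit value
            String name = (id >= 1 && id <= NAMES.length) ? NAMES[id - 1] : "UNKNOWN_" + id;
            settings.put(name, value);
        }
        return settings;
    }

    public static void main(String[] args) {
        // Header value copied from the request dump above.
        decode("AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA")
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

For the value captured above this yields HEADER_TABLE_SIZE = 16384, ENABLE_PUSH = 1, MAX_CONCURRENT_STREAMS = 100, INITIAL_WINDOW_SIZE = 16777216 and MAX_FRAME_SIZE = 16384, which confirms that the client was offering an HTTP/2 upgrade on a plain HTTP/1.1 POST.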
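Current workaround
I've implemented a temporary solution by forcing HTTP/1.1:
import java.net.http.HttpClient;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.web.client.RestClient;

@Configuration
public class RestClientCustomizer {

    // Primary RestClient.Builder backed by a JDK HttpClient pinned to HTTP/1.1,
    // so no h2c upgrade headers are sent to the VLLM server.
    @Bean
    @Primary
    public RestClient.Builder customRestClientBuilder() {
        HttpClient httpClient = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        return RestClient.builder()
                .requestFactory(new JdkClientHttpRequestFactory(httpClient));
    }
}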
🤔 Perhaps a better solution would be for VLLM to support HTTP/2 upgrade requests on its side.
Comment From: dev-jonghoonpark
@icyclv Which vector database are you seeing this issue with? In my case, it occurs with Chroma.
Comment From: icyclv
@dev-jonghoonpark I am using Elasticsearch as the vector database, but the issue does not occur when storing to the database. It occurs earlier, when the text is embedded into vectors.
Comment From: dev-jonghoonpark
ChromaVectorStoreAutoConfiguration.java
When a ChromaApi instance is created, RestClient.builder() is used as follows:
var chromaApi = new ChromaApi(chromaUrl, restClientBuilderProvider.getIfAvailable(RestClient::builder), objectMapper);
RestClient.builder() returns a DefaultRestClientBuilder:
static Builder builder() {
    return new DefaultRestClientBuilder();
}
DefaultRestClientBuilder.java
When the build() method of DefaultRestClientBuilder is invoked, it calls the initRequestFactory() method:
private ClientHttpRequestFactory initRequestFactory() {
    if (this.requestFactory != null) {
        return this.requestFactory;
    } else if (httpComponentsClientPresent) {
        return new HttpComponentsClientHttpRequestFactory();
    } else if (jettyClientPresent) {
        return new JettyClientHttpRequestFactory();
    } else if (reactorNettyClientPresent) {
        return new ReactorClientHttpRequestFactory();
    } else {
        return (ClientHttpRequestFactory)(jdkClientPresent ? new JdkClientHttpRequestFactory() : new SimpleClientHttpRequestFactory());
    }
}
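Finally, when no request factory has been set explicitly and none of the other client libraries is on the classpath, a JdkClientHttpRequestFactory instance is selected.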
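The selection order in initRequestFactory() can be mirrored in plain Java to check which factory a given classpath would end up with. This is only a sketch: the class RequestFactoryProbe and its methods are hypothetical, it ignores an explicitly configured requestFactory, and the marker class names passed to isPresent are my assumption about what Spring probes for, so they should be verified against the Spring Framework version in use.

```java
// Hypothetical diagnostic helper, not part of Spring.
final class RequestFactoryProbe {

    // Returns true if the named class can be loaded (without initializing it).
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, RequestFactoryProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError ex) {
            return false;
        }
    }

    // Same precedence as the initRequestFactory() listing above,
    // for the case where no request factory was set explicitly.
    static String select(boolean httpComponents, boolean jetty, boolean reactorNetty, boolean jdk) {
        if (httpComponents) {
            return "HttpComponentsClientHttpRequestFactory";
        } else if (jetty) {
            return "JettyClientHttpRequestFactory";
        } else if (reactorNetty) {
            return "ReactorClientHttpRequestFactory";
        }
        return jdk ? "JdkClientHttpRequestFactory" : "SimpleClientHttpRequestFactory";
    }

    public static void main(String[] args) {
        // Assumed marker classes for each client library.
        String chosen = select(
                isPresent("org.apache.hc.client5.http.impl.classic.HttpClients"),
                isPresent("org.eclipse.jetty.client.HttpClient"),
                isPresent("reactor.netty.http.client.HttpClient"),
                isPresent("java.net.http.HttpClient"));
        System.out.println("Would select: " + chosen);
    }
}
```

Running main inside the affected application should show which branch applies there. If the order holds, having another supported client library on the classpath would select a different factory than the JDK one.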
JdkClientHttpRequestFactory.java
The JdkClientHttpRequestFactory uses an HttpClient, which is created by calling HttpClient.newHttpClient(). By default, this client prefers HTTP/2.
Refer to HttpClient.html#newHttpClient() for more details:
The default settings include: the "GET" request method, a preference of HTTP/2, a redirection policy of NEVER, the default proxy selector, and the default SSL context.
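The HTTP/2 preference can be confirmed without Spring by asking the JDK client which version it prefers. A small sketch, using only java.net.http; the class name HttpVersionCheck is hypothetical:

```java
import java.net.http.HttpClient;

// Hypothetical check: compares a default JDK client with one pinned to HTTP/1.1.
final class HttpVersionCheck {

    // Client with the JDK's default settings.
    static HttpClient defaultClient() {
        return HttpClient.newHttpClient();
    }

    // Client pinned to HTTP/1.1, as in the workaround from the issue description.
    static HttpClient http11Client() {
        return HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
    }

    public static void main(String[] args) {
        System.out.println("default: " + defaultClient().version());
        System.out.println("pinned:  " + http11Client().version());
    }
}
```

The default client reports HTTP_2 and the pinned one HTTP_1_1. With the HTTP/2 preference, a plain http:// request is sent as HTTP/1.1 with h2c upgrade headers, which matches the request dump in the issue description.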
Comment From: reneleonhardt
@icyclv vllm chose uvicorn as its ASGI server, which is HTTP/1 only (the roadmap lists HTTP/2 and HTTP/3). If they switched to a more modern ASGI server such as hypercorn or granian, they could serve both HTTP/1 and HTTP/2 (hypercorn even supports HTTP/3): https://fastapi.tiangolo.com/deployment/manually/#asgi-servers
I can't find an issue or pull request regarding uvicorn or ASGI or HTTP/2 (they will never tackle their backlog 🙈😅). @dev-jonghoonpark maybe you want to open a feature request to migrate to granian? hypercorn seems too unstable when I look at their issue backlog.
Comment From: zachary-zhaoqi
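There is a temporary workaround: put nginx in front as a proxy so that HTTP/2 requests are converted to HTTP/1.1 before being forwarded to Uvicorn:
server {
    listen 443 ssl;
    http2 on;
    location / {
        proxy_pass http://localhost:8000;  # forward to Uvicorn
        proxy_http_version 1.1;            # force HTTP/1.1
    }
}
With vllm serving qwen2.5-32B, the connection works with Spring AI M6; both call and stream succeed.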
Comment From: kangnn
Did you change only the nginx proxy described above, and use the OpenAI interface directly in Java, like this?
@Bean
public ChatClient chatClient(OpenAiChatModel model) {
    return ChatClient
            .builder(model)
            .build();
}
Usage:
@RequestMapping(value = "/chat", produces = "text/html;charset=utf-8")
public Flux<String> chat(String prompt) {
    return chatClient.prompt()
            .user(prompt)
            .stream()
            .content();
}
Comment From: Dudu0831
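I found a new solution that avoids the hassle of converting through nginx. It only needs the following configuration:
Then inject the new client where it is used.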
Comment From: xyombo
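My solution: define custom RestClient and WebClient builders directly. This is the safer way to configure it, because I still haven't figured out in which cases RestClient or WebClient is used, but at least it runs now :)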
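import java.net.http.HttpClient;

import lombok.extern.slf4j.Slf4j;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.http.client.reactive.JdkClientHttpConnector;
import org.springframework.web.client.RestClient;
import org.springframework.web.reactive.function.client.WebClient;

@Configuration
@Slf4j
public class HttpClientConfiguration {

    // Builder for blocking calls, pinned to HTTP/1.1.
    @Bean
    public RestClient.Builder customRestClientBuilder() {
        HttpClient httpClient = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build();
        log.debug("Using custom RestClient.Builder with HTTP/1.1");
        return RestClient.builder().requestFactory(new JdkClientHttpRequestFactory(httpClient));
    }

    // Builder for reactive calls, pinned to HTTP/1.1.
    @Bean
    public WebClient.Builder customWebClientBuilder() {
        HttpClient httpClient = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build();
        log.debug("Using custom WebClient.Builder with HTTP/1.1");
        return WebClient.builder().clientConnector(new JdkClientHttpConnector(httpClient));
    }
}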