Expected Behavior
I noticed that the chat.completions.create method in version 1.82.1 of the Python OpenAI package includes support for extra_headers, extra_query, and extra_body.
completion = client.chat.completions.create(
    model="qwen-plus-2025-04-28",
    messages=messages,
    extra_headers={
        "Authorization": f"Bearer {os.getenv('DASHSCOPE_API_KEY')}"
    },
    extra_body={"enable_thinking": True},
    stream=True
)
In OpenAiChatOptions, we already have the httpHeaders parameter, which can serve as an alternative to extra_headers. However, there are no corresponding alternatives for extra_query and extra_body.
Will Spring AI support these two parameters? Doing so might avoid the need for a lot of model-specific implementations.
Context https://pypi.org/project/openai/
Related https://github.com/spring-projects/spring-ai/issues/4324
Comment From: asaikali
Looks like extra_headers, extra_query, and extra_body are supported in the official Python SDK; you can find them in _base_client.py:
https://github.com/openai/openai-python/blob/db5c35049accb05f5fb03791ef9c12547fd309a7/src/openai/_base_client.py#L1841
I searched through the openai Python SDK codebase and found these parameters everywhere. Here is an example:
https://github.com/openai/openai-python/blob/db5c35049accb05f5fb03791ef9c12547fd309a7/src/openai/resources/chat/completions/completions.py#L119
Since a lot of providers offer OpenAI-compatible endpoints serving models whose settings have no equivalent in the official OpenAI API, spring-ai users can't pass those settings to these providers. For example, the very popular vLLM engine uses [Extra Parameters](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters) when serving models with options that cannot be represented by the OpenAI API.
While extra_headers, extra_query, and extra_body are not part of the official contract with api.openai.com, they are part of the official Python SDK. Adding these extra fields to spring-ai will make spring-ai usable against a wider range of models.
Comment From: markpollack
Thanks for hanging in there, folks — I’ve been iterating on a design to allow modification of the outgoing HTTP request, with the goal of improving compatibility across models and supporting the injection of arbitrary fields into the HTTP layer or request body.
As a summary, this functionality helps:
- Enable experimental fields (e.g., OpenAI-Experimental)
- Pass additional body fields for feature control or fine-tuning
- Integrate with gateway-style models (e.g., vLLM, which uses an extra-parameters convention)
- Accommodate vendor-specific extensions or naming patterns
The most prevalent use case so far is enriching requests to external models or hosted gateways that accept augmented fields beyond the standard OpenAI schema.
Here's a draft of a generic HTTP modification DSL, centered on a .http(...) method that allows structured request enrichment. I've implemented an initial version in a crude, minimal form as a POC to trace how the data flows from ChatClient down to the lower-level request construction, and will polish it based on feedback on the design.
The work also raised the point of allowing arbitrary key/value pairs to be added to the ChatOptions implementations, so that when a new option appears there is no need to update vendor-specific implementations of ChatOptions. That will be a separate issue/PR. I mention it because modifying the body here could achieve the same effect, but design-wise that use case should be addressed in the options.
ChatClient HTTP Enrichment Design
First Design pass contained in current WIP Branch: ExtraParameters POJO
NOTE: READ ON, A BETTER FLUENT API IS TO FOLLOW, THIS IS JUST EXPLAINING THE CURRENT WIP BRANCH
The WIP branch is here: https://github.com/markpollack/spring-ai/pull/new/extra-3
In the first version of the ChatClient DSL, per-request HTTP enrichment used an ExtraParameters object. This contained:
public class ExtraParameters {
    Map<String, String> headers;
    MultiValueMap<String, String> queryParams;
    Map<String, Object> bodyExtras;
}
Users passed this into .extra(...) as a consumer:
chatClient.prompt("...")
    .extra(extra -> extra
        .header("X-Foo", "bar")
        .queryParam("thinking", "true")
        .body("tool_call_budget", 5)
    )
    .call();
Limitations
- Mixed transport-level (headers/query) and payload-level logic
- Could not remove fields or handle complex mutation
Current Design Proposal for this issue: Fluent .http() API with extraBody(...)
Evolved into a Spring-style DSL where:
- Transport concerns live under .http(...)
- Payload augmentation uses .extraBody(...)
chatClient.prompt("...")
    .http(http -> http
        .header("X-Trace-ID", "abc")
        .param("thinking", "true")
        .extraBody("tool_call_budget", 5)
        .extraBody(new OpenAiOptions(true, 3))
    )
    .call();
Key Benefits
Feature | Benefit |
---|---|
.param(...) | Matches Spring MockMvc idioms |
.header(...) | Matches WebClient/RestClient style |
.extraBody(...) | Clearly conveys additive merge |
DSL Interface design
ChatClientRequestSpec http(Consumer<HttpRequestSpec> consumer);
and
public interface HttpRequestSpec {
    HttpRequestSpec header(String name, String value);
    HttpRequestSpec param(String name, String value);
    HttpRequestSpec extraBody(String key, Object value);
    HttpRequestSpec extraBody(Map<String, Object> values);
    HttpRequestSpec extraBody(Object pojo);
}
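To make the data flow concrete, here is a minimal sketch of how an implementation of this interface might simply record what the caller asked for and then merge the body extras into the typed request. It uses only the JDK. CollectingHttpRequestSpec, HttpSpecSketch, and mergeInto are hypothetical names, not part of Spring AI or the WIP branch, and the POJO overload is left out so no JSON library is needed. The rule that extras never overwrite a typed field is an assumption about what "additive merge" means.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical driver: runs the user's consumer against a collecting spec.
final class HttpSpecSketch {
    static CollectingHttpRequestSpec collect(Consumer<HttpRequestSpec> consumer) {
        CollectingHttpRequestSpec spec = new CollectingHttpRequestSpec();
        consumer.accept(spec);
        return spec;
    }

    public static void main(String[] args) {
        CollectingHttpRequestSpec spec = collect(http -> http
                .header("X-Trace-ID", "abc")
                .param("thinking", "true")
                .extraBody("tool_call_budget", 5));

        Map<String, Object> typedBody = new LinkedHashMap<>();
        typedBody.put("model", "qwen-plus");

        System.out.println(spec.headers);
        System.out.println(spec.params);
        System.out.println(spec.mergeInto(typedBody));
    }
}

// Trimmed copy of the proposed interface (POJO overload omitted).
interface HttpRequestSpec {
    HttpRequestSpec header(String name, String value);
    HttpRequestSpec param(String name, String value);
    HttpRequestSpec extraBody(String key, Object value);
    HttpRequestSpec extraBody(Map<String, Object> values);
}

// Hypothetical implementation that only records the requested enrichment.
final class CollectingHttpRequestSpec implements HttpRequestSpec {
    final Map<String, String> headers = new LinkedHashMap<>();
    final Map<String, String> params = new LinkedHashMap<>();
    final Map<String, Object> bodyExtras = new LinkedHashMap<>();

    @Override
    public HttpRequestSpec header(String name, String value) {
        headers.put(name, value);
        return this;
    }

    @Override
    public HttpRequestSpec param(String name, String value) {
        params.put(name, value);
        return this;
    }

    @Override
    public HttpRequestSpec extraBody(String key, Object value) {
        bodyExtras.put(key, value);
        return this;
    }

    @Override
    public HttpRequestSpec extraBody(Map<String, Object> values) {
        bodyExtras.putAll(values);
        return this;
    }

    // Assumption: "additive" means extras never overwrite a typed field.
    Map<String, Object> mergeInto(Map<String, Object> typedBody) {
        Map<String, Object> merged = new LinkedHashMap<>(typedBody);
        bodyExtras.forEach(merged::putIfAbsent);
        return merged;
    }
}
```

Keeping headers, query parameters, and body extras in separate maps is what lets a non-HTTP ChatModel ignore the first two while still honoring the third.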
Related follow-up issues
Forward-Compatible ChatOptions: extraFields
To support the reality that OpenAI (and other providers) frequently introduce new request fields before SDKs can adapt, we propose adding an extraFields map to ChatOptions implementations like OpenAiChatOptions.
Example:
public class OpenAiChatOptions {
    private final Map<String, Object> extraFields = new LinkedHashMap<>();

    public void add(String key, Object value) {
        extraFields.put(key, value);
    }

    @JsonAnyGetter
    public Map<String, Object> getExtraFields() {
        return extraFields;
    }
}
Usage
OpenAiChatOptions opts = new OpenAiChatOptions();
opts.add("thinking", true);
opts.add("tool_call_budget", 5);
This pattern allows developers to inject new fields immediately, without waiting for official support in Spring AI. These values are merged during JSON serialization via @JsonAnyGetter.
NOTE: This mechanism is the preferred place for model-specific configuration — not the general-purpose extraBody(...) API.
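The flattening effect can be illustrated without Jackson. The sketch below emulates by hand what the extraFields map is meant to achieve: extra entries end up as top-level keys next to the typed fields, with no nested "extraFields" object. SketchChatOptions and toRequestMap are hypothetical names, and rejecting keys that collide with typed fields is one possible policy assumed here, not something the proposal specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a ChatOptions class; not the real OpenAiChatOptions.
final class SketchChatOptions {
    // Keys owned by typed fields; extras may not shadow them (assumed policy).
    private static final Set<String> TYPED_KEYS = Set.of("model", "temperature");

    private final String model;
    private final Double temperature; // null means "not set"
    private final Map<String, Object> extraFields = new LinkedHashMap<>();

    SketchChatOptions(String model, Double temperature) {
        this.model = model;
        this.temperature = temperature;
    }

    SketchChatOptions add(String key, Object value) {
        if (TYPED_KEYS.contains(key)) {
            throw new IllegalArgumentException("Use the typed field for '" + key + "'");
        }
        extraFields.put(key, value);
        return this;
    }

    // Hand-written equivalent of a flat serialization: typed fields first,
    // then every extra entry as a sibling top-level key.
    Map<String, Object> toRequestMap() {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("model", model);
        if (temperature != null) {
            body.put("temperature", temperature);
        }
        body.putAll(extraFields);
        return body;
    }

    public static void main(String[] args) {
        SketchChatOptions opts = new SketchChatOptions("qwen-plus", 0.7)
                .add("thinking", true)
                .add("tool_call_budget", 5);
        // prints {model=qwen-plus, temperature=0.7, thinking=true, tool_call_budget=5}
        System.out.println(opts.toRequestMap());
    }
}
```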
Future Pathway: bodyCustomizer(...)
While extraBody(...) handles additive behavior, some rare cases (e.g. when certain models reject fields like temperature) require removal or mutation of the request body before serialization.
A future feature might expose a callback:
.http(http -> http
    .bodyCustomizer(body -> {
        if ("gpt-3.5".equals(body.get("model"))) {
            body.remove("temperature");
        }
    })
)
Use Cases
- Conditional field removal
- Per-model schema adaptation
- Field deduplication or normalization
- On-the-fly sanitization or transformation
This gives developers full control over the JSON payload while preserving type-safe defaults and allowing extraBody(...) for common additive use.
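As a rough illustration of the callback idea, the sketch below applies a customizer to a map-based view of the request body just before it would be serialized. It assumes the body is exposed as a mutable Map<String, Object>, which the proposal implies but does not define. BodyCustomizerSketch and customize are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

final class BodyCustomizerSketch {
    // Applies the callback to a copy so the caller's original map stays untouched.
    static Map<String, Object> customize(Map<String, Object> body,
                                         Consumer<Map<String, Object>> customizer) {
        Map<String, Object> copy = new LinkedHashMap<>(body);
        customizer.accept(copy);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("model", "gpt-3.5");
        body.put("temperature", 0.7);

        Map<String, Object> out = customize(body, b -> {
            // Conditional field removal, as in the callback example above.
            if ("gpt-3.5".equals(b.get("model"))) {
                b.remove("temperature");
            }
        });

        System.out.println(out); // prints {model=gpt-3.5}
    }
}
```

Working on a copy is a design choice of this sketch: it keeps default options reusable across requests even when a customizer removes fields.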
Future Pathway: HTTP Client Customization in ChatClient.Builder
Support to easily customize the underlying HTTP clients used by ChatClient for advanced behaviors like timeouts, interceptors, observability, and proxy setup.
DSL Example
ChatClient client = ChatClient.builder()
    .defaultWebClientBuilder(web -> web
        .defaultHeader("X-Env", "prod")
        .clientConnector(new ReactorClientHttpConnector(
            HttpClient.create().responseTimeout(Duration.ofSeconds(5))))
    )
    .defaultRestClientBuilder(rest -> rest
        .defaultHeader("X-Env", "prod")
        .requestFactory(new CustomClientHttpRequestFactory())
    )
    .build();
API Sketch
public interface ChatClient {
    interface Builder {
        Builder defaultWebClientBuilder(Consumer<WebClient.Builder> customizer);
        Builder defaultRestClientBuilder(Consumer<RestClient.Builder> customizer);
        ChatClient build();
        // ... other existing builder methods ...
    }
}
Use Cases
- Adding global headers for all requests
- Applying observability or retry filters
- Setting timeouts (e.g., via Netty)
- Injecting custom interceptors for debugging or instrumentation
.defaultWebClientBuilder(builder ->
    builder.clientConnector(new ReactorClientHttpConnector(
        HttpClient.create().responseTimeout(Duration.ofSeconds(5))
    ))
)
or
.defaultRestClientBuilder(builder ->
    // RestClient interceptors receive the request, the body bytes, and the execution chain
    builder.requestInterceptor((req, body, execution) -> {
        log.debug("Request: {}", req);
        return execution.execute(req, body);
    })
)
This configuration applies at client construction time, not per request. There could be shortcuts for common cases, e.g. logging interceptors and timeouts.
Comment From: tzolov
ChatClient provides support for non-HTTP LLMs as well.
Therefore http cannot be used in the ChatClient DSL vocabulary.
The same applies to WebClient and HttpClient. Those are implementation details of the backing ChatModels.
ChatClient can and should work with a ChatModel that internally uses gRPC, Avro RPC, CORBA or any other RPC mechanism.
Comment From: hu1045274719
When will this function be officially launched?
Comment From: Nuk3m1
My solution is to write a WebClient interceptor (implemented as a custom encoder) that converts the ChatCompletionRequest to JSON so that we can add extra parameters. Specifically, extend Jackson2JsonEncoder, then construct a new ExchangeStrategies instance and a WebClient instance with it. Here is the encoder class:

import java.util.Map;
import java.util.function.Function;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.core.ResolvableType;
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.core.io.buffer.DataBufferFactory;
import org.springframework.http.MediaType;
import org.springframework.http.codec.json.Jackson2JsonEncoder;
import org.springframework.util.MimeType;

public class ChatRequestEncoder extends Jackson2JsonEncoder {

    private final ObjectMapper objectMapper;
    private final Function<OpenAiApi.ChatCompletionRequest, Map<String, Object>> extraParameter;

    public ChatRequestEncoder(ObjectMapper objectMapper,
            Function<OpenAiApi.ChatCompletionRequest, Map<String, Object>> extraParameter) {
        super(objectMapper, MediaType.APPLICATION_JSON);
        this.objectMapper = objectMapper;
        this.extraParameter = extraParameter;
    }

    @Override
    public boolean canEncode(ResolvableType type, MimeType mimeType) {
        return super.canEncode(type, mimeType);
    }

    @Override
    public DataBuffer encodeValue(Object value,
            DataBufferFactory bufferFactory,
            ResolvableType resolvableType,
            MimeType mimeType,
            Map<String, Object> hints) {
        if (value instanceof OpenAiApi.ChatCompletionRequest request) {
            // Convert the typed request to a JSON tree so extra fields can be attached
            ObjectNode node = objectMapper.valueToTree(request);
            Map<String, Object> extraParameters = extraParameter.apply(request);
            if (extraParameters != null && !extraParameters.isEmpty()) {
                // The extras are nested under a top-level "thinking" field
                node.set("thinking", objectMapper.valueToTree(extraParameters));
            }
            return super.encodeValue(node, bufferFactory, ResolvableType.forInstance(node), mimeType, hints);
        }
        return super.encodeValue(value, bufferFactory, resolvableType, mimeType, hints);
    }
}

We can construct the ChatRequestEncoder with new ChatRequestEncoder(objectMapper, request -> Map.of("type", "enabled")). Finally, we can build the WebClient that is passed to the ChatClient using these strategies:

private static final ExchangeStrategies thinkingEnabledStrategies = ExchangeStrategies.builder()
    .codecs(c -> {
        c.defaultCodecs().jackson2JsonDecoder(new Jackson2JsonDecoder(mapper, MediaType.APPLICATION_JSON));
        // ChatRequestEncoderFactory is not shown here; presumably it wraps the constructor call above
        c.defaultCodecs().jackson2JsonEncoder(ChatRequestEncoderFactory.thinkingEnabled(mapper));
    })
    .build();
Comment From: YunKuiLu
Hi @markpollack @tzolov , can this issue be handled in 1.1.0 ?
Comment From: edwardcapriolo
{quote}
public class OpenAiChatOptions {
    private final Map<String, Object> extraFields = new LinkedHashMap<>();

    public void add(String key, Object value) {
        extraFields.put(key, value);
    }

    @JsonAnyGetter
    public Map<String, Object> getExtraFields() {
        return extraFields;
    }
}
{quote}
^ This is the right idea. Don't overthink it, do like the Python folks do with "extra_fields". If you look at the swagger from a popular server like vLLM, it is going to have "additionalProperties: true" all over the place. Today, I ran into this thing, qwen has thinking "on" by default and the
You can't make the problem type safe. You can't stay in sync :) I look at langchain/langchain4j/springai and the vLLM docs :) whatever, they are all missing stuff.
The solution above with the webclient interceptor is involved. If anything, I would just turn OpenAiChatOptions into GenericChatOptions or UberChatOptions.
We are already pretending this is that. vLLM is not OpenAI, etc. vLLM doesn't support store:true. Maintaining a vLLM subset in a project just doesn't make sense to me. Only the vLLM people would be authoritative about the features, and if they don't care about Java they won't work on upkeep. Already I can't run OpenAPI codegen against vLLM, because their swagger has constructs that Java doesn't support (the default for an object being '[]').
It is hard to be fighting an uphill Java AI battle and also be trying to track every property and then get it into a release. (Even OpenAI has extra_fields.)
Also, it's very hard to read all the docs and ever figure out "can I set this, can I set that": this thing takes ".variables(Map)", that thing takes ".params()". New users sink a lot of time and then end up on a ticket like this, only to realize it is impossible to do without coming at it sideways (gutting the object mapper while intercepting HTTP requests, yikes!).
Guess what? vLLM now has reasoning_effort: string, so gotta include that, etc. etc.