Expected Behavior

I would like Spring AI to support the Anthropic Claude models available on the Google Cloud Vertex AI platform, allowing seamless access to these models through Vertex AI.

Current Behavior

While Spring AI currently supports Anthropic Claude models, it does not support the versions hosted specifically on Google Cloud Vertex AI. This limitation blocks integration for users who rely on Vertex AI for their AI infrastructure.

Context

This issue affects our ability to use the Anthropic Claude models hosted on Vertex AI, as we rely on this platform for our AI deployments. We would like direct support to streamline our model integration process without resorting to workarounds.

Comment From: sleepo581

Bump

Comment From: andruhon

This can be worked around with a custom `ChatModel` implementation.

Proof of concept (Kotlin):

package org.example

import com.fasterxml.jackson.annotation.JsonInclude
import com.fasterxml.jackson.annotation.JsonProperty
import com.fasterxml.jackson.databind.ObjectMapper
import com.google.api.HttpBody
import com.google.cloud.aiplatform.v1.PredictionServiceClient
import com.google.protobuf.ByteString
import org.springframework.ai.chat.messages.AssistantMessage
import org.springframework.ai.chat.messages.MessageType
import org.springframework.ai.chat.model.ChatModel
import org.springframework.ai.chat.model.ChatResponse
import org.springframework.ai.chat.model.Generation
import org.springframework.ai.chat.prompt.Prompt

/**
 * Custom Spring AI ChatModel implementation for Anthropic's Claude models
 * hosted on Google Cloud Vertex AI as a "Publisher Model".
 *
 * This class acts as a bridge, wrapping the Google Cloud Java SDK's
 * PredictionServiceClient and translating between Spring AI's Prompt/ChatResponse
 * objects and the specific JSON payload required by the Vertex AI rawPredict endpoint.
 */
class VertexAiAnthropicChatModel(
    private val predictionServiceClient: PredictionServiceClient,
    private val objectMapper: ObjectMapper,
    private val modelEndpoint: String,
    private val defaultOptions: VertexAiAnthropicChatOptions
) : ChatModel {

    companion object {
        private const val ANTHROPIC_VERTEX_VERSION = "vertex-2023-10-16"
    }

    /**
     * Wrapping `:rawPredict` is the crucial part. It is possible to reach the correct
     * URL over REST transport by setting the model name to
     * `projects/YOURPROJECT/locations/global/publishers/anthropic/models/claude-sonnet-4-5@20250929`,
     * but that alone is not enough, because the Anthropic publisher endpoint does not
     * expose `:generateContent`.
     */
    override fun call(prompt: Prompt): ChatResponse {
        try {
            // 1. Create the Vertex AI-specific Request POJO
            val request = createVertexRequest(prompt)

            // 2. Serialize POJO to JSON string
            val requestJson = objectMapper.writeValueAsString(request)

            // 3. Build the Google SDK HttpBody
            val httpBody = HttpBody.newBuilder()
                .setContentType("application/json")
                .setData(ByteString.copyFromUtf8(requestJson))
                .build()

            // 4. Call the SDK's rawPredict method
            val responseBody = predictionServiceClient.rawPredict(modelEndpoint, httpBody)

            // 5. Deserialize the JSON response
            val responseJson = responseBody.data.toStringUtf8()
            val response = objectMapper.readValue(responseJson, AnthropicVertexResponse::class.java)

            // 6. Translate back to Spring AI's ChatResponse
            return createChatResponse(response)

        } catch (e: Exception) {
            throw RuntimeException("Failed to call Vertex AI Anthropic model: ${e.message}", e)
        }
    }

    private fun createVertexRequest(prompt: Prompt): AnthropicVertexRequest {
        // Merge default options with runtime options
        var options = defaultOptions
        if (prompt.options != null && prompt.options is VertexAiAnthropicChatOptions) {
            val runtimeOptions = prompt.options as VertexAiAnthropicChatOptions
            options = VertexAiAnthropicChatOptions(
                temperature = runtimeOptions.temperature ?: options.temperature,
                maxTokens = runtimeOptions.maxTokens ?: options.maxTokens,
                topK = runtimeOptions.topK ?: options.topK,
                topP = runtimeOptions.topP ?: options.topP
            )
        }

        // Extract system message if present
        val systemPrompt = prompt.instructions
            .firstOrNull { it.messageType == MessageType.SYSTEM }
            ?.text

        // Translate Spring AI Messages to Anthropic Messages (excluding system messages)
        val messages = prompt.instructions
            .filter { it.messageType != MessageType.SYSTEM }
            .map { msg ->
                AnthropicVertexRequest.Message(
                    role = msg.messageType.value,
                    content = msg.text
                )
            }

        return AnthropicVertexRequest(
            anthropic_version = ANTHROPIC_VERTEX_VERSION,
            system = systemPrompt,
            messages = messages,
            max_tokens = options.maxTokens,
            temperature = options.temperature,
            top_k = options.topK,
            top_p = options.topP
        )
    }

    private fun createChatResponse(response: AnthropicVertexResponse): ChatResponse {
        val generations = response.content.map { contentBlock ->
            Generation(AssistantMessage(contentBlock.text))
        }
        return ChatResponse(generations)
    }

    // --- Internal POJOs for JSON Serialization ---

    data class AnthropicVertexRequest(
        val anthropic_version: String,
        @field:JsonInclude(JsonInclude.Include.NON_NULL)
        val system: String?,
        val messages: List<Message>,
        val max_tokens: Int,
        @field:JsonInclude(JsonInclude.Include.NON_NULL)
        val temperature: Double?,
        @field:JsonInclude(JsonInclude.Include.NON_NULL)
        val top_k: Int?,
        @field:JsonInclude(JsonInclude.Include.NON_NULL)
        val top_p: Double?
    ) {
        data class Message(val role: String, val content: String)
    }

    data class AnthropicVertexResponse(
        val content: List<ContentBlock>,
        val id: String,
        val model: String,
        val role: String,
        val stop_reason: String,
        val type: String,
        val usage: Usage
    ) {
        data class ContentBlock(val text: String, val type: String)
        data class Usage(
            @JsonProperty("input_tokens") val inputTokens: Int,
            @JsonProperty("output_tokens") val outputTokens: Int
        )
    }
}

/**
 * Options class for VertexAI Anthropic chat model
 */
data class VertexAiAnthropicChatOptions(
    private val temperature: Double? = null,
    private val maxTokens: Int = 4096,
    private val topK: Int? = null,
    private val topP: Double? = null
) : org.springframework.ai.chat.prompt.ChatOptions {

    companion object {
        @JvmStatic
        fun builder() = Builder()
    }

    class Builder {
        private var temperature: Double? = null
        private var maxTokens: Int = 4096
        private var topK: Int? = null
        private var topP: Double? = null

        fun withTemperature(temp: Double?) = apply { this.temperature = temp }
        fun withMaxTokens(max: Int) = apply { this.maxTokens = max }
        fun withTopK(topK: Int?) = apply { this.topK = topK }
        fun withTopP(topP: Double?) = apply { this.topP = topP }
        fun build() = VertexAiAnthropicChatOptions(temperature, maxTokens, topK, topP)
    }

    override fun getModel(): String? = null
    override fun getFrequencyPenalty(): Double? = null
    override fun getMaxTokens(): Int = maxTokens
    override fun getPresencePenalty(): Double? = null
    override fun getStopSequences(): MutableList<String>? = null
    override fun getTemperature(): Double? = temperature
    override fun getTopK(): Int? = topK
    override fun getTopP(): Double? = topP

    @Suppress("UNCHECKED_CAST")
    override fun <T : org.springframework.ai.chat.prompt.ChatOptions> copy(): T {
        return VertexAiAnthropicChatOptions(temperature, maxTokens, topK, topP) as T
    }
}

Usage

        // ... set up other necessary stuff        

        val modelEndpoint = "projects/$projectId/locations/$location/publishers/anthropic/models/$modelId"

        // Create default options
        // Note: Claude on Vertex AI doesn't allow both temperature and top_p
        val defaultOptions = VertexAiAnthropicChatOptions.builder()
            .withTemperature(null) // Set to null to omit from request
            .withMaxTokens(4096)
            .withTopP(null) // Set to null to omit from request
            .withTopK(null) // Set to null to omit from request
            .build()

        val model = VertexAiAnthropicChatModel(
            predictionServiceClient,
            objectMapper,
            modelEndpoint,
            defaultOptions
        )

        val chatClient = ChatClient.create(model)
        println("Testing Anthropic Claude on Vertex AI...")

        val response = chatClient.prompt()
            .user("What is the circumference of earth in meters?")
            .call()
            .content()

        println("Response: $response")

Note that this is not a complete solution (streaming, for example, is not implemented), but it works.
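As a hint for the "other necessary stuff" elided above: the `PredictionServiceClient` has to be created against the regional Vertex AI API endpoint that serves Claude models (via `PredictionServiceSettings.newBuilder().setEndpoint(...)`). The helper names below (`vertexApiEndpoint`, `anthropicModelEndpoint`) are hypothetical, shown only to illustrate the two string formats involved:

```kotlin
// Hypothetical helpers illustrating the string formats used when wiring the client.

// The regional API host the PredictionServiceClient must target,
// since Claude publisher models are only served from specific regions.
fun vertexApiEndpoint(location: String): String =
    "$location-aiplatform.googleapis.com:443"

// The publisher-model resource name passed to rawPredict.
fun anthropicModelEndpoint(projectId: String, location: String, modelId: String): String =
    "projects/$projectId/locations/$location/publishers/anthropic/models/$modelId"

fun main() {
    println(vertexApiEndpoint("us-east5"))
    println(anthropicModelEndpoint("my-project", "us-east5", "claude-sonnet-4-5@20250929"))
}
```

The first value would be passed to `PredictionServiceSettings.newBuilder().setEndpoint(...)` before `PredictionServiceClient.create(settings)`; the second is the `modelEndpoint` constructor argument shown above. Authentication is assumed to come from application-default credentials.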

Comment From: iwauo

I have built experimental support for this feature based on the existing ChatModel. I will submit a PR if this works for you. https://github.com/iwauo/spring-ai-model-vertex-ai-anthropic.

Installation

Gradle:

dependencies {
    implementation 'io.github.iwauo.spring-ai:spring-ai-model-vertex-ai-anthropic:0.1.0'
}

Maven:

<dependency>
    <groupId>io.github.iwauo.spring-ai</groupId>
    <artifactId>spring-ai-model-vertex-ai-anthropic</artifactId>
    <version>0.1.0</version>
</dependency>

Configuration

# Google Cloud Project ID
spring.ai.vertex.ai.anthropic.project-id=your-project-id
# Google Cloud region (must support Anthropic models)
spring.ai.vertex.ai.anthropic.location=us-central1

# Optional properties
# (Other properties supported by the `spring-ai-model-anthropic` module should work as well.)
spring.ai.vertex.ai.anthropic.chat.options.model=claude-sonnet-4@20250514
spring.ai.vertex.ai.anthropic.chat.options.temperature=0.7
spring.ai.vertex.ai.anthropic.chat.options.max-tokens=4096
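Assuming the module provides Spring Boot auto-configuration analogous to the official Anthropic starter (i.e. a `ChatModel` bean is created from the properties above), consuming it could look roughly like this sketch (`ClaudeVertexDemo` is a hypothetical component name):

```kotlin
import org.springframework.ai.chat.client.ChatClient
import org.springframework.ai.chat.model.ChatModel
import org.springframework.stereotype.Component

// Sketch: with the starter on the classpath, the auto-configured ChatModel
// is injected and wrapped in a ChatClient for fluent calls.
@Component
class ClaudeVertexDemo(chatModel: ChatModel) {
    private val chatClient = ChatClient.create(chatModel)

    fun ask(question: String): String? =
        chatClient.prompt().user(question).call().content()
}
```

This mirrors how the standard `spring-ai-starter-model-anthropic` is consumed; whether this module exposes the bean the same way would need to be confirmed against its README.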