Bug description

When Spring AI's OpenAI embeddings are used in a reactive Spring Boot application, vector store operations (specifically similaritySearch) make blocking calls on the reactive event loop thread. The request fails with "IllegalStateException: block()/blockFirst()/blockLast() are blocking, which is not supported in thread reactor-http-nio-3". This is similar to issue #2533, but it affects vector store operations.

Environment

- Spring AI version: 1.0.0-M6
- Spring Boot version: 3.4.5 (observed from 3.4.x onward)
- Java version: 17 (based on the stack trace)
- Vector Store: Pinecone
- Dependencies:
  - spring-ai-openai
  - spring-ai-pinecone-store
  - spring-ai-core
  - reactor-core 3.7.5
  - reactor-netty-core 1.2.5

Steps to reproduce

1. Create a reactive Spring Boot application
2. Configure Spring AI with OpenAI embeddings
3. Set up Pinecone vector store
4. Create a reactive endpoint that uses vector store operations
5. Make a request to the endpoint
6. The application will throw an exception because the blocking vector store operations are running on the reactive event loop thread

Expected behavior

The vector store operations should do one of the following:

1. Be implemented using reactive programming patterns (using WebClient instead of RestTemplate)
2. Run automatically on a separate thread pool without requiring manual intervention
3. Be clearly documented as blocking operations that need to be handled accordingly

Minimal Complete Reproducible example

@Service
public class EmbeddingService {
    private final VectorStore vectorStore;

    public EmbeddingService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public Mono<String> getContext(String prompt) {
        return Mono.fromCallable(() -> {
            List<Document> docs = vectorStore.similaritySearch(prompt); // This blocks on the reactive thread
            return docs.stream()
                .map(Document::getText)
                .collect(Collectors.joining());
        });
    }
}

@RestController
public class EmbeddingController {
    private final EmbeddingService service;

    public EmbeddingController(EmbeddingService service) {
        this.service = service;
    }

    @GetMapping("/context")
    public Mono<String> getContext(@RequestParam String prompt) {
        return service.getContext(prompt);
    }
}

Configuration:

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
    vectorstore:
      pinecone:
        api-key: ${PINECONE_API_KEY}
        project-id: your-project-id
        environment: your-environment
        index-name: your-index

The current workaround is to manually move the blocking operation to a separate thread pool using subscribeOn(Schedulers.boundedElastic()), but this should be handled by the framework itself.
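
For reference, the workaround can be sketched roughly as follows. This is a minimal illustration that depends only on reactor-core: the BlockingOffload class and its offload method are hypothetical names introduced here, and the lambda in main is a stand-in for the real vectorStore.similaritySearch(prompt) call.

```java
import java.util.concurrent.Callable;

import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

// Hypothetical helper (not part of Spring AI or Reactor) illustrating the workaround.
public final class BlockingOffload {

    private BlockingOffload() {
    }

    /**
     * Wraps a blocking call in a Mono whose work is executed on Reactor's
     * bounded elastic scheduler rather than on the subscribing thread.
     */
    public static <T> Mono<T> offload(Callable<T> blockingCall) {
        return Mono.fromCallable(blockingCall)
                .subscribeOn(Schedulers.boundedElastic());
    }

    public static void main(String[] args) {
        // Stand-in for vectorStore.similaritySearch(prompt): report the thread the call ran on.
        String worker = offload(() -> Thread.currentThread().getName()).block();
        System.out.println("blocking call ran on: " + worker);
    }
}
```

In the EmbeddingService above, this would amount to appending .subscribeOn(Schedulers.boundedElastic()) to the Mono returned by Mono.fromCallable(...) in getContext, so that similaritySearch no longer executes on a reactor-http-nio thread.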