The VectorStore interface covers two things: similarity search and write operations (DocumentWriter's accept, plus add and delete).

These are different concepts and a SimilaritySearchService or similarly named interface should be split out.

I currently have something I would like to expose as a VectorStore that is read-only and does not support the ingestion operations.
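
To make the proposed split concrete, here is a minimal sketch. Every type below is a hypothetical stand-in rather than the actual Spring AI API: `Doc` replaces the real `Document`, `DocumentStore` and `ReadWriteVectorStore` are invented names, and the substring match is a toy substitute for embedding similarity.

```java
import java.util.List;

// Hypothetical stand-in for Spring AI's Document; only the fields needed here.
record Doc(String id, String text) {}

// Read side. The name follows the suggestion in this issue and is an assumption.
interface SimilaritySearchService {
    List<Doc> similaritySearch(String query, int topK);
}

// Write side: ingestion and deletion only (hypothetical name).
interface DocumentStore {
    void add(List<Doc> documents);
    void delete(List<String> ids);
}

// A full store composes both halves, so read-write implementations lose nothing.
interface ReadWriteVectorStore extends SimilaritySearchService, DocumentStore {}

// A read-only source implements only the search side and has no write methods at all.
final class ReadOnlyKeywordStore implements SimilaritySearchService {
    private final List<Doc> docs;

    ReadOnlyKeywordStore(List<Doc> docs) {
        this.docs = List.copyOf(docs);
    }

    @Override
    public List<Doc> similaritySearch(String query, int topK) {
        // Toy matching: case-insensitive substring test instead of vector similarity.
        String needle = query.toLowerCase();
        return docs.stream()
                .filter(d -> d.text().toLowerCase().contains(needle))
                .limit(topK)
                .toList();
    }
}
```

With a split like this, a read-only source never has to throw `UnsupportedOperationException` from `add` or `delete`; it simply does not declare them.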

Comment From: rjrudin

I agree with this - I think it'll be common to implement similaritySearch without making use of the add/delete operations - i.e. there could be a completely separate system for getting data into the data store that implements similaritySearch.

Comment From: markpollack

Yes, I agree as well that we need some redesign here and also the ability to pass options to add/delete operations. There are also other operations besides similarity search that need to be exposed.

Comment From: rjrudin

@markpollack Let me know if this is worth a separate ticket, happy to open one - I'd also like to provide additional context to a SearchRequest, such as additional query context that isn't part of the user's query. For example, a user may wish to engage a chatbot about crimes in a particular area defined by a bounding box. Under the hood, I want to use that bounding box to select a subset of records in a database and then do a similarity search on those records. That additional query context doesn't fit into a Filter.Expression - perhaps a metadata map on SearchRequest would be a reasonable extension point?
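
A rough sketch of the idea, to show the shape of the extension point. Nothing here exists in Spring AI: `ContextualSearchRequest`, `BoundingBox`, `CrimeReport`, `GeoScopedSearch` and the `"boundingBox"` key are all made up for illustration, and the word-overlap score is a toy placeholder for a real similarity search.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only.
record BoundingBox(double minLat, double minLon, double maxLat, double maxLon) {
    boolean contains(double lat, double lon) {
        return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
    }
}

record CrimeReport(String text, double lat, double lon) {}

// A search request that carries an open-ended context map alongside the query.
record ContextualSearchRequest(String query, int topK, Map<String, Object> context) {}

final class GeoScopedSearch {
    private final List<CrimeReport> reports;

    GeoScopedSearch(List<CrimeReport> reports) {
        this.reports = List.copyOf(reports);
    }

    List<CrimeReport> search(ContextualSearchRequest request) {
        // Step 1: narrow the candidates using context that is not part of the user's query.
        // A missing key yields null, which means "no geographic restriction".
        BoundingBox box = (BoundingBox) request.context().get("boundingBox");
        // Step 2: rank the remaining records against the query and keep the top K.
        return reports.stream()
                .filter(r -> box == null || box.contains(r.lat(), r.lon()))
                .sorted(Comparator.comparingLong(
                        (CrimeReport r) -> score(request.query(), r.text())).reversed())
                .limit(request.topK())
                .toList();
    }

    // Toy relevance score: number of query words that appear in the text.
    private static long score(String query, String text) {
        String lower = text.toLowerCase();
        return Arrays.stream(query.toLowerCase().split("\\s+"))
                .filter(lower::contains)
                .count();
    }
}
```

The point is only that the bounding box travels with the request without being forced into a filter expression; the implementation decides how to interpret the map.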

Comment From: ThomasVitale

As part of the new (experimental) Modular RAG features, a VectorStoreDocumentRetriever has been introduced to support search operations decoupled from the generic CRUD operations in the VectorStore API.

@johnsonr does that help with your use case?

The new DocumentRetriever API will support searching data not only from a vector store, but also from other types of sources, such as web search engines or knowledge graphs.

Docs: https://docs.spring.io/spring-ai/reference/api/retrieval-augmented-generation.html#_retrieval

Comment From: markpollack

@sobychacko thoughts on this?

Comment From: markpollack

Yep. We can retrofit a SimilaritySearchOperations interface (name TBD) so that it can be passed around to code without fear of that code adding anything.
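
As a rough illustration of what "passed around without fear" could look like, here is a sketch. It assumes a single-method interface under the tentative name above; `InMemoryStore` and `Answerer` are hypothetical stand-ins, and documents are plain strings to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// Tentative name from this thread ("name TBD"); not an existing Spring AI type.
@FunctionalInterface
interface SimilaritySearchOperations {
    List<String> similaritySearch(String query);
}

// Stand-in for an existing store that exposes both reads and writes.
final class InMemoryStore {
    private final List<String> texts = new ArrayList<>();

    void add(String text) {
        texts.add(text);
    }

    List<String> similaritySearch(String query) {
        // Toy matching: substring test instead of vector similarity.
        return texts.stream().filter(t -> t.contains(query)).toList();
    }
}

// Consumer code depends only on the read-only contract.
final class Answerer {
    private final SimilaritySearchOperations search;

    Answerer(SimilaritySearchOperations search) {
        this.search = search;
    }

    int countMatches(String query) {
        return search.similaritySearch(query).size();
    }
}
```

A caller would hand over a view with `new Answerer(store::similaritySearch)`. Because the method reference is a separate object rather than the store itself, the consumer cannot downcast its way back to `add`.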

Comment From: markpollack

@rjrudin The SearchRequest is specifically meant to feed in the three things that are required for a similarity search (the query string, the topK to return, and the filter expression) as a passthrough to execution via a vector db that supports similarity search. The query you mention seems orthogonal to that execution path/use case. It sounds more like a geo-query (great with mongodb) in a non-vector-store database and then some sort of custom ranking of those records (perhaps by closest to a point).