The VectorStore interface covers two things: similarity search and DocumentWriter/add/accept/delete.
These are different concepts, and a SimilaritySearchService or similarly named interface should be split out.
I currently have something I would like to expose as a VectorStore that is read-only and does not support the ingestion operations.
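A minimal sketch of the split being requested, assuming hypothetical names throughout: Doc, DocumentIngestor, ReadWriteStore and ReadOnlyKeywordSearch are illustrative and are not Spring AI types, and a substring match stands in for real vector similarity.

```java
import java.util.List;

// Hypothetical stand-in for a stored document; not the Spring AI Document class.
record Doc(String id, String text) {}

// Hypothetical read-side interface: querying only, no ingestion.
interface SimilaritySearchService {
    List<Doc> similaritySearch(String query, int topK);
}

// Hypothetical write-side interface covering the ingestion operations.
interface DocumentIngestor {
    void add(List<Doc> docs);
    void delete(List<String> ids);
}

// A full read/write store can combine both halves.
interface ReadWriteStore extends SimilaritySearchService, DocumentIngestor {}

// Read-only implementation over a fixed corpus.
// A case-insensitive substring match stands in for vector similarity.
final class ReadOnlyKeywordSearch implements SimilaritySearchService {
    private final List<Doc> corpus;

    ReadOnlyKeywordSearch(List<Doc> corpus) {
        this.corpus = List.copyOf(corpus);
    }

    @Override
    public List<Doc> similaritySearch(String query, int topK) {
        String q = query.toLowerCase();
        return corpus.stream()
                .filter(d -> d.text().toLowerCase().contains(q))
                .limit(topK)
                .toList();
    }
}
```

With this shape, a read-only source implements only SimilaritySearchService and never has to throw UnsupportedOperationException from add or delete.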
Comment From: rjrudin
I agree with this - I think it'll be common to implement similaritySearch without making use of the add/delete operations - i.e. there could be a completely separate system for getting data into the data store that implements similaritySearch.
Comment From: markpollack
Yes, I agree as well that we need some redesign here, and also the ability to pass options to add/delete operations. There are also other operations besides similarity search that need to be exposed.
Comment From: rjrudin
@markpollack Let me know if this is worth a separate ticket, happy to open one - I'd also like to provide additional context to a SearchRequest, such as additional query context that isn't part of the user's query. For example, a user may wish to engage a chatbot about crimes in a particular area defined by a bounding box. Under the hood, I want to use that bounding box to select a subset of records in a database and then do a similarity search on those records. That additional query context doesn't fit into a Filter.Expression - perhaps a metadata map on SearchRequest would be a reasonable extension point?
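To make the bounding-box idea concrete, here is a rough sketch under stated assumptions: ContextualSearchRequest, BoundingBox, CrimeRecord, BoundedSearch and the "boundingBox" key are all hypothetical, not existing SearchRequest members, and a substring match again stands in for the similarity search.

```java
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; none of these are Spring AI classes.
record BoundingBox(double minLat, double minLon, double maxLat, double maxLon) {
    boolean contains(double lat, double lon) {
        return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
    }
}

record CrimeRecord(String description, double lat, double lon) {}

// The user's query plus a free-form context map that is not part of the query text.
record ContextualSearchRequest(String query, int topK, Map<String, Object> context) {}

final class BoundedSearch {
    // Assumed key name under which the caller places the bounding box.
    static final String BOUNDING_BOX = "boundingBox";

    static List<CrimeRecord> search(List<CrimeRecord> records, ContextualSearchRequest request) {
        // Read the optional bounding box from the context map; null means "no spatial restriction".
        BoundingBox box = request.context().get(BOUNDING_BOX) instanceof BoundingBox b ? b : null;
        String q = request.query().toLowerCase();
        return records.stream()
                // Step 1: narrow the candidate records using the extra context.
                .filter(r -> box == null || box.contains(r.lat(), r.lon()))
                // Step 2: substring match standing in for similarity search on the subset.
                .filter(r -> r.description().toLowerCase().contains(q))
                .limit(request.topK())
                .toList();
    }
}
```

The map keeps the request type open to context that a filter expression cannot express, at the cost of untyped values that each implementation has to check for itself.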
Comment From: ThomasVitale
As part of the new (experimental) Modular RAG features, a VectorStoreDocumentRetriever has been introduced to support search operations decoupled from the generic CRUD operations in the VectorStore API.
@johnsonr does that help with your use case?
The new DocumentRetriever API will support searching data not only from a vector store, but also from other types of sources, such as web search engines or knowledge graphs.
Docs: https://docs.spring.io/spring-ai/reference/api/retrieval-augmented-generation.html#_retrieval
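As a rough illustration of the decoupling described above, the sketch below uses a hypothetical Retriever interface with two unrelated backing sources. The names and signatures are assumptions made for the example and do not reproduce the actual DocumentRetriever or VectorStoreDocumentRetriever API; see the linked docs for that.

```java
import java.util.List;
import java.util.Map;

// Hypothetical read-only retrieval abstraction; not the Spring AI DocumentRetriever signature.
interface Retriever {
    List<String> retrieve(String query);
}

// Retrieval backed by a store of texts; substring match stands in for vector similarity.
final class StoreBackedRetriever implements Retriever {
    private final List<String> texts;

    StoreBackedRetriever(List<String> texts) {
        this.texts = List.copyOf(texts);
    }

    @Override
    public List<String> retrieve(String query) {
        String q = query.toLowerCase();
        return texts.stream().filter(t -> t.toLowerCase().contains(q)).toList();
    }
}

// Retrieval backed by a toy knowledge graph: entity name -> related facts.
final class GraphBackedRetriever implements Retriever {
    private final Map<String, List<String>> factsByEntity;

    GraphBackedRetriever(Map<String, List<String>> factsByEntity) {
        this.factsByEntity = Map.copyOf(factsByEntity);
    }

    @Override
    public List<String> retrieve(String query) {
        return factsByEntity.getOrDefault(query.toLowerCase(), List.of());
    }
}
```

Callers depend only on Retriever, so neither source has to expose add or delete.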
Comment From: markpollack
@sobychacko thoughts on this?