Currently, PgVectorStore requires specific column names (id, content, metadata, embedding) in the database schema. In practice, teams often have their own naming conventions, or existing tables that serve the same purpose under different column names.
We should support column name mapping similar to ORM frameworks, allowing PgVectorStore to adapt to existing database schemas rather than requiring schema changes to match our names.
Example scenarios:
- 'id' might be 'document_id' or 'uuid'
- 'content' might be 'document_text' or 'body'
- 'metadata' might be 'document_metadata' or 'attributes'
- 'embedding' might be 'vector_embedding' or 'document_vector'
Proposed Solution:
PgVectorStore.builder()
    .columnMapping(ColumnMapping.builder()
        .idColumn("document_id")            // maps to our 'id'
        .contentColumn("body")              // maps to our 'content'
        .metadataColumn("attributes")       // maps to our 'metadata'
        .embeddingColumn("document_vector") // maps to our 'embedding'
        .build())
    .build();
This would:
- Enable integration with existing database schemas
- Respect organizational naming conventions
- Maintain all required functionality while being more flexible with column names
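A rough sketch of what ColumnMapping itself might look like, to make the proposal more concrete. Everything here is hypothetical: the ColumnMapping class, its builder methods, and the insertSql helper do not exist in Spring AI today, and the `?::jsonb` cast assumes the metadata column is of type jsonb. The defaults fall back to the current column names, so existing users would see no change. Because the names end up concatenated into SQL, the sketch rejects anything that is not a plain identifier.

```java
import java.util.regex.Pattern;

/** Hypothetical column-name mapping; not an existing Spring AI class. */
final class ColumnMapping {

    // Conservative whitelist: names are concatenated into SQL, so reject
    // anything that is not a plain unquoted identifier.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private final String idColumn;
    private final String contentColumn;
    private final String metadataColumn;
    private final String embeddingColumn;

    private ColumnMapping(Builder b) {
        this.idColumn = b.idColumn;
        this.contentColumn = b.contentColumn;
        this.metadataColumn = b.metadataColumn;
        this.embeddingColumn = b.embeddingColumn;
    }

    static Builder builder() {
        return new Builder();
    }

    /** Illustrative INSERT statement built from the mapped names (assumes jsonb metadata). */
    String insertSql(String table) {
        return "INSERT INTO " + requireIdentifier(table)
                + " (" + idColumn + ", " + contentColumn + ", " + metadataColumn + ", " + embeddingColumn + ")"
                + " VALUES (?, ?, ?::jsonb, ?)";
    }

    private static String requireIdentifier(String name) {
        if (name == null || !IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("Invalid SQL identifier: " + name);
        }
        return name;
    }

    static final class Builder {
        // Defaults match the column names PgVectorStore requires today.
        private String idColumn = "id";
        private String contentColumn = "content";
        private String metadataColumn = "metadata";
        private String embeddingColumn = "embedding";

        Builder idColumn(String name) { this.idColumn = requireIdentifier(name); return this; }
        Builder contentColumn(String name) { this.contentColumn = requireIdentifier(name); return this; }
        Builder metadataColumn(String name) { this.metadataColumn = requireIdentifier(name); return this; }
        Builder embeddingColumn(String name) { this.embeddingColumn = requireIdentifier(name); return this; }

        ColumnMapping build() { return new ColumnMapping(this); }
    }

    // Small demo: print the statement generated for a custom schema.
    public static void main(String[] args) {
        ColumnMapping mapping = ColumnMapping.builder()
                .idColumn("document_id")
                .contentColumn("body")
                .metadataColumn("attributes")
                .embeddingColumn("document_vector")
                .build();
        System.out.println(mapping.insertSql("vector_store"));
    }
}
```

The store would then build its statements from the mapping instead of literal column names. Validating the names in the builder is one possible design choice; quoting identifiers would be an alternative if names with special characters need to be supported.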
Comment From: markpollack
There is an approach for this used with the ChatMemoryRepository that could apply here.
https://github.com/spring-projects/spring-ai/pull/3055