**Bug description**
`QdrantVectorStore` throws a `NullPointerException` when adding a `Document` that contains media only (no text).
**Environment**
- spring-ai-bom: 1.0.0
- spring-ai-starter-vector-store-qdrant
- Java: 24
- Spring Boot: 3.5.0
- Spring dependency management: 1.1.7
**Steps to reproduce**
1. Run Qdrant locally (`docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant`).
2. Create a Spring Boot app with `spring-ai-starter-vector-store-qdrant`.
3. Inside a test or service method:
```java
Media media = new Media(
        MimeType.valueOf("image/png"),
        new byte[] { 0x00 }); // 1×1 transparent pixel
Document imgDoc = Document.builder()
        .media(media) // ⚠️ no text
        .metadata(Map.of("fileName", "pixel.png"))
        .build();
vectorStore.add(List.of(imgDoc)); // ← NPE here
```
<img width="500" alt="Image" src="https://github.com/user-attachments/assets/f367e2e6-a943-49bf-9691-081adbcf52eb" />
4. Observe the stack trace:
```
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
    at org.springframework.ai.vectorstore.qdrant.QdrantVectorStore.toPayload(QdrantVectorStore.java:304)
    at org.springframework.ai.vectorstore.qdrant.QdrantVectorStore.lambda$doAdd$0(QdrantVectorStore.java:186)
```
**Expected behavior**
QdrantVectorStore should accept media-only documents and store their vector plus metadata without requiring doc_content to be present.
**Minimal Complete Reproducible example**
When a Document instance contains only media (no text) and is passed to QdrantVectorStore.add(...), the call fails with a NullPointerException.
The failure originates in QdrantVectorStore.toPayload(Document):
```java
var payload = QdrantValueFactory.toValueMap(document.getMetadata());
payload.put(CONTENT_FIELD_NAME,
        io.qdrant.client.ValueFactory.value(document.getText())); // ← getText() is null
```
Because `Document` was refactored to be either text or media (never both), `getText()` legitimately returns `null` for image-only documents. The method still unconditionally adds the `"doc_content"` field, so a media document triggers the NPE during `ValueFactory.value(null)`.
Typical flow that exposes the bug:
1. My custom `EmbeddingModel` converts each incoming image to a float-array vector via a FastAPI CLIP endpoint.
2. I build a list of media-only `Document` objects, one per uploaded file.
3. I inject `VectorStore` (backed by `QdrantVectorStore`) and call `vectorStore.add(documents)`.
4. `doAdd(...)` calls `toPayload(document)` for every item; the very first image document crashes with the NPE, aborting the entire batch.
The problem is independent of the embedding model: it happens before the vectors are sent to Qdrant, purely because the payload builder assumes text content is always present.
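The flow above can be imitated without Spring AI or Qdrant on the classpath. The following is only a minimal sketch under stated assumptions: `Doc`, `value`, `toPayload`, and `addAll` are hypothetical stand-ins rather than the library's classes, and `value` is assumed to reject `null`, which is what the stack trace suggests for `ValueFactory.value`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Stand-in types only: nothing here is a Spring AI or Qdrant class.
final class UnguardedPayloadSketch {

    static final String CONTENT_FIELD_NAME = "doc_content";

    // Hypothetical stand-in for Document; text is null for a media-only document.
    record Doc(String text, Map<String, Object> metadata) {}

    // Hypothetical stand-in for a value factory, assumed to reject null input.
    static String value(String raw) {
        return Objects.requireNonNull(raw);
    }

    // Mirrors the shape of the reported method: the content field is always added.
    static Map<String, Object> toPayload(Doc doc) {
        try {
            Map<String, Object> payload = new HashMap<>(doc.metadata());
            payload.put(CONTENT_FIELD_NAME, value(doc.text()));
            return payload;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Converts every document before returning, so one failure aborts the whole batch.
    static List<Map<String, Object>> addAll(List<Doc> docs) {
        return docs.stream().map(UnguardedPayloadSketch::toPayload).toList();
    }

    public static void main(String[] args) {
        Doc note = new Doc("hello", Map.of("fileName", "note.txt"));
        Doc image = new Doc(null, Map.of("fileName", "pixel.png"));
        try {
            addAll(List.of(note, image));
        } catch (RuntimeException e) {
            System.out.println("Batch aborted, cause: " + e.getCause());
        }
    }
}
```

In this sketch the text document converts fine, but the media-only one fails inside `toPayload`, so `addAll` returns nothing for either of them.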
Comment From: dev-jonghoonpark
The `Media` class has two public constructors:
```java
public Media(MimeType mimeType, URI uri)
public Media(MimeType mimeType, Resource resource)
```
How were you able to use a `byte[]` as the second argument in your example?
Comment From: dev-jonghoonpark
I think your use case falls outside the expected scope of usage currently supported by Spring AI.
If you use a custom embedding model to convert images to vectors, wouldn't it be better to use the `QdrantClient` directly to add the vector data?
What do you think?
Comment From: devMtn30
> The `Media` class has two public constructors:
>
> ```java
> public Media(MimeType mimeType, URI uri)
> public Media(MimeType mimeType, Resource resource)
> ```
>
> How were you able to use a `byte[]` as the second argument in your example?

@dev-jonghoonpark Below is a snippet taken directly from my production codebase for context:
```java
@Service
@RequiredArgsConstructor
public class ImageService {

    private final VectorStore vectorStore;

    public void store(List<MultipartFile> files) {
        try {
            List<Document> documents = toDocumentList(files);
            vectorStore.add(documents);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public List<Document> search(String query, int k) {
        return vectorStore.similaritySearch(SearchRequest.builder()
                .query(query)
                .topK(k)
                .build());
    }

    private List<Document> toDocumentList(List<MultipartFile> files) throws IOException {
        return files.stream().map(
                file -> Document.builder()
                        .media(new Media(MediaType.IMAGE_PNG, file.getResource()))
                        .metadata(Map.of("filename", Objects.requireNonNull(file.getOriginalFilename())))
                        //.text(file.getOriginalFilename())
                        //.idGenerator(new RandomIdGenerator())
                        .build()
        ).collect(Collectors.toList());
    }
}
```
Comment From: devMtn30
Hi Spring AI team,
While integrating `QdrantVectorStore` I noticed that `toPayload(Document document)` always inserts `document.getText()` into the payload, even when the incoming `Document` is a media document whose text is `null`:
```java
private Map<String, Value> toPayload(Document document) {
    try {
        var payload = QdrantValueFactory.toValueMap(document.getMetadata());
        payload.put(CONTENT_FIELD_NAME,
                io.qdrant.client.ValueFactory.value(document.getText()));
        return payload;
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
```
This can result in a `null` value (or even an NPE, depending on the Qdrant client) under `CONTENT_FIELD_NAME`, which breaks downstream similarity-search logic.
Would it make sense to guard this with something like
```java
if (document.isText()) {
    payload.put(CONTENT_FIELD_NAME,
            io.qdrant.client.ValueFactory.value(document.getText()));
}
```
or to populate it with a sensible default, so that non-text `Document`s are handled safely?
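To make the idea concrete, here is a rough, self-contained sketch of that guard. It does not use the real Spring AI or Qdrant classes: `Doc`, `value`, and `toPayload` are hypothetical stand-ins, and `value` is assumed to reject `null` in the same way the real factory appears to.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Stand-in types only: nothing here is a Spring AI or Qdrant class.
final class GuardedPayloadSketch {

    static final String CONTENT_FIELD_NAME = "doc_content";

    // Hypothetical stand-in for Document; text is null for a media-only document.
    record Doc(String text, Map<String, Object> metadata) {
        boolean isText() {
            return text != null;
        }
    }

    // Hypothetical stand-in for a value factory, assumed to reject null input.
    static String value(String raw) {
        return Objects.requireNonNull(raw, "raw must not be null");
    }

    // Copies the metadata and adds the content field only when text is present.
    static Map<String, Object> toPayload(Doc doc) {
        Map<String, Object> payload = new HashMap<>(doc.metadata());
        if (doc.isText()) {
            payload.put(CONTENT_FIELD_NAME, value(doc.text()));
        }
        return payload;
    }

    public static void main(String[] args) {
        Doc image = new Doc(null, Map.of("filename", "pixel.png"));
        Doc note = new Doc("hello", Map.of("filename", "note.txt"));
        for (Doc doc : List.of(image, note)) {
            System.out.println(toPayload(doc));
        }
    }
}
```

With the guard in place, the media-only payload carries just its metadata, while the text document still gets its content field. Any code that reads the payload back would then need to tolerate a missing `doc_content` entry.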
Happy to submit a PR if that approach sounds reasonable. Let me know what you think!
Thanks!