Bug description
When mapping similarity search results to Document, the code injects a distance field into the JsonObject read from the row’s metadata. If rowRecord.get(metadataFieldName) returns null, the subsequent call to metadata.addProperty(...) throws a NullPointerException. The surrounding try/catch only catches ParamException, so the NullPointerException escapes it.

Stack trace (excerpt):

java.lang.NullPointerException: Cannot invoke "com.google.gson.JsonObject.addProperty(String, java.lang.Number)" because "metadata" is null
    at org.springframework.ai.vectorstore.milvus.MilvusVectorStore.lambda$similaritySearch$2(...)
    at java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:...)
    at org.springframework.ai.vectorstore...ObservationAwareMilvusVectorStore.lambda$similaritySearch$7(...)
    at ai.shb.shbai.service.DocumentService.searchSimilarDocuments(DocumentService.java:397)
    at ai.shb.shbai.controller.DocumentController.searchDocuments(DocumentController.java:36)

Environment
Spring AI version: 1.0.0
Java version: 17.0.14
Vector store: MilvusVectorStore
Gson version: 2.13.1
Module / file / class: org.springframework.ai.vectorstore.milvus.MilvusVectorStore

Steps to reproduce
1. Ensure the search results contain a row where the metadata field exists but its value is null (common with legacy/custom data).
2. Run a similarity search and map rows to Document with code like:
JsonObject metadata = (JsonObject) rowRecord.get(this.metadataFieldName); // may be null
metadata.addProperty(DocumentMetadata.DISTANCE.value(), 1 - getResultSimilarity(rowRecord)); // NPE here

Expected behavior
Missing or null metadata should be handled gracefully by falling back to an empty metadata object/map, and the distance should still be injected.
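The expected behavior could be sketched roughly as follows. This is only an illustration, not the actual MilvusVectorStore code: NullSafeMetadata and withDistance are hypothetical names, and a plain Map stands in for the Gson JsonObject so the snippet runs on the JDK alone. In the real mapping code the same guard would presumably substitute a new JsonObject() when the value read from the row is null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spring AI or the Milvus SDK.
final class NullSafeMetadata {

    // Key name taken from the repro in this issue.
    static final String DISTANCE_KEY = "distance";

    private NullSafeMetadata() {
    }

    // Copies the row's metadata, or starts from an empty map when it is null,
    // and always adds the distance computed as 1 - similarity.
    static Map<String, Object> withDistance(Map<String, Object> rawMetadata, double similarity) {
        Map<String, Object> metadata = (rawMetadata != null) ? new HashMap<>(rawMetadata) : new HashMap<>();
        metadata.put(DISTANCE_KEY, 1.0 - similarity);
        return metadata;
    }

    public static void main(String[] args) {
        // Null metadata no longer throws; the result holds only the distance.
        System.out.println(withDistance(null, 0.8));
        // Existing entries are preserved alongside the distance.
        System.out.println(withDistance(Map.of("source", "legacy.pdf"), 0.8));
    }
}
```

Copying into a fresh map keeps the caller's metadata untouched. Whether the real fix should mutate the row's object in place or copy it is a design choice left open here.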

Minimal Complete Reproducible example
Self-contained snippet that reproduces the failure without Milvus or Spring AI (only Gson is required):

import com.google.gson.*;
import com.google.gson.reflect.TypeToken;
import java.lang.reflect.Type;
import java.util.Map;

public class MinimalRepo {

    static Map<String, Object> mapRecord(Object rawMetadata, double similarity) {
        JsonObject metadata = (JsonObject) rawMetadata; // may be null
        // This line throws when metadata == null
        metadata.addProperty("distance", 1.0 - similarity);
        Type t = new TypeToken<Map<String, Object>>(){}.getType();
        return new Gson().fromJson(metadata, t);
    }

    public static void main(String[] args) {
        System.out.println("Testing with null metadata...");
        try {
            mapRecord(null, 0.8); // -> NullPointerException
        } catch (NullPointerException e) {
            System.out.println("✓ NullPointerException caught: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

Comment From: onestardao

It looks like the root cause here is that when the metadata field is null during similarity search mapping, the distance injection logic does not guard against nulls, leading to a NullPointerException. This typically happens with legacy or incomplete vector store data where the metadata object is absent.

If you’d like, I can share a tested approach that gracefully handles missing metadata and prevents this crash — just let me know and I can provide the reference.

Comment From: little-huang

Yes, this issue occurs when the vector database index does not include the metadata field. It works normally when the metadata field is added and contains data. 🙌

Comment From: onestardao

Got it, that makes perfect sense. Here’s the relevant section from our Problem Map (MIT-licensed, used by 80+ devs so far): https://github.com/onestardao/WFGY/blob/main/ProblemMap/retrieval-traceability.md

Quick steps:

1. Download TXTOS.
2. Ask your AI, inside the TXTOS session, “How do I apply the WFGY guardrail for missing metadata in distance injection?”
3. Follow the suggested patch. It needs no infra changes and works as a semantic firewall that gracefully skips nulls.

We’ve seen this fix prevent the exact crash you described without touching the rest of your pipeline. Let me know if you want me to walk through a live example.