Expected Behavior

VertexAiGeminiChatModel should pass a RagCorpus to Gemini as a Retrieval tool.

Current Behavior

Currently, VertexAiGeminiChatModel.createGeminiRequest() supports only function definitions and GoogleSearch.

Context

How has this issue affected you? There seems to be no way to use a RAG Corpus in GCP.

What are you trying to accomplish? I am trying to make the call with the following tool definition:

"retrieval": {
  "vertexRagStore": {
    "ragResources": [
      {
        "ragCorpus": "projects/{project_id}/locations/{location}/ragCorpora/{rag_corpus_id}"
      }
    ],
    "ragRetrievalConfig": {
      "filter": {
        "vectorSimilarityThreshold": 0.5
      },
      "topK": 3
    }
  }
}
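
For anyone assembling the ragCorpus value by hand, here is a minimal sketch that builds and checks the resource name in the shape shown above. The class RagCorpusNames and its methods are hypothetical helpers, not part of Spring AI or the Vertex AI SDK, and the check only looks at the path shape, not at whether the corpus exists.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of Spring AI or the Vertex AI SDK.
final class RagCorpusNames {

    // Path shape from the request above:
    // projects/{project_id}/locations/{location}/ragCorpora/{rag_corpus_id}
    private static final Pattern NAME =
            Pattern.compile("projects/[^/]+/locations/[^/]+/ragCorpora/[^/]+");

    private RagCorpusNames() {
    }

    // Builds the full resource name and rejects empty or slash-containing segments.
    static String of(String projectId, String location, String ragCorpusId) {
        String name = String.format("projects/%s/locations/%s/ragCorpora/%s",
                projectId, location, ragCorpusId);
        if (!isValid(name)) {
            throw new IllegalArgumentException("Not a RAG corpus resource name: " + name);
        }
        return name;
    }

    // Returns true only when the whole string matches the expected path shape.
    static boolean isValid(String name) {
        return name != null && NAME.matcher(name).matches();
    }
}
```

Calling RagCorpusNames.of("my-project", "us-central1", "123") yields a string that can be passed wherever the ragCorpus field is expected.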

What other alternatives have you considered? I have considered using the native Google GenAI SDK.

Are you aware of any workarounds?

Adding the following lines will make it work.

From line 714 in VertexAiGeminiChatModel.java:

        // Attach the RAG corpus as a Retrieval tool when one is configured in the options.
        if (prompt.getOptions() instanceof VertexAiGeminiChatOptions options
                && !StringUtil.isNullOrEmpty(options.getRagCorpus())) {
            final Tool ragCorpusTool = Tool.newBuilder()
                    .setRetrieval(Retrieval.newBuilder()
                            .setVertexRagStore(VertexRagStore.newBuilder()
                                    // Append the corpus; no explicit list index is needed.
                                    .addRagResources(VertexRagStore.RagResource.newBuilder()
                                            .setRagCorpus(options.getRagCorpus())
                                            .build())
                                    .build())
                            .build())
                    .build();
            tools.add(ragCorpusTool);
        }

From line 152 in VertexAiGeminiChatOptions.java:

    /**
     * Use the Google RAG Corpus feature.
     */
    @JsonIgnore
    private String ragCorpus;

...
// add Getter and Setter
...
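
The elided getter and setter could look like the following minimal sketch. RagCorpusOptionsSketch is a hypothetical stand-in for the relevant part of VertexAiGeminiChatOptions, and hasRagCorpus() is an extra convenience method that is not in the patch. The @JsonIgnore annotation is left out so the block compiles with the JDK alone.

```java
// Hypothetical stand-in for the relevant part of VertexAiGeminiChatOptions.
final class RagCorpusOptionsSketch {

    // Full resource name of the RAG corpus, or null when RAG retrieval is not used.
    private String ragCorpus;

    String getRagCorpus() {
        return this.ragCorpus;
    }

    void setRagCorpus(String ragCorpus) {
        this.ragCorpus = ragCorpus;
    }

    // Same null-or-empty condition the chat model patch checks before adding the tool.
    boolean hasRagCorpus() {
        return this.ragCorpus != null && !this.ragCorpus.isEmpty();
    }
}
```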

This works correctly alongside other tools.

Comment From: onestardao

we've hit this too. corpus is technically accepted but semantically bypassed, meaning no matter what you pass in, Gemini doesn’t really use it.

we mapped this as Problem No.5 in our diagnostic set, and built a semantic verification patch that checks not just ingestion but activation.

if you're curious I can share how we detected and fixed it (works across GCP, Claude, and OpenAI variants too).

Comment From: harry-yoo-eb

@onestardao Could you share it? Thanks a lot!

FYI: In my experience, Gemini ignores tools placed at the same level as contents, contrary to the spec (e.g.1). Placing the tools under contents works, and Gemini accepts the corpus (e.g.2). Please give it a try. Hope this helps.

e.g.1 (Not working): Gemini ignores the tools.

{
  "contents": [
    {
      "parts": [
        {
          "text": "Tell me a joke"
        }
      ],
      "role": "user"
    }
  ],
  "tools": [
    {
      "retrieval": {
        "vertexRagStore": {
          "ragResources": [
            {
              "ragCorpus": "projects/..."
            }
          ],
          "ragRetrievalConfig": {
            "filter": {
              "vectorSimilarityThreshold": 0.5
            },
            "topK": 3
          }
        }
      }
    }
  ],
  "generationConfig": {}
}

e.g.2 (Working): Gemini accepts the tools and the corpus works. Function tools and the corpus also work together.

{
  "contents": [
    {
      "parts": [
        {
          "text": "Tell me a joke"
        }
      ],
      "role": "user",
      "tools": [
        {
          "retrieval": {
            "vertexRagStore": {
              "ragResources": [
                {
                  "ragCorpus": "projects/..."
                }
              ],
              "ragRetrievalConfig": {
                "filter": {
                  "vectorSimilarityThreshold": 0.5
                },
                "topK": 3
              }
            }
          }
        }
      ]
    }
  ],
  "generationConfig": {}
}

e.g.2 works with the code suggestion in my original post above.
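
To reproduce the comparison outside Spring AI, a rough sketch like the one below can build the raw request with the JDK HTTP client. The class RagRequestSketch and its methods are hypothetical. The endpoint URL pattern is an assumption based on the public Vertex AI REST API and may need a different API version, and obtaining the access token is left to the caller. The body uses the e.g.1 layout, with tools next to contents.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical reproduction helper; it builds the request but does not send it.
final class RagRequestSketch {

    private RagRequestSketch() {
    }

    // JSON body with "tools" at the top level, next to "contents" (the e.g.1 layout).
    // Inputs are inserted verbatim, so they must not contain quotes or backslashes.
    static String topLevelToolsBody(String ragCorpus, String userText) {
        return """
            {
              "contents": [
                {"role": "user", "parts": [{"text": "%s"}]}
              ],
              "tools": [
                {
                  "retrieval": {
                    "vertexRagStore": {
                      "ragResources": [{"ragCorpus": "%s"}],
                      "ragRetrievalConfig": {
                        "filter": {"vectorSimilarityThreshold": 0.5},
                        "topK": 3
                      }
                    }
                  }
                }
              ]
            }
            """.formatted(userText, ragCorpus);
    }

    // Assumed endpoint pattern for generateContent; adjust the version or host if needed.
    static HttpRequest generateContentRequest(String project, String location, String model,
                                              String accessToken, String body) {
        URI uri = URI.create(String.format(
                "https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s"
                        + "/publishers/google/models/%s:generateContent",
                location, project, location, model));
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

Sending the built request with java.net.http.HttpClient and checking whether the response was grounded on the corpus shows which layout Gemini honours.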

Comment From: onestardao

sure — here’s the public ProblemMap entry (No.5) with the detection logic and semantic verification patch:
https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md

we use it to catch ingestion-but-no-activation cases before runtime, so you can confirm the corpus is actually being semantically accepted.
works across gcp, claude, and openai variants too.