Google GenAI groundingMetadata not captured in streaming aggregation or span metadata #1700

@braintrust-bot

Summary

When Google Search grounding is enabled via tools: [{ googleSearch: {} }], the @google/genai SDK returns groundingMetadata on the response containing search citations, source URIs, and confidence scores. The current Google GenAI instrumentation plugin does not capture this metadata in either the streaming aggregation or the span metadata, so grounding details are silently lost in traced spans.

Non-streaming calls pass through the raw response object as output, so groundingMetadata is incidentally preserved there — but it is never extracted into span metadata where it would be queryable, and it is completely lost in streaming.
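To make the "queryable span metadata" point concrete, here is a minimal sketch of lifting grounding details out of a response into a flat metadata object. The interface names, the helper name extractGroundingSpanMetadata, and the output keys are all hypothetical and not taken from the plugin. The sketch also assumes groundingMetadata may appear either at the top level of the response (as this issue describes it) or on an individual candidate (as the Gemini API reports it), so it checks both.

```typescript
// Hypothetical, simplified shapes; the real SDK types carry more fields.
interface SummaryGroundingMetadata {
  webSearchQueries?: string[];
  groundingChunks?: Array<{ web?: { uri?: string; title?: string } }>;
}

interface SummaryResponseLike {
  // Assumption: grounding may sit on the response or on a candidate.
  groundingMetadata?: SummaryGroundingMetadata;
  candidates?: Array<{ groundingMetadata?: SummaryGroundingMetadata }>;
}

// Hypothetical helper: returns a flat summary suitable for span metadata,
// or undefined when the response carries no grounding information.
function extractGroundingSpanMetadata(
  response: SummaryResponseLike,
): Record<string, unknown> | undefined {
  const grounding =
    response.groundingMetadata ??
    response.candidates?.find((c) => c.groundingMetadata !== undefined)
      ?.groundingMetadata;
  if (!grounding) return undefined;

  // Keep only chunks that actually expose a web URI.
  const sourceUris = (grounding.groundingChunks ?? [])
    .map((chunk) => chunk.web?.uri)
    .filter((uri): uri is string => typeof uri === "string");

  return {
    grounding_search_queries: grounding.webSearchQueries ?? [],
    grounding_source_uris: sourceUris,
  };
}
```

Flattening to a few scalar-array fields is one possible design choice; forwarding the whole groundingMetadata object unchanged would preserve more detail at the cost of larger span metadata.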

What is missing

  • Streaming aggregation (js/src/instrumentation/plugins/google-genai-plugin.ts, aggregateGenerateContentChunks): Only extracts candidates, usageMetadata, text, functionCall, codeExecutionResult, executableCode, and thought from chunks. groundingMetadata is not accumulated or forwarded.
  • Metadata extraction (extractMetadata): Only captures model, config, and tools from the request params. Does not extract groundingMetadata from the response.
  • Vendor SDK types (js/src/vendor-sdk-types/google-genai.ts): GoogleGenAIGenerateContentResponse has no explicit groundingMetadata field (only a catch-all [key: string]: unknown).
  • E2E tests: No scenario uses googleSearch tool configuration or validates grounding metadata.
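The streaming gap above could be closed along these lines. This is a sketch only: the internals of aggregateGenerateContentChunks are not shown in this issue, so collectGroundingMetadata and StreamChunkLike are hypothetical names, and the merge policy (keep the last non-empty groundingMetadata seen for each candidate) is an assumption about how grounding arrives across chunks rather than documented SDK behavior.

```typescript
// Hypothetical, simplified chunk shape for illustration.
interface StreamChunkLike {
  candidates?: Array<{
    index?: number;
    groundingMetadata?: Record<string, unknown>;
  }>;
}

// Hypothetical helper: walks the chunks in order and remembers, per candidate,
// the most recent non-empty groundingMetadata (last-wins; an assumption).
function collectGroundingMetadata(
  chunks: StreamChunkLike[],
): Map<number, Record<string, unknown>> {
  const byCandidate = new Map<number, Record<string, unknown>>();
  for (const chunk of chunks) {
    (chunk.candidates ?? []).forEach((candidate, position) => {
      const grounding = candidate.groundingMetadata;
      if (grounding && Object.keys(grounding).length > 0) {
        // Fall back to array position when the chunk omits an explicit index.
        byCandidate.set(candidate.index ?? position, grounding);
      }
    });
  }
  return byCandidate;
}
```

The resulting map could then be attached to the aggregated candidates before the output is logged, so streaming and non-streaming spans expose the same fields.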

Upstream reference

  • Google AI Gemini grounding docs: https://ai.google.dev/gemini-api/docs/grounding
  • groundingMetadata response fields include:
    • searchEntryPoint — rendered content for the search widget
    • groundingChunks — array of { web: { uri, title } } source documents
    • webSearchQueries — the search queries the model issued
    • groundingSupports — text segments with confidence scores and chunk indices
  • Available on models like gemini-2.0-flash and gemini-2.5-pro when grounding is enabled.
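To illustrate how the fields listed above relate to one another, the following sketch resolves each groundingSupports entry to the source URIs it cites by following its chunk indices into groundingChunks. The interfaces are simplified assumptions based on the field names in the list (the nested names segment.text and groundingChunkIndices follow the public Gemini API), and resolveCitations is a hypothetical helper, not part of any SDK.

```typescript
// Simplified, assumed shapes based on the field names listed above.
interface CitationSourceChunk {
  web?: { uri?: string; title?: string };
}

interface CitationSupport {
  segment?: { startIndex?: number; endIndex?: number; text?: string };
  groundingChunkIndices?: number[];
  confidenceScores?: number[];
}

interface CitationGroundingMetadata {
  groundingChunks?: CitationSourceChunk[];
  groundingSupports?: CitationSupport[];
}

interface ResolvedCitation {
  text: string;
  uris: string[];
}

// Hypothetical helper: pairs each supported text segment with the URIs of
// the chunks it points at, skipping indices that do not resolve to a URI.
function resolveCitations(
  metadata: CitationGroundingMetadata,
): ResolvedCitation[] {
  const chunks = metadata.groundingChunks ?? [];
  return (metadata.groundingSupports ?? []).map((support) => ({
    text: support.segment?.text ?? "",
    uris: (support.groundingChunkIndices ?? [])
      .map((i) => chunks[i]?.web?.uri)
      .filter((uri): uri is string => typeof uri === "string"),
  }));
}
```

An E2E scenario that enables googleSearch could assert on a structure like this instead of on raw model text, which varies between runs.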

Braintrust docs status

The Braintrust Google GenAI integration page documents generateContent and generateContentStream but does not mention grounding metadata (status: not_found).

Precedent in this repo

The Python SDK has an equivalent open issue: braintrustdata/braintrust-sdk-python#153.

Local files inspected

  • js/src/instrumentation/plugins/google-genai-plugin.ts — streaming aggregation and metadata extraction
  • js/src/instrumentation/plugins/google-genai-channels.ts — channel definitions
  • js/src/vendor-sdk-types/google-genai.ts — response type definitions
  • js/src/wrappers/google-genai.ts — wrapper proxy
  • e2e/scenarios/google-genai-instrumentation/scenario.impl.mjs — e2e test scenarios

Metadata

Labels

bot-automation: Issues generated by an agent automation
