Google GenAI generateContentStream aggregation silently drops inlineData output parts (images, audio) #1690

@braintrust-bot

Summary

The Google GenAI streaming aggregation code drops inlineData parts from the response output. When a Gemini model generates images natively through a streamed generateContent request (generateContentStream) with responseModalities: ['IMAGE'], or returns audio or other binary data, the inlineData parts in the streamed chunks are silently lost because the aggregation loop has no branch for them.

Non-streaming generateContent calls are unaffected — the plugin logs the full raw response as output.

What instrumentation is missing

In js/src/instrumentation/plugins/google-genai-plugin.ts, the aggregateGenerateContentChunks function (lines 749–778) processes parts from streamed chunks:

for (const part of candidate.content.parts) {
  if (part.text !== undefined) {
    // handled ✓
  } else if (part.functionCall) {
    // handled ✓
  } else if (part.codeExecutionResult) {
    // handled ✓
  } else if (part.executableCode) {
    // handled ✓
  }
  // inlineData → falls through, silently dropped ✗
}

The vendored type GoogleGenAIPart in js/src/vendor-sdk-types/google-genai.ts already declares the inlineData field (line 53), but the aggregation code never handles it. Any inlineData part in a streamed chunk is silently excluded from the aggregated output span.
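
One possible shape for the missing branch is sketched below. This is not the plugin's actual code: the Part and Chunk interfaces are simplified stand-ins for the vendored types, aggregateParts is a hypothetical helper name, and the sketch assumes that text deltas should be merged into one part while inlineData parts are kept in arrival order.

```typescript
// Hypothetical, simplified shapes; the real vendored types carry more fields.
interface InlineData {
  mimeType?: string;
  data?: string; // base64-encoded payload
}

interface Part {
  text?: string;
  inlineData?: InlineData;
}

interface Chunk {
  candidates?: { content?: { parts?: Part[] } }[];
}

// Hypothetical helper: concatenates text deltas from the first candidate and
// preserves every inlineData part instead of letting it fall through.
function aggregateParts(chunks: Chunk[]): Part[] {
  const binaryParts: Part[] = [];
  let text = "";
  let sawText = false;

  for (const chunk of chunks) {
    const parts = chunk.candidates?.[0]?.content?.parts ?? [];
    for (const part of parts) {
      if (part.text !== undefined) {
        text += part.text;
        sawText = true;
      } else if (part.inlineData) {
        // The branch the issue reports as missing.
        binaryParts.push({ inlineData: part.inlineData });
      }
    }
  }

  // Assumption: the merged text part comes first, followed by binary parts.
  return sawText ? [{ text }, ...binaryParts] : binaryParts;
}
```

Whether the base64 payload should be logged verbatim or converted to some other representation is a separate decision that this sketch does not make.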

Impact

  • Native image generation via Gemini models (gemini-2.0-flash, etc.) with streaming produces spans where generated images are missing from the output
  • Braintrust docs state "Streaming responses are fully supported — Braintrust automatically collects streamed chunks and logs the complete response as a single span," but this is not the case for image/audio output
  • Users who stream generateContent calls with responseModalities: ['IMAGE', 'TEXT'] will see text in their spans but not the generated images

Braintrust docs status

Unclear. The Braintrust docs at https://www.braintrust.dev/docs/instrument/wrap-providers list @google/genai as supported and claim full streaming support, but they do not specifically address image output in streamed responses.

Upstream reference

  • Google GenAI native image generation: https://ai.google.dev/gemini-api/docs/image-generation
  • generateContent with responseModalities: ['IMAGE'] returns inlineData parts containing generated images
  • This is a stable feature available on Gemini 2.0 Flash and later models

Local files inspected

  • js/src/instrumentation/plugins/google-genai-plugin.ts (lines 749–778: aggregateGenerateContentChunks part processing loop)
  • js/src/vendor-sdk-types/google-genai.ts (line 53: inlineData field on GoogleGenAIPart)
  • js/src/wrappers/google-genai.ts (wrapper proxies generateContentStream to channel)
  • e2e/scenarios/google-genai-instrumentation/ (no test cases with image output in streamed responses)
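
A streamed fixture for the missing test case could look roughly like the sketch below. Everything here is hypothetical: the FakePart and FakeChunk shapes, the imageStreamChunks data, and the fakeImageStream and countInlineDataParts helpers are illustrative names, not part of the existing e2e scenarios, and the base64 string is a placeholder rather than a real image.

```typescript
// Hypothetical minimal shapes for a mocked streamed response.
interface FakePart {
  text?: string;
  inlineData?: { mimeType: string; data: string };
}

interface FakeChunk {
  candidates: { content: { parts: FakePart[] } }[];
}

// Two chunks: a text delta followed by an image part.
const imageStreamChunks: FakeChunk[] = [
  { candidates: [{ content: { parts: [{ text: "Here is your image." }] } }] },
  {
    candidates: [
      {
        content: {
          parts: [{ inlineData: { mimeType: "image/png", data: "iVBORw0KGgo=" } }],
        },
      },
    ],
  },
];

// Replays the fixture as an async stream, the way a mocked
// generateContentStream result might be consumed.
async function* fakeImageStream(): AsyncGenerator<FakeChunk> {
  for (const chunk of imageStreamChunks) {
    yield chunk;
  }
}

// Counts inlineData parts so a test can compare the number seen in the raw
// stream with the number present in the aggregated span output.
function countInlineDataParts(chunks: FakeChunk[]): number {
  let count = 0;
  for (const chunk of chunks) {
    for (const candidate of chunk.candidates) {
      for (const part of candidate.content.parts) {
        if (part.inlineData) count++;
      }
    }
  }
  return count;
}
```

A regression test would then assert that the aggregated output contains as many inlineData parts as countInlineDataParts reports for the raw chunks.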

Note

This is distinct from #1673 (models.generateImages() not instrumented), which covers the dedicated Imagen API. This issue is about the standard generateContent/generateContentStream API producing image output that gets lost specifically in the streaming aggregation path.
