Summary
The OpenAI instrumentation's SSE stream reassembly path hardcodes ChatCompletionAccumulator / ChatCompletionChunk, which are specific to the Chat Completions API. When a user calls client.responses().createStreaming(...), the Responses API emits a different set of SSE event types (response.created, response.output_text.delta, response.completed, etc.) that cannot be deserialized as ChatCompletionChunk. As a result, streaming Responses API spans end up with no output data and no usage metrics.
Non-streaming Responses API calls (client.responses().create(...)) work correctly — the InstrumentationSemConv class already handles input/output/input_tokens/output_tokens fields.
What is missing
TracingHttpClient.tagSpanFromSseBytes() (lines 205–231) needs a parallel code path that:
- Detects Responses API SSE events (which may start with event: lines like event: response.created before the data: line, unlike Chat Completions, which only has data: lines).
- Uses the OpenAI Java SDK's ResponseAccumulator (analogous to ChatCompletionAccumulator) to reassemble ResponseStreamEvent chunks into a complete Response object.
- Passes the assembled response JSON through the existing InstrumentationSemConv.tagOpenAIResponse(), which already knows how to extract output, input_tokens, output_tokens, and reasoning_tokens from Responses API payloads.
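The detection step in the first bullet could be sketched roughly as below. This is a stdlib-only illustration, not the SDK's code: SseFrames, Frame, parse, and isResponsesApiStream are hypothetical names, and the check on a "type" field beginning with "response." is an assumption about the payload shape. A real fix would presumably hand each data payload to ResponseAccumulator instead of stopping at classification.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the Braintrust SDK): splits an SSE body into frames. */
final class SseFrames {

    /** One SSE frame: the optional event name plus the joined data payload. */
    static final class Frame {
        final String event; // null when the frame has no "event:" line
        final String data;

        Frame(String event, String data) {
            this.event = event;
            this.data = data;
        }
    }

    // Assumption: Responses API payloads carry a "type" field such as "response.created".
    private static final Pattern RESPONSES_TYPE =
            Pattern.compile("\"type\"\\s*:\\s*\"response\\.");

    static List<Frame> parse(byte[] sseBytes) {
        List<Frame> frames = new ArrayList<>();
        String event = null;
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        String text = new String(sseBytes, StandardCharsets.UTF_8);
        // Limit -1 keeps trailing empty strings, so a final blank line still ends a frame.
        for (String line : text.split("\r?\n", -1)) {
            if (line.isEmpty()) {
                // A blank line terminates the current frame.
                if (hasData) frames.add(new Frame(event, data.toString()));
                event = null;
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith("event:")) {
                event = line.substring("event:".length()).trim();
            } else if (line.startsWith("data:")) {
                if (hasData) data.append('\n'); // multiple data: lines are joined with newlines
                data.append(line.substring("data:".length()).trim());
                hasData = true;
            }
            // Other fields (id:, retry:) and comment lines (leading ':') are ignored here.
        }
        // Flush a last frame if the stream ended without a trailing blank line.
        if (hasData) frames.add(new Frame(event, data.toString()));
        return frames;
    }

    /** True if any frame is named, or typed in its payload, like a Responses API event. */
    static boolean isResponsesApiStream(List<Frame> frames) {
        for (Frame f : frames) {
            if (f.event != null && f.event.startsWith("response.")) return true;
            if (RESPONSES_TYPE.matcher(f.data).find()) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String sse = "event: response.created\n"
                + "data: {\"type\":\"response.created\"}\n\n"
                + "event: response.completed\n"
                + "data: {\"type\":\"response.completed\"}\n\n";
        List<Frame> frames = parse(sse.getBytes(StandardCharsets.UTF_8));
        System.out.println(frames.size() + " frames, responses stream: "
                + isResponsesApiStream(frames));
    }
}
```

Classifying on the event name or payload type, rather than on the request URL, keeps the decision local to the bytes that tagSpanFromSseBytes() already receives. Whether that is the right place for the branch is a design choice left to the implementer.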
Failure mode
Today when Responses API streaming is used:
- If the first non-empty SSE line is event: response.created (not data:), the code at line 176 falls through to the plain-JSON branch, which tries to parse the entire SSE byte stream as JSON → parse error → span has no output/metrics.
- Even if a data: line happened to come first, lines 218–219 would attempt BraintrustJsonMapper.get().readValue(data, ChatCompletionChunk.class) on a Responses API event object → deserialization error → span has no output/metrics.
In both cases the error is caught and logged, but the span is silently incomplete.
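The first failure could be avoided with a broader SSE check, roughly as sketched here. It assumes the check at line 176 accepts only a leading data: line, as the description above implies. SseSniffer and looksLikeSse are hypothetical names and are not present in TracingHttpClient.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of the Braintrust SDK): decides SSE vs. plain JSON. */
final class SseSniffer {

    /** Returns true if the first non-blank line looks like an SSE field or comment. */
    static boolean looksLikeSse(byte[] body) {
        String text = new String(body, StandardCharsets.UTF_8);
        for (String line : text.split("\r?\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) continue; // skip leading blank lines
            // SSE streams may open with any field name or a ':' comment, not only "data:".
            return trimmed.startsWith("data:")
                    || trimmed.startsWith("event:")
                    || trimmed.startsWith("id:")
                    || trimmed.startsWith("retry:")
                    || trimmed.startsWith(":");
        }
        return false; // empty body: treat as non-SSE
    }

    public static void main(String[] args) {
        byte[] responsesStream = "event: response.created\ndata: {}\n\n"
                .getBytes(StandardCharsets.UTF_8);
        byte[] plainJson = "{\"id\":\"resp_1\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(looksLikeSse(responsesStream));
        System.out.println(looksLikeSse(plainJson));
    }
}
```

This guard only routes the body to the SSE branch. The second failure (deserializing into ChatCompletionChunk) still needs the Responses-specific reassembly path described under "What is missing".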
Braintrust docs status
- The Java SDK README and Braintrust docs do not explicitly document Responses API streaming support: not_found
- Non-streaming Responses API is handled in code but not documented either.
Upstream sources
- ResponseAccumulator: available in com.openai:openai-java (the SDK already depended on at [2.8.0,)), analogous to ChatCompletionAccumulator
- ResponsesStructuredOutputsStreamingExample.java
Local files inspected
- braintrust-sdk/instrumentation/openai_2_8_0/src/main/java/dev/braintrust/instrumentation/openai/v2_8_0/TracingHttpClient.java, lines 205–231 (tagSpanFromSseBytes hardcodes ChatCompletionAccumulator)
- braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java, lines 99–104, 116–150 (correctly handles Responses API fields for non-streaming)
- braintrust-sdk/instrumentation/openai_2_8_0/src/test/java/.../BraintrustOpenAITest.java, has testWrapOpenAiResponses (non-streaming only, no streaming Responses API test)