Summary
The OpenAI Responses API response handler in InstrumentationSemConv.tagOpenAIResponse() extracts output_tokens_details.reasoning_tokens as completion_reasoning_tokens (lines 144–150), but does not extract the sibling field input_tokens_details.cached_tokens. This means prompt caching usage on Responses API calls is invisible in Braintrust metrics, even though the code already demonstrates the pattern for extracting nested token details at this level.
This is distinct from #58 (which covers Chat Completions prompt_tokens_details.cached_tokens and completion_tokens_details.reasoning_tokens) — the Responses API uses different field names (input_tokens_details/output_tokens_details instead of prompt_tokens_details/completion_tokens_details) and a different code path.
What is missing
In InstrumentationSemConv.tagOpenAIResponse() (lines 144–150), only output_tokens_details is checked:
// Reasoning tokens (Responses API)
if (usage.has("output_tokens_details")) {
    JsonNode details = usage.get("output_tokens_details");
    if (details.has("reasoning_tokens")) {
        metrics.put("completion_reasoning_tokens", details.get("reasoning_tokens"));
    }
}
The missing extraction:
if (usage.has("input_tokens_details")) {
    JsonNode details = usage.get("input_tokens_details");
    if (details.has("cached_tokens")) {
        metrics.put("prompt_cached_tokens", details.get("cached_tokens"));
    }
}
A real Responses API usage object with prompt caching looks like:
{
  "input_tokens": 9708,
  "output_tokens": 167,
  "total_tokens": 9875,
  "input_tokens_details": {
    "cached_tokens": 5578
  },
  "output_tokens_details": {
    "reasoning_tokens": 0
  }
}
Today, reasoning_tokens is captured but cached_tokens is silently dropped.
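To make the gap concrete, here is a minimal standalone sketch of the proposed extraction applied to the usage object above. It assumes Jackson (already used by the handler's JsonNode-based code) is on the classpath; the extractDetailMetrics helper and the Map<String, Long> return type are illustrative, not the actual InstrumentationSemConv signature.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.HashMap;
import java.util.Map;

public class CachedTokensSketch {
    // Hypothetical helper mirroring the proposed fix: extract both nested
    // token-detail fields from a Responses API usage object, rather than
    // only output_tokens_details.reasoning_tokens.
    static Map<String, Long> extractDetailMetrics(JsonNode usage) {
        Map<String, Long> metrics = new HashMap<>();
        if (usage.has("input_tokens_details")) {
            JsonNode details = usage.get("input_tokens_details");
            if (details.has("cached_tokens")) {
                metrics.put("prompt_cached_tokens", details.get("cached_tokens").asLong());
            }
        }
        if (usage.has("output_tokens_details")) {
            JsonNode details = usage.get("output_tokens_details");
            if (details.has("reasoning_tokens")) {
                metrics.put("completion_reasoning_tokens", details.get("reasoning_tokens").asLong());
            }
        }
        return metrics;
    }

    public static void main(String[] args) throws Exception {
        // The sample Responses API usage object from the report above.
        String json = "{\"input_tokens\":9708,\"output_tokens\":167,\"total_tokens\":9875,"
                + "\"input_tokens_details\":{\"cached_tokens\":5578},"
                + "\"output_tokens_details\":{\"reasoning_tokens\":0}}";
        JsonNode usage = new ObjectMapper().readTree(json);
        Map<String, Long> metrics = extractDetailMetrics(usage);
        System.out.println(metrics.get("prompt_cached_tokens"));
        System.out.println(metrics.get("completion_reasoning_tokens"));
    }
}
```

With the current handler, only the second metric would be recorded; the sketch shows both being captured from the same usage node.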
Braintrust docs status
- The Braintrust OpenAI integration docs at https://www.braintrust.dev/docs/integrations/ai-providers/openai do not mention cached token metrics for the Responses API
- The Gemini integration docs capture prompt_cached_tokens as a named metric, suggesting this is a recognized metric name in the Braintrust ecosystem
Upstream sources
- OpenAI Responses API: the usage object includes input_tokens_details.cached_tokens — confirmed in community discussion and the OpenAI prompt caching docs
- OpenAI Java SDK: the Response object's Usage class exposes inputTokensDetails() with cachedTokens()
Local files inspected
- braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java — lines 135–150 (tagOpenAIResponse Responses API usage handling; output_tokens_details.reasoning_tokens extracted at line 148, no input_tokens_details check)
- braintrust-sdk/instrumentation/openai_2_8_0/src/test/java/dev/braintrust/instrumentation/openai/v2_8_0/BraintrustOpenAITest.java — testWrapOpenAiResponses does not assert cached token metrics
- braintrust-sdk/instrumentation/genai_1_18_0/src/main/java/com/google/genai/BraintrustApiClient.java — line 145 shows prompt_cached_tokens is an established metric name (used for Gemini's cachedContentTokenCount)