[PECOBLR-1928] Add AI coding agent detection to User-Agent header#332
vikrantpuppala wants to merge 1 commit into databricks:main
Pull request overview
Adds AI coding-agent attribution to outbound HTTP User-Agent headers by detecting known agent-specific environment variables and appending an agent/<product> token to the existing connector User-Agent.
Changes:
- Added `lib/utils/agentDetector.ts` to detect exactly one known agent via env vars.
- Updated `buildUserAgentString()` to append `agent/<product>` when detected.
- Expanded unit test regex (and added new detector tests) to accommodate/validate agent detection behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lib/utils/agentDetector.ts | Introduces env-var-based agent detection with “exactly one” rule. |
| lib/utils/buildUserAgentString.ts | Appends agent/<product> to the generated User-Agent string. |
| tests/unit/utils/agentDetector.test.ts | Adds unit coverage for detection outcomes across agents and edge cases. |
| tests/unit/utils/utils.test.ts | Updates User-Agent regex to allow an optional agent/<product> suffix. |
Comments suppressed due to low confidence (2)
tests/unit/utils/utils.test.ts:35
- The inline spec/commentary for the User-Agent format above `checkUserAgentString()` still describes the header as ending at the closing `)` of the comment, but the regex now allows an optional `agent/<product>` suffix. Update the test documentation/examples to reflect the new optional product token so the comments stay consistent with what the code accepts.
function checkUserAgentString(ua: string, userAgentEntry?: string) {
// Prefix: 'NodejsDatabricksSqlConnector/'
// Version: three period-separated digits and optional suffix
const re =
/^(?<productName>NodejsDatabricksSqlConnector)\/(?<productVersion>\d+\.\d+\.\d+(-[^(]+)?)\s*\((?<comment>[^)]+)\)(\s+agent\/[a-z-]+)?$/i;
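To make the effect of the optional suffix concrete, here is a small sketch that exercises the same pattern against a few headers. The sample strings (version number, comment contents) are invented for illustration, and the named groups are written as plain groups to keep the sketch short; matching behavior is otherwise unchanged.

```typescript
// Same pattern as the regex in checkUserAgentString(), with the named groups
// written as plain groups. The trailing group is the new optional suffix.
const userAgentPattern =
  /^(NodejsDatabricksSqlConnector)\/(\d+\.\d+\.\d+(-[^(]+)?)\s*\(([^)]+)\)(\s+agent\/[a-z-]+)?$/i;

// Hypothetical sample headers; the version and comment contents are made up.
const withoutAgent = 'NodejsDatabricksSqlConnector/1.2.3 (Node.js 18.0.0; Linux 5.15)';
const withAgent = withoutAgent + ' agent/claude-code';
const emptyProduct = withoutAgent + ' agent/';
const underscoreProduct = withoutAgent + ' agent/claude_code';

const results = {
  // The suffix is optional, so a header with no agent token still matches.
  withoutAgent: userAgentPattern.test(withoutAgent),
  // A well-formed agent token matches.
  withAgent: userAgentPattern.test(withAgent),
  // "agent/" with no product name is rejected ([a-z-]+ needs one character).
  emptyProduct: userAgentPattern.test(emptyProduct),
  // Underscores are outside [a-z-], so this is rejected as well.
  underscoreProduct: userAgentPattern.test(underscoreProduct),
};
```

Because the header without a suffix also matches, this pattern alone cannot tell whether the suffix was appended when it should have been; it only rejects malformed suffixes.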
lib/utils/agentDetector.ts:8
- The docstring says an agent env var being "present" is sufficient for detection, but the implementation treats only truthy/non-empty values as present (`env[a.envVar]`). Consider clarifying the comment to say the variable must be set to a non-empty value (to match the behavior and tests).
* Detection only succeeds when exactly one agent environment variable is present,
* to avoid ambiguous attribution when multiple agent environments overlap.
*
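The two rules discussed here (non-empty values only, and exactly one match) can be sketched as follows. This is an illustration, not the contents of `lib/utils/agentDetector.ts`: the `Env` and `KnownAgent` types, the `product` field name, the required `env` parameter, and the empty-string return value are assumptions. The `knownAgents` name, the `envVar` field, and the product/variable pairs come from the PR.

```typescript
// Sketch only: the real lib/utils/agentDetector.ts may differ in names and types.
type Env = { [name: string]: string | undefined };

interface KnownAgent {
  envVar: string;
  product: string; // assumed field name for the agent/<product> token
}

// Product / environment-variable pairs as listed in the PR description.
const knownAgents: KnownAgent[] = [
  { envVar: 'ANTIGRAVITY_AGENT', product: 'antigravity' },
  { envVar: 'CLAUDECODE', product: 'claude-code' },
  { envVar: 'CLINE_ACTIVE', product: 'cline' },
  { envVar: 'CODEX_CI', product: 'codex' },
  { envVar: 'CURSOR_AGENT', product: 'cursor' },
  { envVar: 'GEMINI_CLI', product: 'gemini-cli' },
  { envVar: 'OPENCODE', product: 'opencode' },
];

// The env object is passed in explicitly so the sketch stays self-contained;
// a caller in the driver would presumably pass process.env.
function detectAgent(env: Env): string {
  // A variable counts only when set to a non-empty value (truthiness check),
  // so unset, undefined, and '' are all treated as absent.
  const detected = knownAgents.filter((a) => Boolean(env[a.envVar]));
  // Zero matches and several matches are both reported as "no agent".
  return detected.length === 1 ? detected[0].product : '';
}

const single = detectAgent({ CLAUDECODE: '1' });
const ambiguous = detectAgent({ CLAUDECODE: '1', CURSOR_AGENT: '1' });
const emptyValue = detectAgent({ CLAUDECODE: '' });
```

With these inputs, `single` is `'claude-code'`, while `ambiguous` and `emptyValue` are both `''`. Supporting another agent in this sketch means adding one entry to `knownAgents`.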
let ua = `${productName}/${packageVersion} (${extra.join('; ')})`;

const agentProduct = detectAgent();
if (agentProduct) {
  ua += ` agent/${agentProduct}`;
}
The new behavior of appending agent/<product> to the User-Agent isn't directly asserted anywhere. The updated regex makes existing tests pass even when the suffix is present, but it won't catch regressions where the suffix is missing or malformed when an agent env var is set. Add a unit test for buildUserAgentString() that sets a known agent env var (and cleans it up) and asserts the suffix is appended as expected.
Good catch — added a dedicated test (appends agent suffix when agent env var is set) in the latest push that sets CLAUDECODE=1, calls buildUserAgentString(), and asserts the output includes agent/claude-code.
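A test of this shape might look like the sketch below. It is not the test from the push: `buildUserAgentSketch` is a hypothetical stand-in for `buildUserAgentString()`, `withEnvVar` is an illustrative helper, and the env object is a plain map so the example runs on its own (a real test would operate on `process.env`).

```typescript
type Env = { [name: string]: string | undefined };

// Hypothetical stand-in for buildUserAgentString(); only the suffix logic matters here.
function buildUserAgentSketch(env: Env): string {
  let ua = 'NodejsDatabricksSqlConnector/0.0.0 (sketch)';
  if (env['CLAUDECODE']) {
    ua += ' agent/claude-code';
  }
  return ua;
}

// Sets one variable for the duration of fn, then restores the previous state,
// even if fn throws. This keeps one test from leaking env changes into the next.
function withEnvVar<T>(env: Env, name: string, value: string, fn: () => T): T {
  const hadValue = Object.prototype.hasOwnProperty.call(env, name);
  const previous = env[name];
  env[name] = value;
  try {
    return fn();
  } finally {
    if (hadValue) {
      env[name] = previous;
    } else {
      delete env[name];
    }
  }
}

const testEnv: Env = {}; // a real test would use process.env here
const uaWithAgent = withEnvVar(testEnv, 'CLAUDECODE', '1', () => buildUserAgentSketch(testEnv));

// Assert on the exact suffix, not just on overall format validity.
const suffixAppended = / agent\/claude-code$/.test(uaWithAgent);
// The variable must be gone again after the test body has run.
const cleanedUp = !('CLAUDECODE' in testEnv);
```

Checking for the exact suffix is what closes the gap raised in the review: a format regex with an optional suffix passes whether or not the token is present.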
Detect when the Node.js SQL driver is invoked by an AI coding agent (e.g. Claude Code, Cursor, Gemini CLI) by checking well-known environment variables, and append `agent/<product>` to the User-Agent string. This enables Databricks to understand how much driver usage originates from AI coding agents.

Detection only succeeds when exactly one agent is detected to avoid ambiguous attribution. Mirrors the approach in databricks/cli#4287.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
661181f to 7fc52b6
Recreating from databricks org branch to fix CI permissions.
Summary
- Adds an `agentDetector.ts` module that detects 7 AI coding agents (Claude Code, Cursor, Gemini CLI, Cline, Codex, OpenCode, Antigravity) by checking well-known environment variables they set in spawned shell processes
- Updates `buildUserAgentString()` to append `agent/<product>` to the User-Agent header
Approach
Mirrors the implementation in databricks/cli#4287 and aligns with the latest agent list in `libs/agent/agent.go`.
| Product | Environment variable |
|---|---|
| antigravity | ANTIGRAVITY_AGENT |
| claude-code | CLAUDECODE |
| cline | CLINE_ACTIVE |
| codex | CODEX_CI |
| cursor | CURSOR_AGENT |
| gemini-cli | GEMINI_CLI |
| opencode | OPENCODE |
Adding a new agent requires only a new entry in the `knownAgents` array.
Changes
- `lib/utils/agentDetector.ts`: environment-variable-based agent detection with injectable env object for testability
- `lib/utils/buildUserAgentString.ts`: calls `detectAgent()` and appends `agent/<product>` to the User-Agent string
- `tests/unit/utils/utils.test.ts`: updated User-Agent regex to allow optional `agent/<product>` suffix
- `tests/unit/utils/agentDetector.test.ts`: 11 test cases covering all agents, no agent, multiple agents, empty/undefined values
Test plan
- `agentDetector.test.ts`: 11 tests pass
- `utils.test.ts` (buildUserAgentString): all 3 existing tests continue to pass
- User-Agent contains `agent/claude-code` when run from Claude Code, checked via `NODE_DEBUG=http`
- `SELECT 1` ran successfully against dogfood warehouse: `[{"1":1}]`
🤖 Generated with Claude Code