[PECOBLR-1928] Add AI coding agent detection to User-Agent header #332

Closed
vikrantpuppala wants to merge 1 commit into databricks:main from vikrantpuppala:agent-detection

Conversation

@vikrantpuppala
Collaborator

Summary

  • Adds agentDetector.ts module that detects 7 AI coding agents (Claude Code, Cursor, Gemini CLI, Cline, Codex, OpenCode, Antigravity) by checking well-known environment variables they set in spawned shell processes
  • Integrates detection into buildUserAgentString() to append agent/<product> to the User-Agent header
  • Uses exactly-one detection rule: if zero or multiple agent env vars are set, no agent is attributed (avoids ambiguity)

Approach

Mirrors the implementation in databricks/cli#4287 and aligns with the latest agent list in libs/agent/agent.go.

| Agent | Product String | Environment Variable |
| --- | --- | --- |
| Google Antigravity | antigravity | ANTIGRAVITY_AGENT |
| Claude Code | claude-code | CLAUDECODE |
| Cline | cline | CLINE_ACTIVE |
| OpenAI Codex | codex | CODEX_CI |
| Cursor | cursor | CURSOR_AGENT |
| Gemini CLI | gemini-cli | GEMINI_CLI |
| OpenCode | opencode | OPENCODE |

Adding a new agent requires only a new entry in the knownAgents array.
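
For readers without the diff at hand, the exactly-one rule described above could look roughly like the sketch below. This is not the code from lib/utils/agentDetector.ts: the `KnownAgent` and `Env` types, the field names, and the empty-string return value are assumptions made for illustration, and `env` is passed explicitly here so the snippet does not depend on Node.js typings.

```typescript
// Illustrative sketch only; not the actual agentDetector.ts from this PR.
// Env, KnownAgent, and the '' return value are assumptions.
type Env = Record<string, string | undefined>;

interface KnownAgent {
  envVar: string; // environment variable the agent sets in spawned shells
  product: string; // token appended to the User-Agent as agent/<product>
}

// Entries taken from the table in the PR description.
const knownAgents: KnownAgent[] = [
  { envVar: 'ANTIGRAVITY_AGENT', product: 'antigravity' },
  { envVar: 'CLAUDECODE', product: 'claude-code' },
  { envVar: 'CLINE_ACTIVE', product: 'cline' },
  { envVar: 'CODEX_CI', product: 'codex' },
  { envVar: 'CURSOR_AGENT', product: 'cursor' },
  { envVar: 'GEMINI_CLI', product: 'gemini-cli' },
  { envVar: 'OPENCODE', product: 'opencode' },
];

function detectAgent(env: Env): string {
  // Empty or undefined values count as "not set".
  const detected = knownAgents.filter((a) => Boolean(env[a.envVar]));
  // Attribute only when exactly one agent matches; zero or several yield ''.
  return detected.length === 1 ? detected[0].product : '';
}

// Usage: one agent is attributed, two agents are ambiguous and ignored.
const single = detectAgent({ CLAUDECODE: '1' });
const ambiguous = detectAgent({ CLAUDECODE: '1', CURSOR_AGENT: '1' });
```

Under this shape, supporting another agent is a one-line addition to `knownAgents`, which matches the claim above.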

Changes

  • New: lib/utils/agentDetector.ts — environment-variable-based agent detection with injectable env object for testability
  • Modified: lib/utils/buildUserAgentString.ts — calls detectAgent() and appends agent/<product> to the User-Agent string
  • Modified: tests/unit/utils/utils.test.ts — updated User-Agent regex to allow optional agent/<product> suffix
  • New: tests/unit/utils/agentDetector.test.ts — 11 test cases covering all agents, no agent, multiple agents, empty/undefined values

Test plan

  • agentDetector.test.ts — 11 tests pass
  • utils.test.ts (buildUserAgentString) — all 3 existing tests continue to pass
  • Manual: verified User-Agent contains agent/claude-code when run from Claude Code via NODE_DEBUG=http
    'User-Agent': [
      'NodejsDatabricksSqlConnector/1.12.0 (Node.js 22.19.0; Linux 5.4.0-1154-aws-fips) agent/claude-code'
    
  • Executed SELECT 1 successfully against dogfood warehouse: [{"1":1}]

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings February 24, 2026 08:52
@vikrantpuppala changed the title from "Add AI coding agent detection to User-Agent header" to "[PECOBLR-1928] Add AI coding agent detection to User-Agent header" Feb 24, 2026

Copilot AI left a comment

Pull request overview

Adds AI coding-agent attribution to outbound HTTP User-Agent headers by detecting known agent-specific environment variables and appending an agent/<product> token to the existing connector User-Agent.

Changes:

  • Added lib/utils/agentDetector.ts to detect exactly one known agent via env vars.
  • Updated buildUserAgentString() to append agent/<product> when detected.
  • Expanded unit test regex (and added new detector tests) to accommodate/validate agent detection behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| lib/utils/agentDetector.ts | Introduces env-var-based agent detection with “exactly one” rule. |
| lib/utils/buildUserAgentString.ts | Appends agent/<product> to the generated User-Agent string. |
| tests/unit/utils/agentDetector.test.ts | Adds unit coverage for detection outcomes across agents and edge cases. |
| tests/unit/utils/utils.test.ts | Updates User-Agent regex to allow an optional agent/<product> suffix. |

Comments suppressed due to low confidence (2)

tests/unit/utils/utils.test.ts:35

  • The inline spec/commentary for the User-Agent format above checkUserAgentString() still describes the header as ending at the closing ) of the comment, but the regex now allows an optional agent/<product> suffix. Update the test documentation/examples to reflect the new optional product token so the comments stay consistent with what the code accepts.
  function checkUserAgentString(ua: string, userAgentEntry?: string) {
    // Prefix: 'NodejsDatabricksSqlConnector/'
    // Version: three period-separated digits and optional suffix
    const re =
      /^(?<productName>NodejsDatabricksSqlConnector)\/(?<productVersion>\d+\.\d+\.\d+(-[^(]+)?)\s*\((?<comment>[^)]+)\)(\s+agent\/[a-z-]+)?$/i;
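
To make the effect of the optional suffix concrete, the pattern quoted above can be exercised on a few strings. The sketch below is an editorial illustration, not part of the test file: the named capture groups are dropped so it compiles on older TypeScript targets, the base string is copied from the manual test output in the PR description, and the `accepted` and `rejected` lists are hypothetical inputs.

```typescript
// Same pattern as the quoted regex, minus the named capture groups
// (an editorial simplification; matching behaviour is unchanged).
const userAgentRe =
  /^NodejsDatabricksSqlConnector\/\d+\.\d+\.\d+(-[^(]+)?\s*\([^)]+\)(\s+agent\/[a-z-]+)?$/i;

// Base value copied from the manual verification output in this PR.
const base =
  'NodejsDatabricksSqlConnector/1.12.0 (Node.js 22.19.0; Linux 5.4.0-1154-aws-fips)';

// Hypothetical inputs: the suffix is optional, but when present it must be
// "agent/" followed by at least one letter or hyphen.
const accepted = [base, `${base} agent/claude-code`];
const rejected = [`${base} agent/`, `${base} agent/claude_code`, `${base} claude-code`];

const allAccepted = accepted.filter((ua) => userAgentRe.test(ua)).length === accepted.length;
const noneRejectedMatch = rejected.filter((ua) => userAgentRe.test(ua)).length === 0;
```

Because the group is optional, the pattern alone cannot tell whether a suffix was expected, so it validates the format of the suffix but not its presence.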

lib/utils/agentDetector.ts:8

  • The docstring says an agent env var being "present" is sufficient for detection, but the implementation treats only truthy/non-empty values as present (env[a.envVar]). Consider clarifying the comment to say the variable must be set to a non-empty value (to match the behavior and tests).
 * Detection only succeeds when exactly one agent environment variable is present,
 * to avoid ambiguous attribution when multiple agent environments overlap.
 *

Comment on lines +31 to +36
let ua = `${productName}/${packageVersion} (${extra.join('; ')})`;

const agentProduct = detectAgent();
if (agentProduct) {
  ua += ` agent/${agentProduct}`;
}

Copilot AI Feb 24, 2026

The new behavior of appending agent/<product> to the User-Agent isn't directly asserted anywhere. The updated regex makes existing tests pass even when the suffix is present, but it won't catch regressions where the suffix is missing or malformed when an agent env var is set. Add a unit test for buildUserAgentString() that sets a known agent env var (and cleans it up) and asserts the suffix is appended as expected.

Collaborator Author

Good catch — added a dedicated test (appends agent suffix when agent env var is set) in the latest push that sets CLAUDECODE=1, calls buildUserAgentString(), and asserts the output includes agent/claude-code.
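
The added test itself is not shown in this thread. A rough sketch of its shape, under stated assumptions, might look like the following: `env` is a local stand-in for `process.env`, `detectAgent()` is reduced to the Claude Code case, and `buildUserAgentString()` uses fixed version and platform placeholders copied from the manual test output. None of these stand-ins are the real implementations.

```typescript
// Stand-in for process.env so the sketch runs without Node.js typings (assumption).
const env: Record<string, string | undefined> = {};

// Simplified stand-in for detectAgent(): only knows Claude Code.
function detectAgent(): string {
  return env['CLAUDECODE'] ? 'claude-code' : '';
}

// Stand-in for buildUserAgentString(); version and platform details are
// fixed placeholders taken from the manual test output in this PR.
function buildUserAgentString(): string {
  const extra = ['Node.js 22.19.0', 'Linux 5.4.0-1154-aws-fips'];
  let ua = `NodejsDatabricksSqlConnector/1.12.0 (${extra.join('; ')})`;
  const agentProduct = detectAgent();
  if (agentProduct) {
    ua += ` agent/${agentProduct}`;
  }
  return ua;
}

// Shape of the test: set the variable, assert on the suffix, always clean up.
function testAppendsAgentSuffix(): void {
  const previous = env['CLAUDECODE'];
  env['CLAUDECODE'] = '1';
  try {
    const ua = buildUserAgentString();
    if (ua.indexOf(' agent/claude-code') === -1) {
      throw new Error(`missing agent suffix: ${ua}`);
    }
  } finally {
    // Restore the previous value so later tests see an unchanged environment.
    if (previous === undefined) {
      delete env['CLAUDECODE'];
    } else {
      env['CLAUDECODE'] = previous;
    }
  }
}

testAppendsAgentSuffix();
```

Restoring the variable in `finally` matters because the detector reads shared process state; a leaked `CLAUDECODE=1` would change the outcome of any later test that expects no suffix.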

Detect when the Node.js SQL driver is invoked by an AI coding agent
(e.g. Claude Code, Cursor, Gemini CLI) by checking well-known
environment variables, and append `agent/<product>` to the User-Agent
string.

This enables Databricks to understand how much driver usage originates
from AI coding agents. Detection only succeeds when exactly one agent
is detected to avoid ambiguous attribution.

Mirrors the approach in databricks/cli#4287.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
@vikrantpuppala
Collaborator Author

Recreating from databricks org branch to fix CI permissions.
