-
Notifications
You must be signed in to change notification settings - Fork 136
[PECOBLR-1928] Add AI coding agent detection to User-Agent header #740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| """ | ||
| Detects whether the Python SQL connector is being invoked by an AI coding agent | ||
| by checking for well-known environment variables that agents set in their spawned | ||
| shell processes. | ||
|
|
||
| Detection only succeeds when exactly one agent environment variable is present, | ||
| to avoid ambiguous attribution when multiple agent environments overlap. | ||
|
|
||
| Adding a new agent requires only a new entry in KNOWN_AGENTS. | ||
|
|
||
| References for each environment variable: | ||
| - ANTIGRAVITY_AGENT: Closed source. Google Antigravity sets this variable. | ||
| - CLAUDECODE: https://github.com/anthropics/claude-code (sets CLAUDECODE=1) | ||
| - CLINE_ACTIVE: https://github.com/cline/cline (shipped in v3.24.0) | ||
| - CODEX_CI: https://github.com/openai/codex (part of UNIFIED_EXEC_ENV array in codex-rs) | ||
| - CURSOR_AGENT: Closed source. Referenced in a gist by johnlindquist. | ||
| - GEMINI_CLI: https://google-gemini.github.io/gemini-cli/docs/tools/shell.html (sets GEMINI_CLI=1) | ||
| - OPENCODE: https://github.com/opencode-ai/opencode (sets OPENCODE=1) | ||
| """ | ||
|
|
||
| import os | ||
|
|
||
| KNOWN_AGENTS = [ | ||
| ("ANTIGRAVITY_AGENT", "antigravity"), | ||
| ("CLAUDECODE", "claude-code"), | ||
| ("CLINE_ACTIVE", "cline"), | ||
| ("CODEX_CI", "codex"), | ||
| ("CURSOR_AGENT", "cursor"), | ||
| ("GEMINI_CLI", "gemini-cli"), | ||
| ("OPENCODE", "opencode"), | ||
| ] | ||
|
|
||
|
|
||
| def detect(env=None): | ||
| """Detect which AI coding agent (if any) is driving the current process. | ||
|
|
||
| Args: | ||
| env: Optional dict-like object for environment variable lookup. | ||
| Defaults to os.environ. Exists for testability. | ||
|
|
||
| Returns: | ||
| The agent product string if exactly one agent is detected, | ||
| or an empty string otherwise. | ||
| """ | ||
| if env is None: | ||
| env = os.environ | ||
|
|
||
| detected = [product for var, product in KNOWN_AGENTS if env.get(var)] | ||
|
|
||
| if len(detected) == 1: | ||
| return detected[0] | ||
| return "" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -914,12 +914,18 @@ def build_client_context(server_hostname: str, version: str, **kwargs): | |
| ) | ||
|
|
||
| # Build user agent | ||
| from databricks.sql.common.agent import detect as detect_agent | ||
|
||
|
|
||
| user_agent_entry = kwargs.get("user_agent_entry", "") | ||
| if user_agent_entry: | ||
| user_agent = f"PyDatabricksSqlConnector/{version} ({user_agent_entry})" | ||
| else: | ||
| user_agent = f"PyDatabricksSqlConnector/{version}" | ||
|
|
||
| agent_product = detect_agent() | ||
| if agent_product: | ||
| user_agent += f" agent/{agent_product}" | ||
|
Comment on lines
+925
to
+927
|
||
|
|
||
| # Explicitly construct ClientContext with proper types | ||
| return ClientContext( | ||
| hostname=server_hostname, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| import pytest | ||
| from databricks.sql.common.agent import detect, KNOWN_AGENTS | ||
|
|
||
|
|
||
| class TestAgentDetection: | ||
| def test_detects_single_agent_claude_code(self): | ||
| assert detect({"CLAUDECODE": "1"}) == "claude-code" | ||
|
|
||
| def test_detects_single_agent_cursor(self): | ||
| assert detect({"CURSOR_AGENT": "1"}) == "cursor" | ||
|
|
||
| def test_detects_single_agent_gemini_cli(self): | ||
| assert detect({"GEMINI_CLI": "1"}) == "gemini-cli" | ||
|
|
||
| def test_detects_single_agent_cline(self): | ||
| assert detect({"CLINE_ACTIVE": "1"}) == "cline" | ||
|
|
||
| def test_detects_single_agent_codex(self): | ||
| assert detect({"CODEX_CI": "1"}) == "codex" | ||
|
|
||
| def test_detects_single_agent_opencode(self): | ||
| assert detect({"OPENCODE": "1"}) == "opencode" | ||
|
|
||
| def test_detects_single_agent_antigravity(self): | ||
| assert detect({"ANTIGRAVITY_AGENT": "1"}) == "antigravity" | ||
|
|
||
| def test_returns_empty_when_no_agent_detected(self): | ||
| assert detect({}) == "" | ||
|
|
||
| def test_returns_empty_when_multiple_agents_detected(self): | ||
| assert detect({"CLAUDECODE": "1", "CURSOR_AGENT": "1"}) == "" | ||
|
|
||
| def test_ignores_empty_env_var_values(self): | ||
| assert detect({"CLAUDECODE": ""}) == "" | ||
|
|
||
| def test_all_known_agents_are_covered(self): | ||
| for env_var, product in KNOWN_AGENTS: | ||
| assert detect({env_var: "1"}) == product, ( | ||
| f"Agent with env var {env_var} should be detected as {product}" | ||
| ) | ||
|
|
||
| def test_defaults_to_os_environ(self, monkeypatch): | ||
| monkeypatch.delenv("CLAUDECODE", raising=False) | ||
| monkeypatch.delenv("CURSOR_AGENT", raising=False) | ||
| monkeypatch.delenv("GEMINI_CLI", raising=False) | ||
| monkeypatch.delenv("CLINE_ACTIVE", raising=False) | ||
| monkeypatch.delenv("CODEX_CI", raising=False) | ||
| monkeypatch.delenv("OPENCODE", raising=False) | ||
| monkeypatch.delenv("ANTIGRAVITY_AGENT", raising=False) | ||
| # With all agent vars cleared, detect() should return empty | ||
| assert detect() == "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding integration tests to verify that the agent detection is properly integrated into the User-Agent header. The existing test_useragent_header in test_session.py could be extended to verify that when an agent environment variable is set, the User-Agent header includes the agent suffix. This would ensure the integration works end-to-end, not just the detection logic in isolation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The detection logic is fully covered by unit tests in
test_agent_detection.py. The integration insession.pyis a 3-line append that is straightforward. Adding an integration test here would require mocking the full Session constructor which adds complexity without meaningful coverage gain.