Skip to content

Fix access token issue when connection pool enabled#1403

Open
absci wants to merge 11 commits intodevfrom
fix_connection_pool
Open

Fix access token issue when connection pool enabled#1403
absci wants to merge 11 commits intodevfrom
fix_connection_pool

Conversation

@absci
Copy link
Contributor

@absci absci commented Jul 19, 2022

No description provided.

@absci absci force-pushed the fix_connection_pool branch 10 times, most recently from b2dde91 to 97983ce Compare July 21, 2022 20:26
@absci absci force-pushed the fix_connection_pool branch 4 times, most recently from 870b7e5 to 72dd098 Compare July 26, 2022 06:02
@absci absci force-pushed the fix_connection_pool branch from 72dd098 to ae1f808 Compare July 26, 2022 06:08
@codecov
Copy link

codecov bot commented Jul 27, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.63%. Comparing base (109b8bc) to head (2fe3440).
Report is 105 commits behind head on dev.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##              dev    #1403      +/-   ##
==========================================
+ Coverage   83.61%   83.63%   +0.02%     
==========================================
  Files          22       22              
  Lines        7853     8727     +874     
==========================================
+ Hits         6566     7299     +733     
- Misses       1287     1428     +141     

see 44 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@absci absci force-pushed the fix_connection_pool branch from 8bc5e4e to 71e7dcd Compare July 27, 2022 19:03
@absci absci force-pushed the fix_connection_pool branch from 71e7dcd to 6293a91 Compare July 27, 2022 19:41
@absci absci requested a review from v-mabarw August 2, 2022 21:47
@v-mabarw v-mabarw requested a review from v-makouz August 29, 2022 23:26
@v-makouz
Copy link
Contributor

Fix for issue #1396

@ibakirov
Copy link

Any update on this?

absci pushed a commit to absci/msphpsql that referenced this pull request Nov 4, 2025
- Identified critical security vulnerability (authentication bypass)
- Analyzed proposed fix for access token pooling issue
- Raised concerns about platform coverage (Windows NTS only)
- Documented testing and security requirements
- Provided recommendations for improvement
@David-Engel
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@David-Engel
Copy link
Collaborator

AI Analysis of this PR

The PR attempts the token-cache approach for Windows NTS builds while keeping the existing (buggy) early-free for ZTS. It's the right direction, but has several serious issues:

1. Double-free: dual ownership of token memory

The same ACCESSTOKEN* is stored in two places:

  • conn->azure_ad_access_token — a sqlsrv_malloc_auto_ptr that owns the memory
  • The global access_tokens[] array — a raw pointer

When the connection is destroyed (core_sqlsrv_closesqlsrv_free(conn)), the auto_ptr destructor frees the token. Then at PHP_RSHUTDOWN, the cleanup loop tries to free it again via the global array → double free / use-after-free.

2. Double-free on same-token reuse path

When the same token content is found in the cache:

sqlsrv_free(get_access_tokens()[current_token_index]); // frees the buffer

This frees the allocation that the previous connection's azure_ad_access_token auto_ptr still thinks it owns. When that old connection is eventually closed, the auto_ptr destructor triggers another free of already-freed memory.

3. Doesn't actually achieve pooling for identical tokens

The "same token found" path:

  1. Frees the old allocation at slot X
  2. Allocates a new buffer (different pointer address)
  3. Stores the new pointer in slot X

But the ODBC driver already hashed the old pointer for the pooled connection's key. The new pointer has a different address → different HashPtr result → pool miss. The only way to get a pool hit is to reuse the exact same pointer address, which this code doesn't guarantee.

4. Windows NTS only

The fix is guarded by #if defined(_WIN32) && !defined(ZTS). Linux builds and thread-safe (ZTS) builds still have the original early-free bug with its dangling pointer and random pool behavior.

5. Minor issues

  • Typo in comment: "access_toiens"
  • O(n) linear scan with per-character string reconstruction on every connection — could be expensive with many distinct tokens
  • sqlsrv_realloc on every new token grows the array one slot at a time

What a correct implementation would need

For the token cache to work properly:

  1. Single owner: The global cache must be the sole owner of token allocations. conn->azure_ad_access_token should store a non-owning raw pointer (or be removed), and the early free must be deleted.
  2. Same pointer for same content: When the same token content is found, return the existing pointer — don't free-and-reallocate. This is what makes HashPtr produce a pool hit.
  3. Reference counting or epoch-based cleanup: Track how many live connections reference each cached token, so you know when it's safe to free (and securely wipe).
  4. Platform-agnostic: The fix should work on Linux and ZTS too, with appropriate synchronization for thread safety.

Tests

For a fix that changes connection pooling semantics around access tokens, you'd want tests covering at minimum:

  1. Same token → pool hit: Two connections with identical access tokens should reuse the same pooled connection (verify with connection ID or session-level state).

  2. Different tokens → pool miss: Two connections with different access tokens must get distinct connections (verify with SELECT USER_NAME() returning different identities — the exact scenario from issue An existing connection in the connection pool is reused unexpectedly even though it has a different access token. #1396).

  3. Token lifetime / no access violation: Open a connection with a token, close it, open another with a different token, use it — no crash or memory corruption.

  4. Pooling disabled: With ConnectionPooling = false, each connection is always fresh regardless of token content.

The existing tests cover basic access token validation (empty token, conflicting credentials) and general connection pooling, but there's no test that combines both — i.e., testing pooling behavior when access tokens are involved.

A proper fix for #1396 should include tests like:

Test 1: Different tokens must not share a pooled connection

// Connect with token1, get identity
$conn1 = new PDO("sqlsrv:server=$server; Database=$db; AccessToken=$token1; ConnectionPooling=1");
$user1 = $conn1->query("SELECT USER_NAME()")->fetchColumn();
unset($conn1); // returns to pool

// Connect with token2, must NOT get token1's connection
$conn2 = new PDO("sqlsrv:server=$server; Database=$db; AccessToken=$token2; ConnectionPooling=1");
$user2 = $conn2->query("SELECT USER_NAME()")->fetchColumn();
unset($conn2);

assert($user1 !== $user2, "Different tokens must yield different identities");

Test 2: Same token should reuse a pooled connection

$conn1 = new PDO("sqlsrv:server=$server; Database=$db; AccessToken=$token1; ConnectionPooling=1");
$spid1 = $conn1->query("SELECT @@SPID")->fetchColumn();
unset($conn1);

$conn2 = new PDO("sqlsrv:server=$server; Database=$db; AccessToken=$token1; ConnectionPooling=1");
$spid2 = $conn2->query("SELECT @@SPID")->fetchColumn();
unset($conn2);

assert($spid1 === $spid2, "Same token should reuse pooled connection");

Test 3: Pooling disabled always creates fresh connections

$conn1 = new PDO("sqlsrv:server=$server; Database=$db; AccessToken=$token1; ConnectionPooling=0");
$spid1 = $conn1->query("SELECT @@SPID")->fetchColumn();
unset($conn1);

$conn2 = new PDO("sqlsrv:server=$server; Database=$db; AccessToken=$token1; ConnectionPooling=0");
$spid2 = $conn2->query("SELECT @@SPID")->fetchColumn();
unset($conn2);

assert($spid1 !== $spid2, "Pooling disabled must not reuse connections");

These tests require Azure AD infrastructure (two identities with tokens), so they'd need to be gated behind environment variables similar to the existing access_token.inc pattern, running only in CI environments with the proper Azure setup. But without them, there's no way to verify the fix actually solves the problem — and no regression safety net.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants