feat: add SQL Server 2025 VECTOR type support#307

Open
dlevy-msft-sql wants to merge 1 commit into microsoft:main from dlevy-msft-sql:feature/vector-support

dlevy-msft-sql commented Jan 14, 2026

SQL Server 2025 VECTOR Type Support

Overview

This PR adds native support for SQL Server 2025's VECTOR data type, enabling efficient storage and retrieval of vector embeddings for AI/ML workloads in Go applications.

The VECTOR type is designed for similarity search scenarios, storing fixed-dimensional arrays of floating-point numbers optimized for operations like VECTOR_DISTANCE.

Features

Core Types

  • Vector - Vector type implementing driver.Valuer and sql.Scanner interfaces
  • NullVector - Nullable wrapper for database columns that allow NULL values
  • VectorElementType - Enum for element precision (float32/float16)

Element Type Support

| Type | Constant | Bytes per element | Max dimensions | Notes |
| --- | --- | --- | --- | --- |
| float32 | VectorElementFloat32 | 4 | 1998 | Default, fully supported |
| float16 | VectorElementFloat16 | 2 | 3996 | Preview feature, requires PREVIEW_FEATURES = ON |
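
The two dimension limits are consistent with each other once the 8-byte binary header described under "TDS Protocol Integration" is included. The following is a small arithmetic sketch, not code from this PR; `payloadSize` and the constant names are hypothetical. Both limits work out to the same total encoded size.

```go
package main

import "fmt"

// Sizes taken from the element type table and the header description in this PR.
const (
	headerBytes  = 8 // binary header size
	float32Bytes = 4 // bytes per float32 element
	float16Bytes = 2 // bytes per float16 element
)

// payloadSize returns the encoded size of a vector: header plus elements.
// Hypothetical helper, used only to illustrate the arithmetic.
func payloadSize(dimensions, bytesPerElement int) int {
	return headerBytes + dimensions*bytesPerElement
}

func main() {
	fmt.Println(payloadSize(1998, float32Bytes)) // 8 + 1998*4
	fmt.Println(payloadSize(3996, float16Bytes)) // 8 + 3996*2
}
```

Both calls yield 8000 bytes, which suggests the limits derive from a fixed byte budget rather than from an independent dimension cap.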

Wire Format Support

  • Binary TDS format: Native encoding/decoding matching SQL Server's internal format
  • JSON format: Backward-compatible parameter transmission for older drivers/servers
  • Automatic format selection: Uses binary when supported, falls back to JSON

Framework Compatibility

  • Direct support for []float32 and []float64 parameter binding (no wrapper needed)
  • Vector and NullVector types implement sql.Scanner for decoding binary data
  • Float16 vectors preserve ElementType metadata when scanned to Vector/NullVector

Connection String Option

vectortypesupport=v1    # Enable native binary TDS format (SQL Server 2025+)
vectortypesupport=off   # Default: JSON format for backward compatibility
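
As a rough illustration, the option can be appended to a URL-style connection string like any other query parameter. The sketch below uses only `net/url`; the host, credentials, and database name are placeholders, and `buildDSN` is a hypothetical helper rather than part of the driver.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDSN assembles a sqlserver:// URL that carries the vectortypesupport
// option. All argument values used in main are placeholders.
func buildDSN(host, user, password, database, vectorSupport string) string {
	q := url.Values{}
	q.Set("database", database)
	q.Set("vectortypesupport", vectorSupport) // "v1" or "off", per this PR
	u := url.URL{
		Scheme:   "sqlserver",
		User:     url.UserPassword(user, password),
		Host:     host,
		RawQuery: q.Encode(), // Encode sorts parameters by key
	}
	return u.String()
}

func main() {
	fmt.Println(buildDSN("localhost:1433", "sa", "secret", "vectors", "v1"))
}
```

The resulting string would then be passed to `sql.Open("sqlserver", dsn)` in an application that imports the driver.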

API Summary

Creating Vectors

// From float32 slice (default)
v, err := mssql.NewVector([]float32{1.0, 2.0, 3.0})

// From float64 slice (converted to float32)
v, err := mssql.NewVectorFromFloat64([]float64{1.0, 2.0, 3.0})

// With explicit element type (for float16)
v, err := mssql.NewVectorWithType(mssql.VectorElementFloat16, values)

Inserting Vectors

// Using Vector type
_, err = db.Exec("INSERT INTO embeddings (v) VALUES (@p1)", v)

// Using []float32 directly (convenient for frameworks)
_, err = db.Exec("INSERT INTO embeddings (v) VALUES (@p1)", []float32{1.0, 2.0, 3.0})

// Using []float64 (auto-converted to float32)
_, err = db.Exec("INSERT INTO embeddings (v) VALUES (@p1)", []float64{1.0, 2.0, 3.0})

Reading Vectors

// To Vector type (recommended)
var v mssql.Vector
err := row.Scan(&v)
fmt.Println(v.Dimensions(), v.Values())

// Nullable columns
var nv mssql.NullVector
err := row.Scan(&nv)
if nv.Valid {
    // Use nv.Vector
}

Similarity Search

queryVec, _ := mssql.NewVector([]float32{1.0, 0.0, 0.0})
rows, _ := db.Query(`
    SELECT name, VECTOR_DISTANCE('cosine', embedding, @p1) as distance
    FROM documents
    ORDER BY distance
`, queryVec)
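
For intuition about the `distance` column, cosine distance is conventionally defined as one minus cosine similarity. The following client-side sketch computes that quantity in plain Go; it is an illustration of the metric, not the server's implementation, and `cosineDistance` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosineDistance returns 1 - cos(angle between a and b).
// 0 means same direction, 1 orthogonal, 2 opposite.
func cosineDistance(a, b []float32) (float64, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("dimension mismatch: %d vs %d", len(a), len(b))
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0, errors.New("cosine distance is undefined for a zero vector")
	}
	return 1 - dot/(math.Sqrt(normA)*math.Sqrt(normB)), nil
}

func main() {
	d, err := cosineDistance([]float32{1, 0, 0}, []float32{0, 1, 0})
	fmt.Println(d, err)
}
```

Because smaller values mean closer vectors, `ORDER BY distance` in the query above returns the most similar rows first.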

Files Changed

New Files

| File | Purpose |
| --- | --- |
| vector.go | Core Vector/NullVector types, encoding/decoding, float16 conversion |
| vector_test.go | Comprehensive unit tests (50+ test cases) |
| vector_db_test.go | Database integration tests with SQL Server 2025 |
| doc/how-to-use-vectors.md | Complete usage guide with examples |

Modified Files

| File | Changes |
| --- | --- |
| types.go | Added typeVectorN constant, TDS read/write functions, type metadata |
| mssql.go | Vector parameter binding with binary/JSON format selection |
| mssql_go19.go | convertInputParameter for Vector types |
| tds.go | Feature extension negotiation for vector support |
| token.go | Feature ack parsing for vector support |
| msdsn/conn_str.go | vectortypesupport connection string parameter |
| msdsn/conn_str_test.go | Connection string parsing tests |
| tds_login_test.go | Login packet tests with vector feature extension |
| README.md | Added Vector to supported types list |
| CHANGELOG.md | Version 1.9.7 feature documentation |
| .gitignore | Additional test artifact patterns |

Implementation Details

TDS Protocol Integration

  • New feature extension featExtVECTORSUPPORT (0x0E) for LOGIN7 negotiation
  • Binary format: 8-byte header + float32/float16 payload
  • Header structure: magic (0xA9), version (0x01), dimensions (2 bytes), element type, reserved (3 bytes)
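
The header layout above can be sketched as a small encoder/decoder pair. This is a minimal illustration under stated assumptions, not the code in vector.go: it assumes little-endian byte order for the dimension count and the element values, and it assumes 0x00 is the element type code for float32. All function and constant names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

const (
	vectorMagic   = 0xA9 // from the header description
	vectorVersion = 0x01 // from the header description
	headerSize    = 8
	elemFloat32   = 0x00 // ASSUMPTION: element type code for float32
)

// encodeFloat32Vector writes the 8-byte header followed by the values.
// ASSUMPTION: little-endian byte order throughout.
func encodeFloat32Vector(values []float32) []byte {
	buf := make([]byte, headerSize+4*len(values))
	buf[0] = vectorMagic
	buf[1] = vectorVersion
	binary.LittleEndian.PutUint16(buf[2:4], uint16(len(values)))
	buf[4] = elemFloat32
	// buf[5:8] are the three reserved bytes, left as zero.
	for i, v := range values {
		binary.LittleEndian.PutUint32(buf[headerSize+4*i:], math.Float32bits(v))
	}
	return buf
}

// decodeFloat32Vector validates the header and returns the values.
func decodeFloat32Vector(data []byte) ([]float32, error) {
	if len(data) < headerSize {
		return nil, errors.New("vector payload shorter than header")
	}
	if data[0] != vectorMagic || data[1] != vectorVersion {
		return nil, fmt.Errorf("unexpected magic/version %#x/%#x", data[0], data[1])
	}
	if data[4] != elemFloat32 {
		return nil, fmt.Errorf("unsupported element type %#x", data[4])
	}
	n := int(binary.LittleEndian.Uint16(data[2:4]))
	if len(data) != headerSize+4*n {
		return nil, errors.New("payload length does not match dimensions")
	}
	out := make([]float32, n)
	for i := range out {
		out[i] = math.Float32frombits(binary.LittleEndian.Uint32(data[headerSize+4*i:]))
	}
	return out, nil
}

func main() {
	payload := encodeFloat32Vector([]float32{1, 2, 3})
	fmt.Printf("% x\n", payload[:headerSize])
	fmt.Println(decodeFloat32Vector(payload))
}
```

Checking the declared dimension count against the payload length guards against truncated or mismatched buffers before any element is decoded.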

NULL Handling

  • NullVector{Valid: false} is sent as NVARCHAR(1) NULL for dimension-agnostic NULL insertion
  • This workaround avoids SQL Server's requirement for matching dimensions on NULL vector parameters

Special Value Handling

  • NaN: Encoded as JSON null, decoded back to NaN for round-trip support
  • Infinity: Rejected with error in Value() since JSON cannot represent ±Inf losslessly
  • Precision warnings: Optional callback for float64→float32 precision loss detection
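
The NaN and infinity rules for the JSON path can be sketched as follows. This is an illustrative reimplementation of the stated behavior using only the standard library, not the PR's code; `vectorToJSON` and `jsonToVector` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// vectorToJSON renders values as a JSON array. NaN becomes null and
// infinities are rejected, mirroring the rules described above.
func vectorToJSON(values []float32) (string, error) {
	parts := make([]string, len(values))
	for i, v := range values {
		f := float64(v)
		switch {
		case math.IsInf(f, 0):
			return "", fmt.Errorf("element %d is infinite and cannot be encoded as JSON", i)
		case math.IsNaN(f):
			parts[i] = "null"
		default:
			// Shortest decimal text that round-trips as a float32.
			parts[i] = strconv.FormatFloat(f, 'g', -1, 32)
		}
	}
	return "[" + strings.Join(parts, ",") + "]", nil
}

// jsonToVector parses a JSON array, mapping null back to NaN.
func jsonToVector(s string) ([]float32, error) {
	var raw []*float64 // a nil pointer marks a JSON null
	if err := json.Unmarshal([]byte(s), &raw); err != nil {
		return nil, err
	}
	out := make([]float32, len(raw))
	for i, p := range raw {
		if p == nil {
			out[i] = float32(math.NaN())
		} else {
			out[i] = float32(*p)
		}
	}
	return out, nil
}

func main() {
	s, err := vectorToJSON([]float32{1.5, float32(math.NaN()), -2})
	fmt.Println(s, err)
	fmt.Println(jsonToVector(s))
}
```

Rejecting infinities at encode time surfaces the problem to the caller instead of silently storing a different value.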

Thread Safety

  • SetVectorPrecisionLossHandler() uses mutex for thread-safe handler updates
  • Precision warnings fire once per vector (first loss only) for performance
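
One way to get both properties (a handler that is safe to replace concurrently, and at most one warning per vector) is sketched below. The signature of `SetVectorPrecisionLossHandler` is not shown in this description, so the handler type and the names `setPrecisionLossHandler` and `toFloat32` are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// precisionLossHandler is a hypothetical callback shape: it receives the
// index of the first lossy element, the original value, and the result.
type precisionLossHandler func(index int, original float64, converted float32)

var (
	handlerMu sync.RWMutex
	handler   precisionLossHandler
)

// setPrecisionLossHandler replaces the handler under a write lock.
func setPrecisionLossHandler(h precisionLossHandler) {
	handlerMu.Lock()
	defer handlerMu.Unlock()
	handler = h
}

// toFloat32 converts values and reports only the first precision loss.
func toFloat32(values []float64) []float32 {
	// Snapshot the handler once so the loop runs without holding the lock.
	handlerMu.RLock()
	h := handler
	handlerMu.RUnlock()

	out := make([]float32, len(values))
	warned := false
	for i, v := range values {
		out[i] = float32(v)
		// v == v excludes NaN, which never compares equal to itself.
		if h != nil && !warned && v == v && float64(out[i]) != v {
			h(i, v, out[i])
			warned = true
		}
	}
	return out
}

func main() {
	setPrecisionLossHandler(func(i int, orig float64, conv float32) {
		fmt.Printf("precision loss at index %d: %v -> %v\n", i, orig, conv)
	})
	toFloat32([]float64{1.0, 0.1, 0.2})
}
```

Stopping after the first report keeps the per-element cost to a single comparison for large vectors.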

Driver Value Types

The readVectorType function returns []byte (raw binary vector payload), which is a standard database/sql/driver.Value type. This follows Go database driver conventions where drivers return primitive types and custom types handle decoding via sql.Scanner.

Applications should scan to Vector or NullVector types, which implement sql.Scanner and decode the binary representation automatically.
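
The division of labor described here (driver returns `[]byte`, custom type decodes it) follows the usual `sql.Scanner` pattern. Below is a minimal sketch of a nullable vector type; it is not the PR's NullVector. It assumes an 8-byte header followed by little-endian float32 values and skips header validation for brevity; `nullVector` is a hypothetical name.

```go
package main

import (
	"database/sql"
	"encoding/binary"
	"fmt"
	"math"
)

// nullVector is a hypothetical nullable vector type for illustration.
type nullVector struct {
	Values []float32
	Valid  bool // false when the column was NULL
}

// Compile-time check that *nullVector satisfies sql.Scanner.
var _ sql.Scanner = (*nullVector)(nil)

// Scan accepts nil (SQL NULL) or the raw binary payload from the driver.
// ASSUMPTION: 8-byte header, then little-endian float32 elements.
func (nv *nullVector) Scan(src interface{}) error {
	switch b := src.(type) {
	case nil:
		nv.Values, nv.Valid = nil, false
		return nil
	case []byte:
		const headerSize = 8
		if len(b) < headerSize || (len(b)-headerSize)%4 != 0 {
			return fmt.Errorf("malformed vector payload of %d bytes", len(b))
		}
		vals := make([]float32, (len(b)-headerSize)/4)
		for i := range vals {
			vals[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[headerSize+4*i:]))
		}
		nv.Values, nv.Valid = vals, true
		return nil
	default:
		return fmt.Errorf("cannot scan %T into nullVector", src)
	}
}

func main() {
	var nv nullVector
	payload := []byte{0xA9, 0x01, 0x01, 0x00, 0x00, 0, 0, 0, 0x00, 0x00, 0x80, 0x3F}
	fmt.Println(nv.Scan(payload), nv.Valid, nv.Values)
}
```

Returning an error for unexpected source types, rather than panicking, lets `database/sql` report the failure from `Rows.Scan`.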

Testing

Unit Tests (go test ./...)

  • Vector encoding/decoding (float32 and float16)
  • Float16 conversion accuracy
  • JSON serialization/deserialization
  • NULL handling
  • Error cases (invalid data, unsupported types)
  • Maximum dimension limits
  • Type metadata functions
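
To make the float16 conversion item concrete, here is a sketch of decoding an IEEE 754 half-precision value into a float32. It follows the standard binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits) and is written for illustration; it is not the conversion routine from vector.go, and `halfToFloat32` is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// halfToFloat32 converts IEEE 754 binary16 bits to a float32.
func halfToFloat32(h uint16) float32 {
	sign := uint32(h>>15) & 1
	exp := uint32(h>>10) & 0x1f
	frac := uint32(h) & 0x3ff

	switch exp {
	case 0:
		// Zero or subnormal: value is frac * 2^-24, exact in float32.
		v := float32(frac) * float32(math.Pow(2, -24))
		if sign == 1 {
			v = -v
		}
		return v
	case 0x1f:
		// Infinity (frac == 0) or NaN (frac != 0).
		return math.Float32frombits(sign<<31 | 0xff<<23 | frac<<13)
	default:
		// Normal number: rebias the exponent from 15 to 127.
		return math.Float32frombits(sign<<31 | (exp+112)<<23 | frac<<13)
	}
}

func main() {
	for _, h := range []uint16{0x3C00, 0xC000, 0x7BFF, 0x0001} {
		fmt.Printf("%#04x -> %v\n", h, halfToFloat32(h))
	}
}
```

Every binary16 value is exactly representable as a float32, so this direction is lossless; accuracy tests matter mainly for the float32 to float16 direction, where rounding occurs.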

Integration Tests (requires SQL Server 2025)

  • Insert/select round-trips
  • NULL vector handling
  • Different dimension counts (1D to 500D)
  • Special floating-point values
  • VECTOR_DISTANCE similarity search
  • Column metadata verification
  • Batch operations
  • Float16 with PREVIEW_FEATURES

Test Coverage

  • New vector-specific tests: 50+ test functions
  • All existing tests continue to pass
  • Graceful skip on pre-2025 SQL Server instances
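
A version gate for the integration tests could look roughly like the sketch below. It assumes the test obtains a dotted version string such as the one returned by `SERVERPROPERTY('ProductVersion')` and relies on SQL Server 2025 reporting major version 17; `supportsVector` is a hypothetical helper, and the PR's own detection logic may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sqlServer2025Major is the major version number reported by SQL Server 2025.
const sqlServer2025Major = 17

// supportsVector reports whether a dotted product version string
// (for example "17.0.1000.7") is SQL Server 2025 or later.
func supportsVector(productVersion string) (bool, error) {
	majorText := strings.SplitN(productVersion, ".", 2)[0]
	major, err := strconv.Atoi(majorText)
	if err != nil {
		return false, fmt.Errorf("cannot parse product version %q: %v", productVersion, err)
	}
	return major >= sqlServer2025Major, nil
}

func main() {
	for _, v := range []string{"17.0.1000.7", "16.0.4105.2"} {
		ok, err := supportsVector(v)
		fmt.Println(v, ok, err)
	}
}
```

In a test, a false result would typically lead to `t.Skip(...)` rather than a failure, so the suite stays green against older servers.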

Requirements

  • SQL Server 2025 or later for VECTOR type support
  • go-mssqldb 1.9.7 or later
  • For float16: ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON

Migration Notes

  • Backward compatible: Default vectortypesupport=off uses JSON format
  • No breaking changes: Existing code continues to work unchanged
  • Opt-in optimization: Set vectortypesupport=v1 for binary format with SQL Server 2025+

Related Documentation

Acknowledgments

Thanks to @shueybubbles for the review feedback that improved:

  • Direct []float32 and []float64 parameter support
  • Thread-safe precision warning API

codecov-commenter commented Jan 14, 2026

Codecov Report

❌ Patch coverage is 61.45340% with 244 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.05%. Comparing base (c16a19e) to head (4d75a8c).
⚠️ Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| vector.go | 81.84% | 62 Missing and 11 partials ⚠️ |
| types.go | 34.54% | 72 Missing ⚠️ |
| mssql.go | 0.00% | 67 Missing ⚠️ |
| tds.go | 47.36% | 18 Missing and 2 partials ⚠️ |
| token.go | 0.00% | 8 Missing ⚠️ |
| mssql_go19.go | 50.00% | 4 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #307      +/-   ##
==========================================
+ Coverage   75.36%   77.05%   +1.68%     
==========================================
  Files          34       35       +1     
  Lines        6597     7213     +616     
==========================================
+ Hits         4972     5558     +586     
- Misses       1337     1364      +27     
- Partials      288      291       +3     

Copilot AI left a comment

Pull request overview

This PR adds native support for SQL Server 2025's VECTOR data type, enabling efficient storage and retrieval of vector embeddings for AI/ML workloads. The implementation includes Vector and NullVector types that implement standard database interfaces, support for both float32 and float16 element types, binary format encoding/decoding, JSON format transmission for parameterized queries, and comprehensive test coverage with graceful degradation for pre-2025 servers.

Changes:

  • Added Vector and NullVector types with driver.Valuer and sql.Scanner interface implementations
  • Implemented binary format encoding/decoding matching SQL Server's native TDS format
  • Added float16 conversion functions with IEEE 754 half-precision support
  • Extended types.go with typeVectorN constant and read/write functions for vector data handling
  • Updated parameter binding in mssql.go to transmit vectors as JSON strings for backward compatibility

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vector.go | Core Vector type implementation with encoding/decoding, float16 conversion, and helper functions |
| vector_test.go | Comprehensive unit tests for Vector type including encoding, decoding, and edge cases |
| vector_db_test.go | Database integration tests with SQL Server 2025+ version detection and preview feature handling |
| types.go | Added typeVectorN constant and TDS stream read/write functions for vector data |
| mssql.go | Extended makeParam to handle Vector/NullVector types via JSON string transmission |
| doc/how-to-use-vectors.md | Complete usage documentation with examples and best practices |
| README.md | Added Vector type to supported features list |
| CHANGELOG.md | Documented new feature in version 1.9.4 |

shueybubbles (Collaborator) commented Jan 14, 2026

@copilot What is the expected behavior if an app tries to scan a vector into a []float64, or to insert a []float64 into a vector column?

dlevy-msft-sql added the enhancement, Area - data types, and Size: S labels Jan 15, 2026
Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 9 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 6 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 7 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 21 changed files in this pull request and generated 3 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 21 changed files in this pull request and generated 3 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 21 changed files in this pull request and generated 3 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 21 changed files in this pull request and generated 3 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 21 changed files in this pull request and generated 2 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 21 changed files in this pull request and generated no new comments.

dlevy-msft-sql added the in-review, Priority: 2, Size: L, Size: XL, and Priority: 3 labels and removed the needs-work, Size: M, Size: L, and Priority: 2 labels Feb 2, 2026
dlevy-msft-sql changed the title from "FEATURE: SQL Server 2025 VECTOR type support" to "feat: add SQL Server 2025 VECTOR type support" Feb 5, 2026
Labels

Area - data types, enhancement, Priority: 3, Size: XL

3 participants