fix(framework): split `Output` TypeVar into `Output` and `Expected` by AbhiPrasad · Pull Request #243 · braintrustdata/braintrust-sdk-python

Abhijeet Prasad (AbhiPrasad) · 2026-04-09T17:24:18Z

resolves #240

The `Eval` generic `Output` parameter was shared across three positions: task return type, `EvalCase.expected`, and scorer args. When the expected data type differs from the task output (e.g. assertion specs vs. model output), type checkers reject the call because `Output` can't unify.

Introduce a separate `Expected` TypeVar so `data` binds `Expected` and `task` binds `Output` independently. Add a `test_types` nox session that runs pyright, mypy, and pytest on `py/src/braintrust/type_tests/`.
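A minimal sketch of the split described above, for illustration only. `run_eval`, `ContainsSpec`, `shout`, and `contains_scorer` are hypothetical stand-ins, and this `EvalCase` is a simplified local class, not the SDK's real definition or signatures. The point is that `expected` and the task's return value are typed by different TypeVars, so a scorer can compare a `str` output against an assertion spec without the type checker trying to unify the two.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, List, TypeVar

Input = TypeVar("Input")
Output = TypeVar("Output")
Expected = TypeVar("Expected")


@dataclass
class EvalCase(Generic[Input, Expected]):
    """Simplified stand-in: `expected` is typed by Expected, not Output."""

    input: Input
    expected: Expected


def run_eval(
    data: Iterable[EvalCase[Input, Expected]],
    task: Callable[[Input], Output],
    scorer: Callable[[Output, Expected], float],
) -> List[float]:
    """Hypothetical helper: data binds Expected, task binds Output."""
    return [scorer(task(case.input), case.expected) for case in data]


@dataclass
class ContainsSpec:
    """Assertion spec: the expected value is not the same type as the output."""

    substring: str


def shout(text: str) -> str:
    # The task returns a plain string (Output = str).
    return text.upper()


def contains_scorer(output: str, expected: ContainsSpec) -> float:
    # The scorer receives Output and Expected as two distinct types.
    return 1.0 if expected.substring in output else 0.0


cases = [
    EvalCase(input="hello world", expected=ContainsSpec("HELLO")),
    EvalCase(input="goodbye", expected=ContainsSpec("HELLO")),
]
scores = run_eval(cases, shout, contains_scorer)
```

If `EvalCase.expected` and the task's return type shared a single TypeVar, a checker would need one type that is both `str` and `ContainsSpec` for this call, which is the unification failure the description refers to.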


Pin to `pyright==1.1.408` and `mypy==1.20.0` to avoid flaky CI from upstream type checker releases introducing stricter checks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

`Evaluator` is now `Generic[Input, Output, Expected]` after the TypeVar split. Python 3.10 enforces generic param counts at runtime, so the 2-param `Evaluator[Any, Any]` in `server.py` caused a `TypeError` on import.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
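The failure mode in this commit message can be reproduced with a bare class. This is a sketch under the assumption that a plain `Generic` subclass behaves like the SDK's `Evaluator` for this purpose; the class below is a local stand-in, not the real one. The `typing` module checks the number of type arguments when a generic class is subscripted, so a two-argument subscript of a three-parameter class raises as soon as the expression is evaluated, including at module import time.

```python
from typing import Any, Generic, Optional, TypeVar

Input = TypeVar("Input")
Output = TypeVar("Output")
Expected = TypeVar("Expected")


class Evaluator(Generic[Input, Output, Expected]):
    """Local stand-in with three type parameters, as after the split."""


error: Optional[str] = None
try:
    # Old two-parameter spelling: evaluated eagerly, so it raises here.
    Evaluator[Any, Any]
except TypeError as exc:
    error = str(exc)

# Updated spelling with all three parameters subscripts cleanly.
alias = Evaluator[Any, Any, Any]
```

Because the subscript is an ordinary runtime expression, a stale annotation or alias like this fails on import rather than only under a type checker.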


Abhijeet Prasad (AbhiPrasad) self-assigned this Apr 9, 2026

Andrew Kent (realark) approved these changes Apr 9, 2026


Nova (SFK) and others added 2 commits April 9, 2026 22:48

fix(noxfile): pin pyright and mypy versions in test_types session

dbe6a10


Abhijeet Prasad (AbhiPrasad) force-pushed the fix/eval-expected-typevar branch from 9747351 to dbe6a10 Compare April 9, 2026 23:03

Abhijeet Prasad (AbhiPrasad) force-pushed the fix/eval-expected-typevar branch from 342a58f to e619a9f Compare April 10, 2026 00:24

Abhijeet Prasad (AbhiPrasad) merged commit f2412f0 into main Apr 10, 2026
59 checks passed

Abhijeet Prasad (AbhiPrasad) deleted the fix/eval-expected-typevar branch April 10, 2026 01:09

Abhijeet Prasad (AbhiPrasad) mentioned this pull request Apr 10, 2026

chore: Bump version to 0.14.0 #251

Merged

Abhijeet Prasad (AbhiPrasad) added a commit that referenced this pull request Apr 10, 2026

chore: Bump version to 0.14.0 (#251)

3b34248

Includes - #227 - #234 - #225 - #232 - #238 - #242 - #237 - #241 - #243 - #245

Abhijeet Prasad (AbhiPrasad) merged 3 commits into main from fix/eval-expected-typevar

2 participants
