feat(ci): dynamic test duration tracking for pytest-split #1151

Open
vringar wants to merge 3 commits into master from dynamic-test-durations

vringar (Contributor) commented Mar 2, 2026

Summary

  • Add pytest hooks in test/conftest.py to record per-test wall-clock durations, gated on GROUP env var (CI only)
  • Update scripts/ci.sh to prefer cached durations over the committed .test_durations baseline
  • Add cache restore/save steps and a consolidate-durations job to .github/workflows/run-tests.yaml that merges per-group artifacts into a single cached file

How it works

Per-run data flow:

Restore cache → ci.sh picks cached or committed file → pytest-split distributes tests
    → conftest hook records durations → upload per-group artifacts
    → consolidation job merges all groups → save to cache
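The consolidation step at the end of this flow could look roughly like the sketch below. It assumes each per-group file is a flat JSON object mapping node IDs to seconds (the same shape as .test_durations); the function name `merge_group_durations` and the directory layout are hypothetical, not taken from the PR.

```python
import json
from pathlib import Path


def merge_group_durations(artifact_dir: Path, output_path: Path) -> dict[str, float]:
    """Merge every per-group durations file under artifact_dir into one JSON file.

    Hypothetical helper: assumes each file is a flat {nodeid: seconds} mapping.
    """
    merged: dict[str, float] = {}
    # sorted() keeps the merge order deterministic; a node ID that appears in
    # more than one group file takes the value from the last file merged.
    for group_file in sorted(artifact_dir.rglob(".test_durations_group_*")):
        merged.update(json.loads(group_file.read_text()))
    # Sorted keys keep the cached file stable from run to run.
    output_path.write_text(json.dumps(merged, indent=4, sort_keys=True))
    return merged
```

Since pytest-split assigns each test to exactly one group, overlapping keys should be rare, so a plain dictionary update is probably sufficient here.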

Fallback chain (priority order):

  1. Cached durations from the most recent run on the same branch
  2. Cached durations from the most recent run on master
  3. Committed .test_durations file (unchanged, serves as baseline)
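On the file level, this chain reduces to "use the restored cache file if there is one, else the committed baseline". The first two steps are presumably resolved by the cache restore itself, before the script runs. The sketch below expresses that choice in Python for illustration only (the PR implements it in scripts/ci.sh); `pick_durations_file` is a hypothetical name, and the non-empty check is an added assumption.

```python
from pathlib import Path


def pick_durations_file(repo_root: Path) -> Path:
    """Return the durations file to hand to pytest-split via --durations-path."""
    cached = repo_root / ".test_durations_cached"
    baseline = repo_root / ".test_durations"
    # A restored cache (same branch first, then master) wins when present.
    # Treating an empty file as a cache miss is an assumption of this sketch.
    if cached.is_file() and cached.stat().st_size > 0:
        return cached
    # Otherwise fall back to the committed baseline.
    return baseline
```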

Test plan

  • Run pytest test/test_dataclass_validations.py -v with GROUP=1 — verified .test_durations_group_1 is created with correct node IDs matching .test_durations format
  • Verify CI workflow: cache restore (miss on first run), artifact upload, consolidation, cache save
  • Push again to verify cache hit and that splitting uses cached durations

Record per-test durations from every CI run and cache them so
subsequent runs use the freshest data for splitting. The committed
.test_durations remains as a fallback when no cache exists.

- Add pytest hooks in test/conftest.py to record durations per group
- Update scripts/ci.sh to prefer cached durations over committed baseline
- Add cache restore, artifact upload, and consolidate-durations job
  to the GitHub Actions workflow
Copilot AI review requested due to automatic review settings March 2, 2026 23:17
Copilot AI left a comment

Pull request overview

This PR adds CI-driven feedback loops for pytest-split by collecting per-test durations during CI runs, consolidating them across split groups, and caching the merged result so future CI runs can split tests using the most recent timings.

Changes:

  • Add pytest session hooks to record durations into per-group .test_durations_group_* files (CI-only via GROUP).
  • Update scripts/ci.sh to prefer a cached durations file (.test_durations_cached) over the committed baseline (.test_durations).
  • Extend the GitHub Actions workflow to restore/save the cache and add a consolidation job that merges per-group artifacts into the cached durations file.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File                              Description
test/conftest.py                  Adds pytest hooks to collect and write per-test duration JSON for the current CI split group.
scripts/ci.sh                     Uses a cached durations file when available and passes it to pytest-split via --durations-path.
.github/workflows/run-tests.yaml  Restores durations cache for test jobs, uploads per-group duration artifacts, and consolidates/saves merged durations back to cache.

Comment on lines +20 to +22
def pytest_runtest_logreport(report):
    if report.when == "call":
        _test_durations[report.nodeid] = round(report.duration, 2)
Copilot AI Mar 2, 2026


pytest_runtest_logreport records durations for every local pytest run, even though the PR description says this should be gated on GROUP (CI-only). Consider checking os.environ.get("GROUP") (or a config flag) before updating _test_durations so local runs don't incur the hook overhead or generate unexpected in-memory state.
Comment on lines +20 to +22
def pytest_runtest_logreport(report):
    if report.when == "call":
        _test_durations[report.nodeid] = round(report.duration, 2)
Copilot AI Mar 2, 2026


The durations you persist are only for the call phase (report.when == "call"), but the PR description calls these "per-test wall-clock durations". For tests with expensive fixture setup/teardown (or skips/errors during setup), this will significantly undercount and can lead to poorer splitting; consider accumulating setup + call + teardown durations per nodeid (or otherwise clarifying/aligning what "wall-clock" means here).
codecov bot commented Mar 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.75%. Comparing base (12fae22) to head (4a6ef17).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1151   +/-   ##
=======================================
  Coverage   56.75%   56.75%           
=======================================
  Files          40       40           
  Lines        3908     3908           
=======================================
  Hits         2218     2218           
  Misses       1690     1690           

vringar added 2 commits March 3, 2026 13:28
upload-artifact@v4 defaults to include-hidden-files: false,
which silently excludes the .test_durations_group_* files,
so the upload step now sets include-hidden-files: true.
Also removes debug logging from conftest.
2 participants