feat(ci): dynamic test duration tracking for pytest-split #1151
Record per-test durations from every CI run and cache them so subsequent runs use the freshest data for splitting. The committed `.test_durations` file remains as a fallback when no cache exists.

- Add pytest hooks in `test/conftest.py` to record durations per group
- Update `scripts/ci.sh` to prefer cached durations over the committed baseline
- Add cache restore, artifact upload, and a `consolidate-durations` job to the GitHub Actions workflow
Pull request overview
This PR adds CI-driven feedback loops for pytest-split by collecting per-test durations during CI runs, consolidating them across split groups, and caching the merged result so future CI runs can split tests using the most recent timings.
Changes:
- Add pytest session hooks to record durations into per-group `.test_durations_group_*` files (CI-only via `GROUP`).
- Update `scripts/ci.sh` to prefer a cached durations file (`.test_durations_cached`) over the committed baseline (`.test_durations`).
- Extend the GitHub Actions workflow to restore/save the cache and add a consolidation job that merges per-group artifacts into the cached durations file.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `test/conftest.py` | Adds pytest hooks to collect and write per-test duration JSON for the current CI split group. |
| `scripts/ci.sh` | Uses a cached durations file when available and passes it to pytest-split via `--durations-path`. |
| `.github/workflows/run-tests.yaml` | Restores durations cache for test jobs, uploads per-group duration artifacts, and consolidates/saves merged durations back to cache. |
```python
def pytest_runtest_logreport(report):
    if report.when == "call":
        _test_durations[report.nodeid] = round(report.duration, 2)
```
`pytest_runtest_logreport` records durations for every local pytest run, even though the PR description says this should be gated on `GROUP` (CI-only). Consider checking `os.environ.get("GROUP")` (or a config flag) before updating `_test_durations` so local runs don't incur the hook overhead or generate unexpected in-memory state.
```python
def pytest_runtest_logreport(report):
    if report.when == "call":
        _test_durations[report.nodeid] = round(report.duration, 2)
```
The durations you persist cover only the call phase (`report.when == "call"`), but the PR description calls these "per-test wall-clock durations". For tests with expensive fixture setup/teardown (or skips/errors during setup), this will significantly undercount and can lead to poorer splitting; consider accumulating setup + call + teardown durations per nodeid (or otherwise clarifying/aligning what "wall-clock" means here).
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##           master    #1151   +/-   ##
=======================================
  Coverage   56.75%   56.75%
=======================================
  Files          40       40
  Lines        3908     3908
=======================================
  Hits         2218     2218
  Misses       1690     1690
```

☔ View full report in Codecov by Sentry.
`upload-artifact@v4` defaults to `include-hidden-files: false`, which silently excludes the `.test_durations_group_*` files. Also removes debug logging from conftest.
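The fix is a one-line addition to the upload step. A sketch of the relevant workflow fragment (the step name, artifact name, and `matrix.group` variable are illustrative, not taken from the PR; `include-hidden-files` is a real `actions/upload-artifact@v4` input):

```yaml
# Hypothetical upload step; dotfiles are skipped unless
# include-hidden-files is explicitly enabled.
- name: Upload group durations
  uses: actions/upload-artifact@v4
  with:
    name: test-durations-group-${{ matrix.group }}
    path: .test_durations_group_*
    include-hidden-files: true
```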
Summary
- Add pytest hooks in `test/conftest.py` to record per-test wall-clock durations, gated on the `GROUP` env var (CI only)
- Update `scripts/ci.sh` to prefer cached durations over the committed `.test_durations` baseline
- Add a `consolidate-durations` job to `.github/workflows/run-tests.yaml` that merges per-group artifacts into a single cached file

How it works
Per-run data flow:
Fallback chain (priority order):
1. Cached durations file (`.test_durations_cached`) from the most recent CI run
2. Committed `.test_durations` file on `master` (unchanged, serves as baseline)

Test plan
- Ran `pytest test/test_dataclass_validations.py -v` with `GROUP=1` and verified `.test_durations_group_1` is created with node IDs matching the `.test_durations` format