Parallelize the unit and integration tests #520

Draft
kmontemayor2-sc wants to merge 13 commits into main from kmonte/parallel-unit-tests

Conversation

@kmontemayor2-sc (Collaborator) commented Feb 27, 2026

Scope of work done

Summary

Adds parallel test sharding to CI, splitting Python unit tests and integration tests across 4 Cloud Build shards running concurrently. Slow
test modules are explicitly pinned to specific shards via ordered tuples, preventing hash-collision hotspots. Unpinned modules are assigned via
SHA-256 hash.

  • New run-sharded-cloud-build GitHub Action that launches N parallel Cloud Build jobs (one per shard), streams logs with [shard_N] prefixes,
    and reports pass/fail with build URLs
  • Sharding infrastructure in test_utils.py: _get_shard_for_module() (position-based for pinned, SHA-256 for others), _filter_tests_by_shard(),
    new CLI args --shard_index/--total_shards
  • Pinned module constants in tests/unit/main.py (5 modules, ~88% of runtime) and tests/integration/main.py (8 modules, ~96% of runtime), with
    measured durations as comments
  • CI workflow updates: on-pr-comment.yml and on-pr-merge.yml now use sharded execution with 4 shards, timeout reduced from 120 to 60 minutes
  • Comprehensive tests in test_sharding_test.py: basic sharding (coverage, no-overlap, determinism), pinning behavior, and real-discovery tests
    that scan tests/unit/ to verify sharded == unsharded
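
The assignment scheme described above (position-based for pinned modules, SHA-256 hashing for the rest) can be sketched roughly as follows. This is an illustrative sketch, not the actual `_get_shard_for_module()` / `_filter_tests_by_shard()` in `test_utils.py`; the module names are made up:

```python
import hashlib

# Illustrative stand-ins for the pinned-module tuples in tests/unit/main.py;
# tuple position (mod total shards) fixes the shard for known-slow modules.
PINNED_MODULES = ("tests.unit.slow_alpha_test", "tests.unit.slow_beta_test")

def get_shard_for_module(module: str, total_shards: int,
                         pinned: tuple = PINNED_MODULES) -> int:
    """Deterministically map a test module to a shard index."""
    if module in pinned:
        # Pinned modules are spread round-robin by tuple position,
        # so slow modules never collide onto one shard by hash accident.
        return pinned.index(module) % total_shards
    # Everything else hashes with SHA-256, so every worker agrees on the
    # assignment without any coordination.
    digest = hashlib.sha256(module.encode("utf-8")).hexdigest()
    return int(digest, 16) % total_shards

def filter_tests_by_shard(modules, shard_index: int, total_shards: int):
    """Keep only the modules that belong to this shard."""
    return [m for m in modules
            if get_shard_for_module(m, total_shards) == shard_index]
```

Because the assignment depends only on the module name, every shard computes the same partition independently; running the filter for shard indices 0..3 covers each module exactly once, which is what the coverage/no-overlap/determinism tests check.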

As a follow-up, we should probably port the inlined bash script to Python, but that would require adding new CI-only Python dependencies, which seems like a separate change.

NOTE: I tested this on a slightly different on: push path. I'll verify this change after it merges, and if the CI setup breaks I can revert.

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

@kmontemayor2-sc (Collaborator, Author) commented:

/unit_test_py

@kmontemayor2-sc (Collaborator, Author) commented:

/integration_test


github-actions bot commented Feb 27, 2026

GiGL Automation

@ 03:39:24UTC : 🔄 Python Unit Test started.

@ 04:55:30UTC : ✅ Workflow completed successfully.


github-actions bot commented Feb 27, 2026

GiGL Automation

@ 03:39:30UTC : 🔄 Integration Test started.

@ 05:02:55UTC : ✅ Workflow completed successfully.

@kmontemayor2-sc changed the title from "Try to parallelize the unit tests" to "Parallelize the unit and integration tests" on Feb 27, 2026
@svij-sc (Collaborator) left a comment


Have we tried ways to parallelize the tests on a single machine (e.g. https://github.com/mtreinish/stestr) or various pytest add-ons?

This adds a lot of complexity :\ - want to be sure we have considered alternatives.

@kmontemayor2-sc (Collaborator, Author) commented:

> Have we tried ways to parallelize the tests on a single machine https://github.com/mtreinish/stestr ?
> or various pytest add-ons?

So we actually already have some single-machine parallelization, but last I checked it just doesn't allow our tests to run properly.

We also have a bigger problem: the runners we use for CI/CD are on the small side and are prone to OOMing if the process tree gets too big (this happened to me quite a bit with the graph_store_integration_test), so even if we enable single-machine parallelization it wouldn't really help us in CI/CD.

For local runs, I don't think parallel tests are that useful either, as debugging the parallel test traces is quite difficult, and in practice I don't usually run all of the tests locally, just the ones pertinent to my changes.

Then I'd push to GitHub, trigger /unit_test_py, and run all of the tests to make sure there aren't unexpected breakages.

This change supports that last step. I agree it isn't strictly necessary, but if we want parallel tests in CI/CD I don't really see another way forward, unless we get custom bigger runners, and even then debugging will be difficult.


github-actions bot commented Mar 3, 2026

GiGL Automation

@ 16:25:50UTC : 🔄 Python Unit Test started.

@ 17:40:29UTC : ✅ Workflow completed successfully.


svij-sc commented Mar 4, 2026

> As a followup, we should probably move that inlined bash script to be python but then we'd need to add some new ci python deps that we just install for ci purposes and that seems like a different change.

What are the deps needed here?

FWIW, we already do this:

```yaml
- name: Setup Python
  uses: actions/setup-python@v4
  with:
    python-version-file: ".python-version"
- name: Install PyYAML
  run: pip install PyYAML
- name: Generate help text
  id: parse_commands
  run: python .github/scripts/get_help_text.py
```

But preferred would be to simply add a dependency on setup-python-and-tools, which installs Python and uv for you. Subsequently you can just call:

```shell
uv run --with some_dep==@sha..123123 yourscript.py
```

This would be a much cleaner approach - let's do it now?

@svij-sc (Collaborator) left a comment


The code is heavy to read and parse right now. There's an opportunity to greatly simplify this using the "matrix strategy" execution support built into GitHub Actions:

```yaml
name: Sharded Unit Tests

on:
  workflow_dispatch:
    inputs:
      shards:
        description: "Number of shards"
        required: true
        default: "5"
        type: number

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4, 5]
    steps:
      - uses: actions/checkout@v4
      - name: Run unit tests
        run: make unit_test SHARD="${{ matrix.shard }}"

```

This just means unit_test will show 5 different "steps" in the execution flow, so as users we don't lose much when debugging which shard failed. Also, we can get rid of a lot of code here that might otherwise become hard to manage.
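
On the test-runner side, the --shard_index/--total_shards CLI args mentioned in the summary would pair naturally with a matrix job like the one sketched here. A minimal hypothetical sketch of the flag parsing (not the actual tests/unit/main.py), assuming argparse-style flags:

```python
import argparse

def parse_shard_args(argv=None) -> argparse.Namespace:
    """Parse the sharding flags a CI matrix job would pass down."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--shard_index", type=int, default=0,
                        help="0-based index of this shard")
    parser.add_argument("--total_shards", type=int, default=1,
                        help="total number of parallel shards")
    args = parser.parse_args(argv)
    # Reject out-of-range combinations up front so a misconfigured
    # matrix job fails fast instead of silently running zero tests.
    if not 0 <= args.shard_index < args.total_shards:
        parser.error("--shard_index must be in [0, --total_shards)")
    return args
```

Note that matrix shard numbers in a YAML sketch like the one above are typically 1-based, so the workflow step would subtract one (or the flags could be defined to accept 1-based indices instead).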
