Parallelize the unit and integration tests #520

Draft
kmontemayor2-sc wants to merge 13 commits into main from kmonte/parallel-unit-tests

Conversation

@kmontemayor2-sc (Collaborator) commented Feb 27, 2026

Scope of work done

Summary

Adds parallel test sharding to CI, splitting Python unit tests and integration tests across 4 Cloud Build shards running concurrently. Slow
test modules are explicitly pinned to specific shards via ordered tuples, preventing hash-collision hotspots. Unpinned modules are assigned via
SHA-256 hash.

  • New run-sharded-cloud-build GitHub Action that launches N parallel Cloud Build jobs (one per shard), streams logs with [shard_N] prefixes,
    and reports pass/fail with build URLs
  • Sharding infrastructure in test_utils.py: _get_shard_for_module() (position-based for pinned, SHA-256 for others), _filter_tests_by_shard(),
    new CLI args --shard_index/--total_shards
  • Pinned module constants in tests/unit/main.py (5 modules, ~88% of runtime) and tests/integration/main.py (8 modules, ~96% of runtime), with
    measured durations as comments
  • CI workflow updates: on-pr-comment.yml and on-pr-merge.yml now use sharded execution with 4 shards, timeout reduced from 120 to 60 minutes
  • Comprehensive tests in test_sharding_test.py: basic sharding (coverage, no-overlap, determinism), pinning behavior, and real-discovery tests
    that scan tests/unit/ to verify sharded == unsharded
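
The assignment scheme described above (position-based for pinned modules, SHA-256 hashing for the rest) can be sketched roughly as follows. This is an illustrative sketch, not the actual `_get_shard_for_module()` / `_filter_tests_by_shard()` in `test_utils.py`; the module names are made up:

```python
import hashlib

# Illustrative stand-ins for the pinned-module tuples in tests/unit/main.py;
# tuple position (mod total shards) fixes the shard for known-slow modules.
PINNED_MODULES = ("tests.unit.slow_alpha_test", "tests.unit.slow_beta_test")

def get_shard_for_module(module: str, total_shards: int,
                         pinned: tuple = PINNED_MODULES) -> int:
    """Deterministically map a test module to a shard index."""
    if module in pinned:
        # Pinned modules are spread round-robin by tuple position,
        # so slow modules never collide onto one shard by hash accident.
        return pinned.index(module) % total_shards
    # Everything else hashes with SHA-256, so every worker agrees on the
    # assignment without any coordination.
    digest = hashlib.sha256(module.encode("utf-8")).hexdigest()
    return int(digest, 16) % total_shards

def filter_tests_by_shard(modules, shard_index: int, total_shards: int):
    """Keep only the modules that belong to this shard."""
    return [m for m in modules
            if get_shard_for_module(m, total_shards) == shard_index]
```

Because the assignment depends only on the module name, every shard computes the same partition independently; running the filter for shard indices 0..3 covers each module exactly once, which is what the coverage/no-overlap/determinism tests check.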

As a follow-up, we should probably port the inlined bash script to Python, but that would require adding new CI-only Python dependencies, which seems like a separate change.

NOTE: I tested this on a slightly different on: push path. I'll verify this change after it merges, and if the CI setup breaks I can revert.

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

@kmontemayor2-sc (Collaborator, Author) commented:

/unit_test_py

@kmontemayor2-sc (Collaborator, Author) commented:

/integration_test


github-actions bot commented Feb 27, 2026

GiGL Automation

@ 03:39:24UTC : 🔄 Python Unit Test started.

@ 04:55:30UTC : ✅ Workflow completed successfully.


github-actions bot commented Feb 27, 2026

GiGL Automation

@ 03:39:30UTC : 🔄 Integration Test started.

@ 05:02:55UTC : ✅ Workflow completed successfully.

@kmontemayor2-sc changed the title from "Try to parallelize the unit tests" to "Parallelize the unit and integration tests" on Feb 27, 2026
@svij-sc (Collaborator) left a comment


Have we tried ways to parallelize the tests on a single machine (e.g. https://github.com/mtreinish/stestr) or various pytest add-ons?

This adds a lot of complexity :\ - want to be sure we have considered alternatives.

@kmontemayor2-sc (Collaborator, Author) commented:

> Have we tried ways to parallelize the tests on a single machine https://github.com/mtreinish/stestr ?
> or various pytest add-ons?

So we actually already have some single-machine parallelization, but last I checked it just doesn't allow our tests to run properly.

We also have a bigger problem: the runners we use for CI/CD are on the small side and are prone to OOMing if the process tree gets too big (this happened to me quite a bit with the graph_store_integration_test), so even if we enable single-machine parallelization it wouldn't really help us in CI/CD.

For local runs, I don't think parallel tests are that useful either, as debugging the parallel test traces is quite difficult, and in practice I don't usually run all of the tests locally, just the ones pertinent to my changes.

Then I'd push to GitHub, trigger /unit_test_py, and run all of the tests to make sure there aren't unexpected breakages.

This change supports that last step. I agree it isn't strictly necessary, but if we want parallel tests in CI/CD I don't really see another way forward, unless we get custom bigger runners, and even then debugging will be difficult.


github-actions bot commented Mar 3, 2026

GiGL Automation

@ 16:25:50UTC : 🔄 Python Unit Test started.

@ 17:40:29UTC : ✅ Workflow completed successfully.


svij-sc commented Mar 4, 2026

> As a followup, we should probably move that inlined bash script to be python but then we'd need to add some new ci python deps that we just install for ci purposes and that seems like a different change.

What are the deps needed here?

FWIW, we already do this:

```yaml
- name: Setup Python
  uses: actions/setup-python@v4
  with:
    python-version-file: ".python-version"
- name: Install PyYAML
  run: pip install PyYAML
- name: Generate help text
  id: parse_commands
  run: python .github/scripts/get_help_text.py
```

But preferred would be to simply add a dependency on setup-python-and-tools, which installs Python and uv for you. Subsequently you can just call:

```shell
uv run --with some_dep==@sha..123123 yourscript.py
```

This would be a much cleaner approach - let's do it now?

@svij-sc (Collaborator) left a comment


The code is heavy to read and parse right now. There's an opportunity to greatly simplify this using the "matrix strategy" execution support built into GitHub Actions:

```yaml
name: Sharded Unit Tests

on:
  workflow_dispatch:
    inputs:
      shards:
        description: "Number of shards"
        required: true
        default: "5"
        type: number

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4, 5]
    steps:
      - uses: actions/checkout@v4
      - name: Run unit tests
        run: make unit_test SHARD="${{ matrix.shard }}"

```

This just means unit_test will show 5 different "steps" in the execution flow, so as users we don't lose much when debugging which shard failed. Also, we can get rid of a lot of code here that might otherwise become hard to manage.
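
On the test-runner side, the --shard_index/--total_shards CLI args mentioned in the summary would pair naturally with a matrix job like the one sketched here. A minimal hypothetical sketch of the flag parsing (not the actual tests/unit/main.py), assuming argparse-style flags:

```python
import argparse

def parse_shard_args(argv=None) -> argparse.Namespace:
    """Parse the sharding flags a CI matrix job would pass down."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--shard_index", type=int, default=0,
                        help="0-based index of this shard")
    parser.add_argument("--total_shards", type=int, default=1,
                        help="total number of parallel shards")
    args = parser.parse_args(argv)
    # Reject out-of-range combinations up front so a misconfigured
    # matrix job fails fast instead of silently running zero tests.
    if not 0 <= args.shard_index < args.total_shards:
        parser.error("--shard_index must be in [0, --total_shards)")
    return args
```

Note that matrix shard numbers in a YAML sketch like the one above are typically 1-based, so the workflow step would subtract one (or the flags could be defined to accept 1-based indices instead).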
