Fix in-sample arm pool exhaustion from FAILED LILO labeling trials (#5145)

Closed
ItsMrLin wants to merge 1 commit into facebook:main from ItsMrLin:export-D99611303

ItsMrLin commented Apr 5, 2026

Summary:

`InSampleUniformGenerator` selects existing arms for LILO labeling by
drawing from the `generated_points` pool constructed in
`RandomAdapter._gen()`. This pool started from
`arms_by_signature_for_deduplication`, which excludes arm signatures
from any FAILED trial.

Because LILO labeling trials borrow arms from regular BO trials (same
signatures), a FAILED labeling trial incorrectly removes the original
arm from the selection pool — even though it still exists in a
non-FAILED trial. Within a single LILO labeling loop run, failed
iterations accumulate and progressively poison the pool until no arms
remain, crashing with:

    ValueError: Cannot select 2 arms: only 0 eligible arms available
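
The accumulation described above can be reproduced with a toy model. This is a minimal sketch, not Ax code: `ToyTrial`, `dedup_style_pool`, `select_in_sample`, and `run_labeling_loop` are hypothetical names, and the pool rule (drop any signature that appears in a FAILED trial) is modeled only from the description in this summary. It also assumes the worst case, in which every labeling iteration fails.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTrial:
    """Hypothetical stand-in for a trial: a status and the arm signatures it holds."""
    status: str
    signatures: tuple


def dedup_style_pool(trials):
    """Toy version of the pre-fix pool: every known signature, minus any
    signature that appears in at least one FAILED trial."""
    all_sigs = {s for t in trials for s in t.signatures}
    failed_sigs = {s for t in trials if t.status == "FAILED" for s in t.signatures}
    return sorted(all_sigs - failed_sigs)


def select_in_sample(pool, n, rng):
    """Pick n distinct signatures uniformly at random from the pool."""
    if len(pool) < n:
        raise ValueError(
            f"Cannot select {n} arms: only {len(pool)} eligible arms available"
        )
    return rng.sample(pool, n)


def run_labeling_loop(trials, n, iterations, rng):
    """Worst case for illustration: each labeling trial borrows n arms, then FAILS.
    Returns the pool size seen at the start of each iteration."""
    pool_sizes = []
    for _ in range(iterations):
        pool = dedup_style_pool(trials)
        pool_sizes.append(len(pool))
        borrowed = select_in_sample(pool, n, rng)
        # The labeling trial reuses the borrowed signatures and is marked FAILED.
        trials.append(ToyTrial("FAILED", tuple(borrowed)))
    return pool_sizes


if __name__ == "__main__":
    # One regular BO trial with four arms; none of its arms ever failed there.
    trials = [ToyTrial("COMPLETED", ("a", "b", "c", "d"))]
    try:
        run_labeling_loop(trials, n=2, iterations=5, rng=random.Random(0))
    except ValueError as err:
        print(err)
```

With four arms and two borrowed per iteration, the pool shrinks from 4 to 2 to 0, so the third iteration raises the error even though all four arms still live in the COMPLETED trial.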

The fix: for in-sample generators, start from `arms_by_signature`
(all arms) instead of `arms_by_signature_for_deduplication`. The
existing `expecting_sigs` filter already handles the real restriction
(only data-expecting, non-abandoned arms), so the FAILED-arm exclusion
was just an accidental side effect of piggybacking on the dedup
infrastructure.
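
To illustrate the change in starting pool, here is a minimal sketch in plain Python. It is a toy model, not the code in `RandomAdapter._gen()`: `ToyTrial`, `toy_expecting_sigs`, and `in_sample_pool` are hypothetical, and treating RUNNING and COMPLETED trials as the data-expecting ones is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTrial:
    """Hypothetical stand-in for a trial: a status and the arm signatures it holds."""
    status: str
    signatures: tuple


# Assumption for this sketch: these statuses count as "expecting data".
EXPECTING_STATUSES = {"RUNNING", "COMPLETED"}


def toy_expecting_sigs(trials):
    """Signatures that appear in at least one data-expecting trial."""
    return {
        s for t in trials if t.status in EXPECTING_STATUSES for s in t.signatures
    }


def in_sample_pool(trials, start_from_all_arms):
    """Build the selection pool, then apply the expecting-signature filter.

    start_from_all_arms=False mimics the old starting point (FAILED signatures
    dropped first); True mimics the new one (all arms).
    """
    all_sigs = {s for t in trials for s in t.signatures}
    if start_from_all_arms:
        start = all_sigs
    else:
        failed = {s for t in trials if t.status == "FAILED" for s in t.signatures}
        start = all_sigs - failed
    return sorted(start & toy_expecting_sigs(trials))


trials = [
    ToyTrial("COMPLETED", ("a", "b")),  # regular BO trial
    ToyTrial("ABANDONED", ("c",)),      # not data-expecting
    ToyTrial("FAILED", ("a", "b")),     # labeling trial that borrowed a and b
]

print(in_sample_pool(trials, start_from_all_arms=False))  # []
print(in_sample_pool(trials, start_from_all_arms=True))   # ['a', 'b']
```

In this toy setup the abandoned arm `c` is excluded in both cases, because the expecting filter alone handles it. Only the arms wrongly dropped by the FAILED labeling trial come back.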

Reviewed By: bletham

Differential Revision: D99611303

meta-cla bot added the CLA Signed label Apr 5, 2026

meta-codesync bot commented Apr 5, 2026

@ItsMrLin has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99611303.

codecov-commenter commented Apr 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.39%. Comparing base (53be145) to head (d330824).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5145   +/-   ##
=======================================
  Coverage   96.39%   96.39%           
=======================================
  Files         613      613           
  Lines       68295    68321   +26     
=======================================
+ Hits        65830    65860   +30     
+ Misses       2465     2461    -4     

meta-codesync bot changed the title from "Fix in-sample arm pool exhaustion from FAILED LILO labeling trials" to "Fix in-sample arm pool exhaustion from FAILED LILO labeling trials (#5145)" Apr 6, 2026
ItsMrLin added a commit to ItsMrLin/Ax that referenced this pull request Apr 6, 2026
…acebook#5145)

ItsMrLin added a commit to ItsMrLin/Ax that referenced this pull request Apr 6, 2026
…acebook#5145)

Summary:
Pull Request resolved: facebook#5145

ItsMrLin added a commit to ItsMrLin/Ax that referenced this pull request Apr 6, 2026
…acebook#5145)

ItsMrLin added a commit to ItsMrLin/Ax that referenced this pull request Apr 6, 2026
…acebook#5145)

…acebook#5145)

meta-codesync bot commented Apr 6, 2026

This pull request has been merged in e942453.

Labels

CLA Signed, fb-exported, Merged, meta-exported
