Filter by MetricAvailability in get_best_raw_objective_point_with_trial_index#5141
Open
saitcakmak wants to merge 2 commits intofacebook:mainfrom
Open
Filter by MetricAvailability in get_best_raw_objective_point_with_trial_index#5141saitcakmak wants to merge 2 commits intofacebook:mainfrom
saitcakmak wants to merge 2 commits intofacebook:mainfrom
Conversation
…ability (facebook#5140) Summary: Goal: Eliminate assumptions around COMPLETED trials always having all metrics available. This is part of the code that would've raised errors if it encountered missing metrics in COMPLETED trials. Changing the return type here seemed like a better move than using carry-forward logic, and fits better with the logic that skips status quo as well (which may lead to confusing indexing when skipping status quo). How does this new return type fit with `BatchTrial`: `get_trace` takes the best arm out of each batch, so it returns (and did so before this change) one result per trial. Nothing changed. Changes (claude summary): `get_trace` previously returned `list[float]` with positional indexing, which forced callers to fabricate sequential trial indices. This change: 1. Changes the return type to `dict[int, float]` mapping trial_index to performance value, so callers get real trial indices. 2. Adds `MetricAvailability` filtering to exclude trials with incomplete metric data before pivoting. This prevents the `ValueError("Some metrics are not present for all trials and arms")` that `_pivot_data_with_feasibility` raises when a completed trial is missing any metric (e.g., partial fetches, fetch failures, metrics added mid-experiment). 3. Removes the carry-forward expansion loop for abandoned/failed trials. These trials are now simply excluded from the returned dict. Callers updated: - `UtilityProgressionAnalysis` now shows real trial indices on x-axis - `pareto_frontier.hypervolume_trace_plot` uses real trial indices - `benchmark.py` extracts values from dict Differential Revision: D99448053
…al_index Summary: Goal: Eliminate assumptions around COMPLETED trials always having all metrics available. This is part of the code that would've raised errors if it encountered missing metrics in COMPLETED trials. `get_best_raw_objective_point_with_trial_index` filters to COMPLETED trials but assumes all metrics in the optimization config are present in the data. This can fail with a hard `ValueError` from `_pivot_data_with_feasibility` when a completed trial has incomplete metric data (e.g., due to partial metric fetches, fetch failures, or metrics added mid-experiment). This diff adds `MetricAvailability` filtering after the completed-trials filter to exclude trials with incomplete data before they reach the pivot step. Trials without complete metric data are now silently excluded rather than causing a crash, with a clear error if no trials remain. Differential Revision: D99451732
|
@saitcakmak has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99451732. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5141 +/- ##
=======================================
Coverage 96.40% 96.40%
=======================================
Files 613 613
Lines 68191 68200 +9
=======================================
+ Hits 65740 65749 +9
Misses 2451 2451 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary:
Goal: Eliminate assumptions around COMPLETED trials always having all metrics available. This is part of the code that would've raised errors if it encountered missing metrics in COMPLETED trials.
get_best_raw_objective_point_with_trial_indexfilters to COMPLETED trials but assumes all metrics in the optimization config are present in the data. This can fail with a hardValueErrorfrom_pivot_data_with_feasibilitywhen a completed trial has incomplete metric data (e.g., due to partial metric fetches, fetch failures, or metrics added mid-experiment).This diff adds
MetricAvailabilityfiltering after the completed-trials filter to exclude trials with incomplete data before they reach the pivot step. Trials without complete metric data are now silently excluded rather than causing a crash, with a clear error if no trials remain.Differential Revision: D99451732