Skip to content

Filter by MetricAvailability in get_best_raw_objective_point_with_trial_index#5141

Open
saitcakmak wants to merge 2 commits intofacebook:mainfrom
saitcakmak:export-D99451732
Open

Filter by MetricAvailability in get_best_raw_objective_point_with_trial_index#5141
saitcakmak wants to merge 2 commits intofacebook:mainfrom
saitcakmak:export-D99451732

Conversation

@saitcakmak
Copy link
Copy Markdown
Contributor

Summary:
Goal: Eliminate assumptions around COMPLETED trials always having all metrics available. This is part of the code that would've raised errors if it encountered missing metrics in COMPLETED trials.

get_best_raw_objective_point_with_trial_index filters to COMPLETED trials but assumes all metrics in the optimization config are present in the data. This can fail with a hard ValueError from _pivot_data_with_feasibility when a completed trial has incomplete metric data (e.g., due to partial metric fetches, fetch failures, or metrics added mid-experiment).

This diff adds MetricAvailability filtering after the completed-trials filter to exclude trials with incomplete data before they reach the pivot step. Trials without complete metric data are now silently excluded rather than causing a crash, with a clear error if no trials remain.

Differential Revision: D99451732

…ability (facebook#5140)

Summary:

Goal: Eliminate assumptions around COMPLETED trials always having all metrics available. This is part of the code that would've raised errors if it encountered missing metrics in COMPLETED trials. Changing the return type here seemed like a better move than using carry-forward logic, and fits better with the logic that skips status quo as well (which may lead to confusing indexing when skipping status quo).

How does this new return type fit with `BatchTrial`: `get_trace` takes the best arm out of each batch, so it returns (and did so before this change) one result per trial. Nothing changed.

Changes (claude summary):
`get_trace` previously returned `list[float]` with positional indexing,
which forced callers to fabricate sequential trial indices. This change:

1. Changes the return type to `dict[int, float]` mapping trial_index to
   performance value, so callers get real trial indices.

2. Adds `MetricAvailability` filtering to exclude trials with incomplete
   metric data before pivoting. This prevents the `ValueError("Some
   metrics are not present for all trials and arms")` that
   `_pivot_data_with_feasibility` raises when a completed trial is
   missing any metric (e.g., partial fetches, fetch failures, metrics
   added mid-experiment).

3. Removes the carry-forward expansion loop for abandoned/failed trials.
   These trials are now simply excluded from the returned dict.

Callers updated:
- `UtilityProgressionAnalysis` now shows real trial indices on x-axis
- `pareto_frontier.hypervolume_trace_plot` uses real trial indices
- `benchmark.py` extracts values from dict

Differential Revision: D99448053
…al_index

Summary:
Goal: Eliminate assumptions around COMPLETED trials always having all metrics available. This is part of the code that would've raised errors if it encountered missing metrics in COMPLETED trials.

`get_best_raw_objective_point_with_trial_index` filters to COMPLETED trials but assumes all metrics in the optimization config are present in the data. This can fail with a hard `ValueError` from `_pivot_data_with_feasibility` when a completed trial has incomplete metric data (e.g., due to partial metric fetches, fetch failures, or metrics added mid-experiment).

This diff adds `MetricAvailability` filtering after the completed-trials filter to exclude trials with incomplete data before they reach the pivot step. Trials without complete metric data are now silently excluded rather than causing a crash, with a clear error if no trials remain.

Differential Revision: D99451732
@meta-cla meta-cla bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Apr 3, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync bot commented Apr 3, 2026

@saitcakmak has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99451732.

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 98.46154% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 96.40%. Comparing base (227243c) to head (aed0e8b).

Files with missing lines Patch % Lines
ax/service/utils/best_point.py 92.85% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5141   +/-   ##
=======================================
  Coverage   96.40%   96.40%           
=======================================
  Files         613      613           
  Lines       68191    68200    +9     
=======================================
+ Hits        65740    65749    +9     
  Misses       2451     2451           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed Do not delete this pull request or issue due to inactivity. fb-exported meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants