Constraint failure reward override #865
📝 Walkthrough: Added a constraint-reward override across the config, env, and handler layers.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 104-111: Update the abstract BaseGym.step signature to match
CloudAIGymEnv.step by adding the constraint_check_reward: float = -1.0 parameter
and keeping the same return type and typing (Tuple[list, float, bool, dict]);
modify the BaseGym.step method declaration and its docstring to include and
document constraint_check_reward so subclasses satisfy the contract and callers
(e.g., handlers using this arg) remain type-correct—look for the BaseGym class
and its step method and change the signature from step(self, action: Any) ->
Tuple[...] to step(self, action: Any, constraint_check_reward: float = -1.0) ->
Tuple[...].
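The signature alignment requested above might look like the following minimal sketch. Only the `step` signature, the return type, and the -1.0 default come from the comment; the class bodies, the `constraints_ok` flag, and the placeholder observations and rewards are hypothetical stand-ins, not the actual cloudai implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Tuple


class BaseGym(ABC):
    """Abstract environment whose step signature mirrors the concrete class."""

    @abstractmethod
    def step(self, action: Any, constraint_check_reward: float = -1.0) -> Tuple[list, float, bool, dict]:
        """Run one step.

        Args:
            action: Action chosen by the agent.
            constraint_check_reward: Reward returned when a constraint check fails.
        """


class CloudAIGymEnv(BaseGym):
    """Hypothetical stand-in for the real environment."""

    def __init__(self, constraints_ok: bool = True) -> None:
        # Assumption: a simple flag that simulates a constraint failure.
        self.constraints_ok = constraints_ok

    def step(self, action: Any, constraint_check_reward: float = -1.0) -> Tuple[list, float, bool, dict]:
        if not self.constraints_ok:
            # Constraint failed: return the (possibly overridden) penalty and end the step.
            return [], constraint_check_reward, True, {}
        # Placeholder observation and reward for the passing case.
        return [0.0], 0.0, False, {}
```

With the default present on both classes, a call site that uses `env.step(action)` and one that passes `constraint_check_reward=...` both satisfy the same contract.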
📒 Files selected for processing (4)
- src/cloudai/cli/handlers.py
- src/cloudai/configurator/base_agent.py
- src/cloudai/configurator/cloudai_gym.py
- tests/test_cloudaigym.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cloudai/configurator/base_gym.py`:
- Line 70: The abstract method BaseGym.step currently requires
constraint_check_reward with no default while CloudAIGymEnv.step defines
constraint_check_reward: float = -1.0 and call sites sometimes call
env.step(action) — update the BaseGym.step signature to provide the same default
(e.g., constraint_check_reward: float = -1.0) so the abstract contract matches
concrete CloudAIGymEnv.step and existing call sites; adjust any type hints or
docstrings referencing BaseGym.step to reflect the default as well.
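One way to guard against this kind of drift is a small signature check, sketched below with the standard-library `inspect` module. The two classes are reduced stubs written for this example, and `default_of` and `defaults_match` are hypothetical helper names, not part of cloudai.

```python
import inspect
from abc import ABC, abstractmethod
from typing import Any


class BaseGym(ABC):
    @abstractmethod
    def step(self, action: Any, constraint_check_reward: float = -1.0): ...


class CloudAIGymEnv(BaseGym):
    def step(self, action: Any, constraint_check_reward: float = -1.0):
        # Stub: always behaves as if the constraint check failed.
        return [], constraint_check_reward, True, {}


def default_of(func, name: str):
    """Return the default of parameter `name`, or inspect.Parameter.empty if it has none."""
    return inspect.signature(func).parameters[name].default


def defaults_match(base, concrete, method: str, param: str) -> bool:
    """True when both classes declare the same, non-missing default for `param`."""
    base_default = default_of(getattr(base, method), param)
    concrete_default = default_of(getattr(concrete, method), param)
    return base_default is not inspect.Parameter.empty and base_default == concrete_default
```

A test could then assert `defaults_match(BaseGym, CloudAIGymEnv, "step", "constraint_check_reward")`, which would fail if the abstract method lost its default again.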
📒 Files selected for processing (2)
- src/cloudai/configurator/base_gym.py
- tests/test_cloudaigym.py
srivatsankrishnan left a comment
Looks good. Thanks Alex for this.
@podkidyshev can you review this as well?
Lemme know when merge
Summary
Adds an agent config flag that overrides the default reward of -1.0 returned when a constraint check fails.
Example TOML Usage
AIConfigurator Example
To induce a constraint failure, a "dummy" constraint was added.
Default behavior (no override):
With override applied:
Test Plan
Added a constraint failure test.
It checks that when no override is given, the returned reward is the default -1.0.
When a custom override (-2.5) is set, the returned reward is -2.5.
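The two cases in the test plan can be sketched roughly as follows. This is not the test added in `tests/test_cloudaigym.py`: `FailingConstraintEnv` and `run_step` are hypothetical names invented for illustration, and the stub environment simply treats every step as a constraint failure.

```python
from typing import Any, Optional, Tuple

DEFAULT_CONSTRAINT_REWARD = -1.0


class FailingConstraintEnv:
    """Hypothetical stub whose 'dummy' constraint always fails."""

    def step(
        self, action: Any, constraint_check_reward: float = DEFAULT_CONSTRAINT_REWARD
    ) -> Tuple[list, float, bool, dict]:
        return [], constraint_check_reward, True, {}


def run_step(env: FailingConstraintEnv, action: Any, override: Optional[float] = None) -> float:
    """Hypothetical handler helper: forward the override only when one is configured."""
    if override is None:
        _, reward, _, _ = env.step(action)
    else:
        _, reward, _, _ = env.step(action, constraint_check_reward=override)
    return reward


def test_default_reward_on_constraint_failure() -> None:
    # No override configured: the default penalty applies.
    assert run_step(FailingConstraintEnv(), action={}) == -1.0


def test_override_reward_on_constraint_failure() -> None:
    # Custom override configured: the penalty follows the override.
    assert run_step(FailingConstraintEnv(), action={}, override=-2.5) == -2.5
```

Passing the override only when it is set keeps the environment's own default as the single source of truth for the -1.0 value.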