Add disable_shell_tool option to CodexCoder for MCP-only evaluation #39
justaddcoffee wants to merge 1 commit into fix-gemini-cli-invocation
When disable_shell_tool is set in the coder config, Codex runs with the
--disable shell_tool flag, preventing filesystem access via bash commands.
This enables fair MCP-only evaluations where Codex can only use MCP tools.
Usage in eval config:
coders:
  codex:
    disable_shell_tool: true
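
The flag selection described in this PR could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function name `build_codex_args` and the base command `["codex", "exec"]` are assumptions made for the example, and only the two flags come from the PR text.

```python
from typing import List


def build_codex_args(prompt: str, disable_shell_tool: bool = False) -> List[str]:
    """Build a hypothetical Codex CLI argument list (illustrative only)."""
    args = ["codex", "exec"]  # assumed base command, not taken from the PR
    if disable_shell_tool:
        # MCP-only mode: the shell tool is switched off.
        args += ["--disable", "shell_tool"]
    else:
        # Default mode named in the PR description.
        args.append("--dangerously-bypass-approvals-and-sandbox")
    args.append(prompt)
    return args
```

Building the arguments as a list rather than a single string keeps the two modes mutually exclusive and avoids shell-quoting issues if the list is later handed to `subprocess.run`.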
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Claude finished @justaddcoffee's task
Code Review Complete ✅
Reviewed PR #39: Add disable_shell_tool option to CodexCoder for MCP-only evaluation
Todo List:
Review Summary
This PR adds a configurable
Code Quality & Best Practices ✅
Strengths:
Areas for Improvement:
Potential Bugs & Issues
Summary
Adds a configurable `disable_shell_tool` option to CodexCoder that prevents filesystem access via bash commands, enabling fair MCP-only evaluations.

Note: This PR is based on the `fix-gemini-cli-invocation` branch, which contains all previous Codex CLI fixes (MCP support, proper invocation, 401 fix, etc.). It should be merged after that branch is merged, or both can be merged together.

Changes
- `disable_shell_tool` attribute to CodexCoder class
- `--disable shell_tool` flag instead of `--dangerously-bypass-approvals-and-sandbox`
- `create_coder()` and `run_single_eval()` to pass coder-specific options from YAML config

Usage
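
One way the coder-specific options might flow from the eval config into the coder is sketched below. It is only an illustration under stated assumptions: the `CodexCoder` dataclass is a stand-in for the real class, the `CODER_CLASSES` registry and the `create_coder` signature are hypothetical, and the config is written as a plain dict mirroring the YAML from the commit message so the example needs no YAML parser.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class CodexCoder:
    """Hypothetical stand-in for the real CodexCoder class."""
    disable_shell_tool: bool = False


# Assumed registry mapping coder names to classes (not from the PR).
CODER_CLASSES = {"codex": CodexCoder}


def create_coder(name: str, eval_config: Dict[str, Any]) -> CodexCoder:
    """Instantiate a coder, forwarding its options from the eval config."""
    # Missing or empty sections fall back to the coder's defaults.
    options = (eval_config.get("coders") or {}).get(name) or {}
    return CODER_CLASSES[name](**options)


# Dict equivalent of the YAML: coders -> codex -> disable_shell_tool: true
config = {"coders": {"codex": {"disable_shell_tool": True}}}
coder = create_coder("codex", config)
```

Forwarding the per-coder mapping as keyword arguments means a new option only has to be added to the coder class, at the cost of raising a `TypeError` for unknown keys in the config.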
Motivation
During MCP literature evaluations, we discovered Codex was using shell commands (`rg`, `sed`, `find`) to access the filesystem and read expected answers from test case files. This made the evaluation results invalid.

With `disable_shell_tool: true`, Codex can only use MCP tools to retrieve information, ensuring a fair comparison with other agents.

🤖 Generated with Claude Code