fix: correctly parse and route implicit boolean flags from ini files (#90)#352
fix: correctly parse and route implicit boolean flags from ini files (#90)#352Sanjith1013 wants to merge 6 commits intoOpenMS:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds boolean-aware parameter handling: ParameterManager.get_boolean_parameters(tool) parses/creates the tool .ini (XML) to list boolean keys; CommandExecutor.run_topp retrieves that list and emits Changes
Sequence Diagram(s)sequenceDiagram
participant CE as CommandExecutor
participant PM as ParameterManager
participant FS as FileSystem
participant TOPP as TOPPTool
CE->>PM: get_boolean_parameters(tool)
PM->>FS: ensure/read tool .ini
FS-->>PM: .ini content
PM->>PM: parse XML, collect ITEM(type="bool") names
PM-->>CE: list of boolean param keys
CE->>CE: iterate parameters
alt param is boolean
alt value true
CE->>CE: append flag `-param`
else value false
CE->>CE: omit param
end
else non-boolean param
CE->>CE: append `-param value` (supports lists / multiline)
end
CE->>TOPP: execute constructed command
TOPP-->>CE: result/exit
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 304-306: The code currently returns an empty list when ini_path
(constructed as Path(self.ini_dir, f"{tool}.ini")) doesn't exist, which silently
makes run_topp() treat booleans as normal values; instead, stop the silent
fallback by returning a clear sentinel or raising an explicit error: replace the
"if not ini_path.exists(): return []" with either raising a FileNotFoundError
(with a descriptive message mentioning the tool and ini_path) or returning None
(and update callers accordingly), and ensure run_topp() detects that
sentinel/exception and handles missing/malformed INI explicitly rather than
reverting to boolean-as-value behavior; apply the same change to the second
occurrence around lines 317-318 so both paths consistently signal a
missing/invalid INI.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8a729bc7-69cc-44d4-b955-94c542d3126a
📒 Files selected for processing (2)
src/workflow/CommandExecutor.pysrc/workflow/ParameterManager.py
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/workflow/ParameterManager.py (1)
304-306:⚠️ Potential issue | 🟠 MajorDon't silently fall back to the old routing when boolean metadata can't be read.
Both failure paths return
[], which makesrun_topp()treat every boolean as a normal value again. That reintroduces the original bug whenever the.inicannot be created or parsed.💡 Suggested change
- if not self.create_ini(tool): - return [] + if not self.create_ini(tool): + raise FileNotFoundError( + f"Unable to create or find ini metadata for {tool} in {self.ini_dir}" + ) ... - except Exception: - pass # Safely return empty list if XML parsing fails + except (ET.ParseError, OSError) as exc: + raise RuntimeError( + f"Failed to read boolean metadata for {tool} from {ini_path}" + ) from excAlso applies to: 319-320
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/ParameterManager.py` around lines 304 - 306, The current logic in ParameterManager that handles failed .ini creation/parsing (the if not self.create_ini(tool): return [] and the similar block at lines ~319-320) silently returns an empty list which causes run_topp() to treat booleans as normal values; instead, stop the silent fallback by raising a clear exception (e.g., RuntimeError or a custom ConfigurationError) when create_ini(tool) fails or when boolean metadata can't be parsed, include context (tool name and error) in the exception message, and ensure callers (run_topp/wherever these branches are used) will propagate or handle the exception rather than receiving an empty list.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 314-318: The code currently appends only the leaf attribute `name`
for boolean parameters; instead build the full hierarchical parameter path
(e.g., "algorithm:common:flag") by walking from each ITEM element upward through
its ancestor parameter/section nodes, collecting each ancestor's `name`
attribute, reversing that list and joining with ":" to produce the full key, and
append that full key to `bool_params` so `run_topp()` (and CommandExecutor) can
match exact hierarchical keys.
---
Duplicate comments:
In `@src/workflow/ParameterManager.py`:
- Around line 304-306: The current logic in ParameterManager that handles failed
.ini creation/parsing (the if not self.create_ini(tool): return [] and the
similar block at lines ~319-320) silently returns an empty list which causes
run_topp() to treat booleans as normal values; instead, stop the silent fallback
by raising a clear exception (e.g., RuntimeError or a custom ConfigurationError)
when create_ini(tool) fails or when boolean metadata can't be parsed, include
context (tool name and error) in the exception message, and ensure callers
(run_topp/wherever these branches are used) will propagate or handle the
exception rather than receiving an empty list.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 89e8d1f2-543f-4e18-9d72-9254b5d83645
📒 Files selected for processing (1)
src/workflow/ParameterManager.py
|
Hi @t0mdavid-m! |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/workflow/ParameterManager.py (2)
6-6: Consider usingdefusedxmlfor defense in depth.While the
.inifiles are generated locally by TOPP tools and not from untrusted sources, usingdefusedxmlis a best practice that protects against XML attacks (XXE, billion laughs, etc.) if the attack surface ever changes.♻️ Suggested change
-import xml.etree.ElementTree as ET +import defusedxml.ElementTree as ETYou'll need to add
defusedxmlto your dependencies.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/ParameterManager.py` at line 6, Replace the use of the standard xml.etree.ElementTree parser in ParameterManager.py with the safe defusedxml variant: import defusedxml.ElementTree as ET (or from defusedxml import ElementTree as ET) and update any XML parsing calls that reference ET (e.g., any parse()/fromstring() uses) to continue working with the defusedxml API; also add defusedxml to the project dependencies (requirements / setup) so the package is installed.
342-345: Log exceptions instead of silently swallowing them.The broad
except Exception: passhides parsing errors that could help diagnose issues. At minimum, consider logging the exception for debugging purposes.♻️ Suggested change
- except Exception: - pass # Safely return empty list if XML parsing fails - - return [] + except (ET.ParseError, OSError) as exc: + # Log for debugging but don't crash the workflow + st.warning(f"Could not parse boolean parameters for {tool}: {exc}") + return []This narrows the exception types and provides visibility into failures without disrupting the workflow.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/ParameterManager.py` around lines 342 - 345, The broad "except Exception: pass" in ParameterManager.py should be replaced with targeted exception handling and logging: catch specific XML parsing errors (e.g., xml.etree.ElementTree.ParseError or the specific parser exception used) instead of Exception, log the exception details using the module/class logger (use the existing logger or create one with logging.getLogger(__name__)), and still return an empty list on failure; update the except block around the XML parsing code (the block currently containing "except Exception: pass") to log a clear message with the exception information and narrow the exception type.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 332-340: The boolean key-cleanup in ParameterManager.py currently
only strips the tool prefix (tool + ":"), leaving an instance prefix like "1:"
(e.g., "1:algorithm:...") which mismatches JSON keys; update the cleaned_params
computation in the function that processes bool_params (the list comprehension
that builds cleaned_params) to remove the combined prefix f"{tool}:1:" (and fall
back to removing f"{tool}:") so resulting keys match get_parameters_from_ini()
and CommandExecutor checks; also in the surrounding try/except (the exception
handler catching XML parsing errors) replace the silent pass with a logged error
(e.g., logger.error or print) that includes the tool name and exception message
to aid debugging.
---
Nitpick comments:
In `@src/workflow/ParameterManager.py`:
- Line 6: Replace the use of the standard xml.etree.ElementTree parser in
ParameterManager.py with the safe defusedxml variant: import
defusedxml.ElementTree as ET (or from defusedxml import ElementTree as ET) and
update any XML parsing calls that reference ET (e.g., any parse()/fromstring()
uses) to continue working with the defusedxml API; also add defusedxml to the
project dependencies (requirements / setup) so the package is installed.
- Around line 342-345: The broad "except Exception: pass" in ParameterManager.py
should be replaced with targeted exception handling and logging: catch specific
XML parsing errors (e.g., xml.etree.ElementTree.ParseError or the specific
parser exception used) instead of Exception, log the exception details using the
module/class logger (use the existing logger or create one with
logging.getLogger(__name__)), and still return an empty list on failure; update
the except block around the XML parsing code (the block currently containing
"except Exception: pass") to log a clear message with the exception information
and narrow the exception type.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9f792d1a-417b-4aaf-ab41-3ebb408390a4
📒 Files selected for processing (1)
src/workflow/ParameterManager.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/workflow/ParameterManager.py (1)
305-307:⚠️ Potential issue | 🟠 MajorDon't return
[]when boolean metadata lookup fails.If
.inigeneration fails or XML parsing errors out, this returns the same value as “the tool has no boolean params”.CommandExecutor.run_topp()then falls back to explicit boolean routing and can emit-flag falseagain, recreating issue#90for first-run or malformed-ini cases. Please signal this failure distinctly and let the caller handle it explicitly; if you keep recovery here, narrow the catch to parse/IO failures instead ofException.Also applies to: 342-345
🧹 Nitpick comments (1)
src/workflow/ParameterManager.py (1)
6-6: Consider hardening XML parsing if untrusted.inifiles could reach this directory.While
ET.parse()is safe for parsing generated OpenMS.inifiles (created via subprocess on line 47),xml.etree.ElementTreeis vulnerable to billion laughs and XXE attacks if a crafted.inifile were placed inworkflow_dir/ini/. If this is a concern for your use case,defusedxml.ElementTreeis the safer default—however, this would require addingdefusedxmlas a new dependency, so evaluate the threat model first.Also applies to: 313-314
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/workflow/ParameterManager.py` at line 6, The XML parsing uses xml.etree.ElementTree (imported as ET) which is vulnerable to XXE/billion‑laughs; replace it with defusedxml.ElementTree (install defusedxml) or otherwise use defusedxml.ElementTree.parse / defusedxml.ElementTree.fromstring wherever ET.parse or ET.fromstring are called (search for ET.parse and ET.fromstring in ParameterManager.py, including the usage around the ini file creation call and the locations referenced near the end of the file) so untrusted .ini files cannot trigger XML entity attacks; update the import line to use defusedxml.ElementTree as ET and adjust any ET usage accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/workflow/ParameterManager.py`:
- Line 6: The XML parsing uses xml.etree.ElementTree (imported as ET) which is
vulnerable to XXE/billion‑laughs; replace it with defusedxml.ElementTree
(install defusedxml) or otherwise use defusedxml.ElementTree.parse /
defusedxml.ElementTree.fromstring wherever ET.parse or ET.fromstring are called
(search for ET.parse and ET.fromstring in ParameterManager.py, including the
usage around the ini file creation call and the locations referenced near the
end of the file) so untrusted .ini files cannot trigger XML entity attacks;
update the import line to use defusedxml.ElementTree as ET and adjust any ET
usage accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6e2ab6b7-8be8-47f5-9e14-4210ad10740c
📒 Files selected for processing (1)
src/workflow/ParameterManager.py
|
Thank you for your contributions. Could you please add some tests where you read from a ini that contains all parameter types and make sure they parse correctly? |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 350-354: The except block catching ET.ParseError in
ParameterManager.py references an undefined logger and contains unreachable code
after a raise; replace the logger.error(...) call with st.error(...) to report
the XML parse error (streamlit is already imported), remove the unreachable
print(...) and pass lines, and ensure the exception is re-raised so callers can
handle it; update the except block that wraps ET.ParseError in the
function/method handling boolean parameter parsing accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0b64f424-8657-48c7-a7d6-3624d4a4e715
📒 Files selected for processing (2)
requirements.txtsrc/workflow/ParameterManager.py
|
Hi @t0mdavid-m, Thank you for the feedback! Yes, I have added a comprehensive test (test_ParameterManager.py) that mocks an .ini file containing all parameter types (strings, ints, floats, booleans, and lists), including nested nodes. The test successfully verifies that get_boolean_parameters() strictly isolates and returns only the boolean parameters, ignoring the rest, while perfectly maintaining the expected 1:TestTool:algorithm:nested_bool path formatting. I have also addressed all of the CodeRabbit AI review comments in the latest commit: Security: Migrated the XML parser from standard xml.etree to defusedxml to prevent XXE (billion-laughs) vulnerabilities. Added defusedxml to requirements.txt. Error Handling: Removed the silent return [] fallbacks. The get_boolean_parameters() method now raises explicit FileNotFoundError, RuntimeError, or ET.ParseError exceptions with detailed logging if the TOPP tool's .ini file fails to generate or parse. Documentation: Added comprehensive docstrings to satisfy the 80% coverage threshold. |
|
Hi @t0mdavid-m, |
Fixes #90.
Hi @t0mdavid-m! I am currently preparing my GSoC 2026 proposal for Project C1 and wanted to tackle this stalled issue to get familiar with the parameter routing architecture.
As discussed in the original issue thread, the previous PR (#176) hit a roadblock because the internal C++ pyOpenMS wrapper maps both booleans and strings to STRING_VALUE, making them indistinguishable at runtime. I have implemented your exact suggested solution to bypass this by inspecting the generated .ini files directly.
Changes made:
ParameterManager.py: Added a new get_boolean_parameters() method. This uses Python's built-in xml.etree.ElementTree to parse the tool's .ini file and extract a strict list of parameters explicitly defined as type="bool".
CommandExecutor.py: Updated run_topp() to query the ParameterManager for boolean keys before constructing the command. If a key is a confirmed boolean, it applies implicit routing: appending -flag if True, and omitting it entirely if False. Standard explicit routing is preserved for strings and numeric values.
Tested locally using the standard TOPP workflow. FeatureFinderMetabo now correctly omits the masstrace_snr_filtering flag when set to false, preventing the Trailing text argument(s) '[false]' given crash.
Summary by CodeRabbit
Bug Fixes
New Features