Skip to content

fix: correctly parse and route implicit boolean flags from ini files (#90)#352

Open
Sanjith1013 wants to merge 6 commits intoOpenMS:mainfrom
Sanjith1013:fix-boolean-flags-issue-90
Open

fix: correctly parse and route implicit boolean flags from ini files (#90)#352
Sanjith1013 wants to merge 6 commits intoOpenMS:mainfrom
Sanjith1013:fix-boolean-flags-issue-90

Conversation

@Sanjith1013
Copy link

@Sanjith1013 Sanjith1013 commented Mar 12, 2026

Fixes #90.

Hi @t0mdavid-m! I am currently preparing my GSoC 2026 proposal for Project C1 and wanted to tackle this stalled issue to get familiar with the parameter routing architecture.

As discussed in the original issue thread, the previous PR (#176) hit a roadblock because the internal C++ pyOpenMS wrapper maps both booleans and strings to STRING_VALUE, making them indistinguishable at runtime. I have implemented your exact suggested solution to bypass this by inspecting the generated .ini files directly.

Changes made:

ParameterManager.py: Added a new get_boolean_parameters() method. This uses Python's built-in xml.etree.ElementTree to parse the tool's .ini file and extract a strict list of parameters explicitly defined as type="bool".

CommandExecutor.py: Updated run_topp() to query the ParameterManager for boolean keys before constructing the command. If a key is a confirmed boolean, it applies implicit routing: appending -flag if True, and omitting it entirely if False. Standard explicit routing is preserved for strings and numeric values.

Tested locally using the standard TOPP workflow. FeatureFinderMetabo now correctly omits the masstrace_snr_filtering flag when set to false, preventing the Trailing text argument(s) '[false]' given crash.

Summary by CodeRabbit

  • Bug Fixes

    • Boolean parameters now emit flags only when true, preventing false flags.
    • Multi-line string parameter values are handled correctly.
    • Custom parameters accept list values and respect boolean flag semantics.
    • Existing handling for non-boolean, empty, or null parameters preserved.
  • New Features

    • Added a way to discover which parameters are boolean for each tool.

@coderabbitai
Copy link

coderabbitai bot commented Mar 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds boolean-aware parameter handling: ParameterManager.get_boolean_parameters(tool) parses/creates the tool .ini (XML) to list boolean keys; CommandExecutor.run_topp retrieves that list and emits -key flags only when a boolean parameter is true, omitting them when false; previous handling for non-boolean, list, and multiline values is preserved.

Changes

Cohort / File(s) Summary
Boolean Parameter Detection
src/workflow/ParameterManager.py
Added import defusedxml.ElementTree as ET and new get_boolean_parameters(self, tool: str) -> list which ensures .ini exists, parses XML ITEMs with type="bool", builds hierarchical keys, strips top-level tool prefix, and returns boolean parameter keys (falls back to [] on parse errors).
Command Construction / Execution
src/workflow/CommandExecutor.py
Integrated boolean-aware logic into run_topp: fetches boolean keys from ParameterManager, appends -key only when boolean value is true and omits when false; non-boolean params remain as -key value, with support for list flattening and multiline splitting.
Dependencies
requirements.txt
Added defusedxml==0.7.1 to support safe XML parsing in ParameterManager.py.

Sequence Diagram(s)

sequenceDiagram
    participant CE as CommandExecutor
    participant PM as ParameterManager
    participant FS as FileSystem
    participant TOPP as TOPPTool

    CE->>PM: get_boolean_parameters(tool)
    PM->>FS: ensure/read tool .ini
    FS-->>PM: .ini content
    PM->>PM: parse XML, collect ITEM(type="bool") names
    PM-->>CE: list of boolean param keys

    CE->>CE: iterate parameters
    alt param is boolean
        alt value true
            CE->>CE: append flag `-param`
        else value false
            CE->>CE: omit param
        end
    else non-boolean param
        CE->>CE: append `-param value` (supports lists / multiline)
    end

    CE->>TOPP: execute constructed command
    TOPP-->>CE: result/exit
Loading

Possibly related PRs

Poem

🐰 I dug through ini leaves and lo,

True flags hop on, false ones go,
Lines split tidy, lists unwind,
Commands now skip what's unkind,
A tiny rabbit cheers the flow!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately reflects the main change: implementing correct parsing and routing of implicit boolean flags from ini files, directly addressing the core objective.
Linked Issues check ✅ Passed The PR successfully implements all coding requirements from issue #90: parsing ini files for boolean parameter types, routing booleans implicitly (flag when true, omitted when false), and avoiding explicit true/false routing.
Out of Scope Changes check ✅ Passed All changes are directly scoped to issue #90: adding get_boolean_parameters() method, updating CommandExecutor to use it, and adding the defusedxml dependency for safe XML parsing.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 304-306: The code currently returns an empty list when ini_path
(constructed as Path(self.ini_dir, f"{tool}.ini")) doesn't exist, which silently
makes run_topp() treat booleans as normal values; instead, stop the silent
fallback by returning a clear sentinel or raising an explicit error: replace the
"if not ini_path.exists(): return []" with either raising a FileNotFoundError
(with a descriptive message mentioning the tool and ini_path) or returning None
(and update callers accordingly), and ensure run_topp() detects that
sentinel/exception and handles missing/malformed INI explicitly rather than
reverting to boolean-as-value behavior; apply the same change to the second
occurrence around lines 317-318 so both paths consistently signal a
missing/invalid INI.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8a729bc7-69cc-44d4-b955-94c542d3126a

📥 Commits

Reviewing files that changed from the base of the PR and between c81fee6 and 0fd9cec.

📒 Files selected for processing (2)
  • src/workflow/CommandExecutor.py
  • src/workflow/ParameterManager.py

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/workflow/ParameterManager.py (1)

304-306: ⚠️ Potential issue | 🟠 Major

Don't silently fall back to the old routing when boolean metadata can't be read.

Both failure paths return [], which makes run_topp() treat every boolean as a normal value again. That reintroduces the original bug whenever the .ini cannot be created or parsed.

💡 Suggested change
-        if not self.create_ini(tool):
-            return []
+        if not self.create_ini(tool):
+            raise FileNotFoundError(
+                f"Unable to create or find ini metadata for {tool} in {self.ini_dir}"
+            )
...
-        except Exception:
-            pass  # Safely return empty list if XML parsing fails
+        except (ET.ParseError, OSError) as exc:
+            raise RuntimeError(
+                f"Failed to read boolean metadata for {tool} from {ini_path}"
+            ) from exc

Also applies to: 319-320

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/workflow/ParameterManager.py` around lines 304 - 306, The current logic
in ParameterManager that handles failed .ini creation/parsing (the if not
self.create_ini(tool): return [] and the similar block at lines ~319-320)
silently returns an empty list which causes run_topp() to treat booleans as
normal values; instead, stop the silent fallback by raising a clear exception
(e.g., RuntimeError or a custom ConfigurationError) when create_ini(tool) fails
or when boolean metadata can't be parsed, include context (tool name and error)
in the exception message, and ensure callers (run_topp/wherever these branches
are used) will propagate or handle the exception rather than receiving an empty
list.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 314-318: The code currently appends only the leaf attribute `name`
for boolean parameters; instead build the full hierarchical parameter path
(e.g., "algorithm:common:flag") by walking from each ITEM element upward through
its ancestor parameter/section nodes, collecting each ancestor's `name`
attribute, reversing that list and joining with ":" to produce the full key, and
append that full key to `bool_params` so `run_topp()` (and CommandExecutor) can
match exact hierarchical keys.

---

Duplicate comments:
In `@src/workflow/ParameterManager.py`:
- Around line 304-306: The current logic in ParameterManager that handles failed
.ini creation/parsing (the if not self.create_ini(tool): return [] and the
similar block at lines ~319-320) silently returns an empty list which causes
run_topp() to treat booleans as normal values; instead, stop the silent fallback
by raising a clear exception (e.g., RuntimeError or a custom ConfigurationError)
when create_ini(tool) fails or when boolean metadata can't be parsed, include
context (tool name and error) in the exception message, and ensure callers
(run_topp/wherever these branches are used) will propagate or handle the
exception rather than receiving an empty list.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 89e8d1f2-543f-4e18-9d72-9254b5d83645

📥 Commits

Reviewing files that changed from the base of the PR and between 0fd9cec and 8cd956c.

📒 Files selected for processing (1)
  • src/workflow/ParameterManager.py

@Sanjith1013
Copy link
Author

Hi @t0mdavid-m!
Even for the CI testing it needs the maintainers approval.
Let me know if it works.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/workflow/ParameterManager.py (2)

6-6: Consider using defusedxml for defense in depth.

While the .ini files are generated locally by TOPP tools and not from untrusted sources, using defusedxml is a best practice that protects against XML attacks (XXE, billion laughs, etc.) if the attack surface ever changes.

♻️ Suggested change
-import xml.etree.ElementTree as ET
+import defusedxml.ElementTree as ET

You'll need to add defusedxml to your dependencies.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/workflow/ParameterManager.py` at line 6, Replace the use of the standard
xml.etree.ElementTree parser in ParameterManager.py with the safe defusedxml
variant: import defusedxml.ElementTree as ET (or from defusedxml import
ElementTree as ET) and update any XML parsing calls that reference ET (e.g., any
parse()/fromstring() uses) to continue working with the defusedxml API; also add
defusedxml to the project dependencies (requirements / setup) so the package is
installed.

342-345: Log exceptions instead of silently swallowing them.

The broad except Exception: pass hides parsing errors that could help diagnose issues. At minimum, consider logging the exception for debugging purposes.

♻️ Suggested change
-        except Exception:
-            pass # Safely return empty list if XML parsing fails
-
-        return []
+        except (ET.ParseError, OSError) as exc:
+            # Log for debugging but don't crash the workflow
+            st.warning(f"Could not parse boolean parameters for {tool}: {exc}")
+            return []

This narrows the exception types and provides visibility into failures without disrupting the workflow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/workflow/ParameterManager.py` around lines 342 - 345, The broad "except
Exception: pass" in ParameterManager.py should be replaced with targeted
exception handling and logging: catch specific XML parsing errors (e.g.,
xml.etree.ElementTree.ParseError or the specific parser exception used) instead
of Exception, log the exception details using the module/class logger (use the
existing logger or create one with logging.getLogger(__name__)), and still
return an empty list on failure; update the except block around the XML parsing
code (the block currently containing "except Exception: pass") to log a clear
message with the exception information and narrow the exception type.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 332-340: The boolean key-cleanup in ParameterManager.py currently
only strips the tool prefix (tool + ":"), leaving an instance prefix like "1:"
(e.g., "1:algorithm:...") which mismatches JSON keys; update the cleaned_params
computation in the function that processes bool_params (the list comprehension
that builds cleaned_params) to remove the combined prefix f"{tool}:1:" (and fall
back to removing f"{tool}:") so resulting keys match get_parameters_from_ini()
and CommandExecutor checks; also in the surrounding try/except (the exception
handler catching XML parsing errors) replace the silent pass with a logged error
(e.g., logger.error or print) that includes the tool name and exception message
to aid debugging.

---

Nitpick comments:
In `@src/workflow/ParameterManager.py`:
- Line 6: Replace the use of the standard xml.etree.ElementTree parser in
ParameterManager.py with the safe defusedxml variant: import
defusedxml.ElementTree as ET (or from defusedxml import ElementTree as ET) and
update any XML parsing calls that reference ET (e.g., any parse()/fromstring()
uses) to continue working with the defusedxml API; also add defusedxml to the
project dependencies (requirements / setup) so the package is installed.
- Around line 342-345: The broad "except Exception: pass" in ParameterManager.py
should be replaced with targeted exception handling and logging: catch specific
XML parsing errors (e.g., xml.etree.ElementTree.ParseError or the specific
parser exception used) instead of Exception, log the exception details using the
module/class logger (use the existing logger or create one with
logging.getLogger(__name__)), and still return an empty list on failure; update
the except block around the XML parsing code (the block currently containing
"except Exception: pass") to log a clear message with the exception information
and narrow the exception type.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9f792d1a-417b-4aaf-ab41-3ebb408390a4

📥 Commits

Reviewing files that changed from the base of the PR and between 8cd956c and 255a2e9.

📒 Files selected for processing (1)
  • src/workflow/ParameterManager.py

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/workflow/ParameterManager.py (1)

305-307: ⚠️ Potential issue | 🟠 Major

Don't return [] when boolean metadata lookup fails.

If .ini generation fails or XML parsing errors out, this returns the same value as “the tool has no boolean params”. CommandExecutor.run_topp() then falls back to explicit boolean routing and can emit -flag false again, recreating issue #90 for first-run or malformed-ini cases. Please signal this failure distinctly and let the caller handle it explicitly; if you keep recovery here, narrow the catch to parse/IO failures instead of Exception.

Also applies to: 342-345

🧹 Nitpick comments (1)
src/workflow/ParameterManager.py (1)

6-6: Consider hardening XML parsing if untrusted .ini files could reach this directory.

While ET.parse() is safe for parsing generated OpenMS .ini files (created via subprocess on line 47), xml.etree.ElementTree is vulnerable to billion laughs and XXE attacks if a crafted .ini file were placed in workflow_dir/ini/. If this is a concern for your use case, defusedxml.ElementTree is the safer default—however, this would require adding defusedxml as a new dependency, so evaluate the threat model first.

Also applies to: 313-314

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/workflow/ParameterManager.py` at line 6, The XML parsing uses
xml.etree.ElementTree (imported as ET) which is vulnerable to
XXE/billion‑laughs; replace it with defusedxml.ElementTree (install defusedxml)
or otherwise use defusedxml.ElementTree.parse /
defusedxml.ElementTree.fromstring wherever ET.parse or ET.fromstring are called
(search for ET.parse and ET.fromstring in ParameterManager.py, including the
usage around the ini file creation call and the locations referenced near the
end of the file) so untrusted .ini files cannot trigger XML entity attacks;
update the import line to use defusedxml.ElementTree as ET and adjust any ET
usage accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/workflow/ParameterManager.py`:
- Line 6: The XML parsing uses xml.etree.ElementTree (imported as ET) which is
vulnerable to XXE/billion‑laughs; replace it with defusedxml.ElementTree
(install defusedxml) or otherwise use defusedxml.ElementTree.parse /
defusedxml.ElementTree.fromstring wherever ET.parse or ET.fromstring are called
(search for ET.parse and ET.fromstring in ParameterManager.py, including the
usage around the ini file creation call and the locations referenced near the
end of the file) so untrusted .ini files cannot trigger XML entity attacks;
update the import line to use defusedxml.ElementTree as ET and adjust any ET
usage accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6e2ab6b7-8be8-47f5-9e14-4210ad10740c

📥 Commits

Reviewing files that changed from the base of the PR and between 255a2e9 and f681655.

📒 Files selected for processing (1)
  • src/workflow/ParameterManager.py

@t0mdavid-m
Copy link
Member

Thank you for your contributions. Could you please add some tests where you read from a ini that contains all parameter types and make sure they parse correctly?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 350-354: The except block catching ET.ParseError in
ParameterManager.py references an undefined logger and contains unreachable code
after a raise; replace the logger.error(...) call with st.error(...) to report
the XML parse error (streamlit is already imported), remove the unreachable
print(...) and pass lines, and ensure the exception is re-raised so callers can
handle it; update the except block that wraps ET.ParseError in the
function/method handling boolean parameter parsing accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0b64f424-8657-48c7-a7d6-3624d4a4e715

📥 Commits

Reviewing files that changed from the base of the PR and between f681655 and 3d48c57.

📒 Files selected for processing (2)
  • requirements.txt
  • src/workflow/ParameterManager.py

@Sanjith1013
Copy link
Author

Hi @t0mdavid-m,

Thank you for the feedback! Yes, I have added a comprehensive test (test_ParameterManager.py) that mocks an .ini file containing all parameter types (strings, ints, floats, booleans, and lists), including nested nodes. The test successfully verifies that get_boolean_parameters() strictly isolates and returns only the boolean parameters, ignoring the rest, while perfectly maintaining the expected 1:TestTool:algorithm:nested_bool path formatting.

I have also addressed all of the CodeRabbit AI review comments in the latest commit:

Security: Migrated the XML parser from standard xml.etree to defusedxml to prevent XXE (billion-laughs) vulnerabilities. Added defusedxml to requirements.txt.

Error Handling: Removed the silent return [] fallbacks. The get_boolean_parameters() method now raises explicit FileNotFoundError, RuntimeError, or ET.ParseError exceptions with detailed logging if the TOPP tool's .ini file fails to generate or parse.

Documentation: Added comprehensive docstrings to satisfy the 80% coverage threshold.

@Sanjith1013
Copy link
Author

Sanjith1013 commented Mar 15, 2026

Hi @t0mdavid-m,
Let me know if everything is good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Boolean Flags for TOPP tools are not working

2 participants