fix: subnet bootstrapping by phutchins · Pull Request #1545 · consensus-shipyard/ipc

phutchins · 2026-03-11T14:03:42Z

Note

Medium Risk
Touches the subnet initialization/bootstrapping flow and remote execution paths (SSH/sudo, config generation, genesis creation), so mistakes could prevent validators from starting or misconfigure networking, but changes are contained to scripting tooling.

Overview
Adds a bootstrap command to provision fresh remote validator hosts (deps + repo clone/build) and updates docs to guide a bootstrap-first workflow.

Reworks init to support --resume, separates local-vs-remote filesystem concerns (local ~/.ipc as source of truth, then copies configs/genesis to remotes), and makes node startup more reliable by generating a per-node start script with required resolver/subnet env vars.

Improves operability and troubleshooting: new diagnose command, check --wait, better libp2p/peer addressing via internal_ip, more robust SSH helpers/keepalives and non-login execution (exec_on_host_simple), safer config edits (temp scripts to avoid quoting), dashboard UX tweaks, and a new resolver troubleshooting guide.

Written by Cursor Bugbot for commit 6c0e662. This will update automatically on new commits.

…figuration updates

- Added a new bootstrap command to install dependencies (Rust, Foundry, Node.js) on fresh validator hosts.
- Updated initialization process to support resuming from previous failures.
- Modified subnet configuration with new validator IPs and registry addresses.
- Improved health check and execution commands for better reliability.
- Enhanced documentation to reflect new bootstrap steps and usage instructions.

…rd metrics

- Introduced a new troubleshooting document for diagnosing issues with the IPLD Resolver not listening on port 26654.
- Enhanced the dashboard script to initialize additional metrics for better monitoring, including block production rates and finality tracking.
- Updated health check scripts to ensure proper environment variable handling and improve logging for resolver-related configurations.

- Introduced a new function `ssh_exec_long` to handle long-running commands with streaming output, preventing SSH timeouts during builds.
- Updated the `update_validator_binaries` function to utilize the new long-running command execution, improving build process logging and error handling.
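The body of `ssh_exec_long` is not shown on this page. As a rough illustration of the idea only, the sketch below combines OpenSSH keepalive options with line-by-line output streaming. The function name `ssh_exec_long_sketch`, the `SSH_BIN` override, and the keepalive values are assumptions made for this example, not the PR's code.

```shell
# Sketch only: run a long command on a remote host, keep the connection
# alive, and stream each output line with a host prefix.
# SSH_BIN is a hypothetical override so the function can be exercised
# without a real remote host.
ssh_exec_long_sketch() {
    local host="$1"; shift
    "${SSH_BIN:-ssh}" \
        -o ServerAliveInterval=30 \
        -o ServerAliveCountMax=10 \
        "$host" "$@" 2>&1 |
    while IFS= read -r line; do
        printf '[%s] %s\n' "$host" "$line"
    done
    # Report the remote command's exit status, not the while loop's.
    return "${PIPESTATUS[0]}"
}
```

Returning `PIPESTATUS[0]` matters here: without it, a failed remote build would be masked by the successful exit of the logging loop.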

scripts/ipc-subnet-manager/lib/dashboard.sh

…d script

- Updated the calculation of `blocks_per_min` to accurately reflect block production rates based on time differences.
- Adjusted timestamp formatting logic to ensure proper handling of time zone indicators.
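The commit does not include the formula on this page. One plausible reading of "based on time differences" is to scale the height delta between two samples by the elapsed seconds. The helper below is a hypothetical sketch of that arithmetic, using integer division and assuming Unix-epoch timestamps.

```shell
# Sketch only: blocks per minute from two (height, unix-time) samples.
# The function name and argument order are illustrative assumptions.
blocks_per_min() {
    local prev_height="$1" prev_time="$2" cur_height="$3" cur_time="$4"
    local dt=$((cur_time - prev_time))
    # Avoid dividing by zero when both samples share a timestamp.
    if [ "$dt" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( (cur_height - prev_height) * 60 / dt ))
}
```

For example, 10 new blocks over 10 seconds works out to `10 * 60 / 10`, that is 60 blocks per minute.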

cursor · 2026-03-11T14:39:04Z

scripts/ipc-subnet-manager/ipc-subnet-manager.sh

+    for arg in "$@"; do
+        case $arg in
+            --wait=*) wait_seconds="${arg#*=}" ;;
+            --wait) shift; wait_seconds="${1:-30}" ;;


--wait value not parsed in for loop

Medium Severity

In cmd_check, the --wait VALUE form (space-separated) calls shift inside a for arg in "$@" loop. shift modifies $@ but has no effect on the loop's already-captured iteration list. As a result, wait_seconds="${1:-30}" reads $1 — the original first argument (e.g., "--wait") — instead of the intended value (e.g., 45). Since "--wait" is not a number, the [ "$wait_seconds" -gt 0 ] check silently fails and no sleep occurs. The suggested usage ./ipc-manager check --wait 45 (advertised in the error output) will never work correctly.
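One way to address this is to iterate with `while [ "$#" -gt 0 ]` so that `shift` and the positional parameters stay in step. The sketch below is a minimal illustration; the helper name `parse_wait_seconds` is hypothetical, and falling back to 30 when no numeric value follows `--wait` is an assumption carried over from the `${1:-30}` default in the reviewed snippet.

```shell
# Sketch only: parse --wait=N and --wait N, printing the resulting seconds.
parse_wait_seconds() {
    local wait_seconds=0
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --wait=*)
                wait_seconds="${1#*=}"
                ;;
            --wait)
                # Consume the next argument only if it is a plain number;
                # otherwise use the default and leave that argument in place.
                case "${2:-}" in
                    ''|*[!0-9]*) wait_seconds=30 ;;
                    *) wait_seconds="$2"; shift ;;
                esac
                ;;
        esac
        shift
    done
    printf '%s\n' "$wait_seconds"
}
```

With this shape, `parse_wait_seconds --wait 45` reads the value from `$2` of the live argument list rather than from the loop's captured copy.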

scripts/ipc-subnet-manager/ipc-subnet-config.yml

- Modified the `ipc-subnet-config.yml` to reflect new registry and gateway addresses for the parent subnet. - Enhanced the `config.sh` script to retrieve parent addresses from a unified source, ensuring backward compatibility with existing configurations. - Updated YAML config synchronization logic to maintain consistency between subnet and ipc_cli.parent sections.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


cursor · 2026-03-11T15:31:57Z

scripts/ipc-subnet-manager/lib/dashboard.sh

+    METRICS[peers]=0
+    METRICS[mempool_size]=0
+    METRICS[mempool_bytes]=0
+    METRICS[mempool_max]=5000


Dashboard mempool_max initialization prevents config reading

Low Severity

METRICS[mempool_max] is now initialized to 5000 in initialize_dashboard, which causes the conditional check [ -z "${METRICS[mempool_max]:-}" ] in fetch_metrics to always evaluate to false. The actual CometBFT mempool config value is never read from the node. Previously this key was uninitialized, so the first fetch_metrics call would read the real value. The dashboard now always shows 5000 as capacity regardless of actual config, causing incorrect mempool percentage calculations.

Additional Locations (1)

scripts/ipc-subnet-manager/lib/dashboard.sh#L150-L154
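A possible fix for the mempool_max finding above is to leave the key unset at initialization and apply 5000 only as a fallback when the node reports nothing. In the sketch below, `read_mempool_max_from_node`, `fetch_mempool_max`, and the `FAKE_NODE_MEMPOOL_MAX` variable are hypothetical stand-ins for the real lookup inside `fetch_metrics`.

```shell
declare -A METRICS

# Hypothetical stand-in for querying the node's CometBFT mempool config;
# prints the configured size, or nothing if the lookup fails.
read_mempool_max_from_node() {
    printf '%s\n' "${FAKE_NODE_MEMPOOL_MAX:-}"
}

initialize_dashboard() {
    METRICS[peers]=0
    METRICS[mempool_size]=0
    METRICS[mempool_bytes]=0
    # mempool_max is deliberately left unset so the first fetch reads it.
}

fetch_mempool_max() {
    if [ -z "${METRICS[mempool_max]:-}" ]; then
        local configured
        configured=$(read_mempool_max_from_node)
        # Fall back to 5000 only when the node did not report a value.
        METRICS[mempool_max]="${configured:-5000}"
    fi
}
```

Note that once the fallback is stored, this sketch does not retry the lookup; whether a retry is wanted depends on how the dashboard should behave when a node is briefly unreachable.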

cursor · 2026-03-11T15:31:57Z

scripts/ipc-subnet-manager/lib/health.sh

-                    local peer_ip=$(get_config_value "validators[$peer_idx].ip")
-                    if echo "$static_addrs" | grep -q "/ip4/$peer_ip/tcp/$libp2p_port"; then
+                    local peer_ip=$(get_peer_ip "$peer_idx")
+                    if echo "$static_addrs" | grep -q "/ip4/$peer_ip/tcp/$v_resolver_port"; then


Info command checks wrong port for peer static_addresses

Low Severity

In cmd_info, the static_addresses peer check uses $v_resolver_port (the current validator's resolver port) to verify peer entries. But static_addresses contains each peer's own resolver port, not the current validator's port. In local mode where each validator has a different port offset, this check always fails, producing misleading diagnostic output. The peer's port via get_resolver_port_for_validator "$peer_idx" is needed instead.

Additional Locations (1)

scripts/ipc-subnet-manager/lib/health.sh#L1080-L1082
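The port finding above could be addressed by resolving the port per peer before matching. The sketch below stubs `get_peer_ip` and `get_resolver_port_for_validator` with made-up values (a loopback address and a port offset of 100 per validator) purely so the example runs; the real helpers and port layout live in the PR's scripts. The wrapper name `peer_in_static_addrs` is also hypothetical.

```shell
# Hypothetical stubs with illustrative values only.
get_peer_ip() { echo "127.0.0.1"; }
get_resolver_port_for_validator() { echo $((26655 + $1 * 100)); }

# Succeeds if static_addrs lists the given peer at that peer's own port.
peer_in_static_addrs() {
    local static_addrs="$1" peer_idx="$2"
    local peer_ip peer_port
    peer_ip=$(get_peer_ip "$peer_idx")
    peer_port=$(get_resolver_port_for_validator "$peer_idx")
    # -F treats the address as a literal string, not a regex.
    printf '%s\n' "$static_addrs" | grep -qF "/ip4/$peer_ip/tcp/$peer_port"
}
```

Like the original `grep -q`, this is a substring match, so a shorter port number that prefixes a longer one would still match; a stricter check would need to anchor the end of the port.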

phutchins added 3 commits March 5, 2026 11:57

phutchins requested a review from a team as a code owner March 11, 2026 14:03

Merge branch 'main' into feature/subnet-bootstrapping

f95b1fc

phutchins changed the title ~~Feature/subnet bootstrapping~~ fix: subnet bootstrapping Mar 11, 2026

phutchins wants to merge 6 commits into main from feature/subnet-bootstrapping
