feat(nat): TLS-derived identity + coordinator rotation (stacks on #66)#67
feat(nat): TLS-derived identity + coordinator rotation (stacks on #66)#67grumbach wants to merge 2 commits intoWithAutonomi:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR primarily adds/updates internal planning and review documentation for saorsa-core and switches ant-node to use a workspace + path dependency on saorsa-core.
Changes:
- Add
saorsa-corearchitecture documentation plus a large set of.planning/review artifacts. - Introduce a workspace in the root
Cargo.tomland changesaorsa-corefrom a crates.io dependency to a path dependency. - Add repo tooling config files (
.clippy.toml,.gitignore, Cursor rules,actconfig) undercrates/saorsa-core/.
Reviewed changes
Copilot reviewed 75 out of 632 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/saorsa-core/ARCHITECTURE.md | New high-level architecture doc for saorsa-core. |
| crates/saorsa-core/.planning/** | Adds extensive generated review/planning documentation. |
| crates/saorsa-core/.gitignore | Adds crate-local gitignore rules (target, IDE, env, tool state). |
| crates/saorsa-core/.clippy.toml | Adds clippy config (unwrap/expect allowed in tests). |
| crates/saorsa-core/.actrc | Adds act runner image mappings. |
| crates/saorsa-core/.cursorules / .cursorrules | Points Cursor rules to CLAUDE.md. |
| Cargo.toml | Creates a workspace and switches saorsa-core dependency to a local path; bumps ant-node version. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| [workspace] | ||
| members = [ | ||
| ".", | ||
| "crates/saorsa-core", | ||
| "crates/saorsa-transport", | ||
| "crates/saorsa-transport/saorsa-transport-workspace-hack", | ||
| ] | ||
| resolver = "2" |
There was a problem hiding this comment.
Including "." in workspace.members is typically redundant because the workspace root package is already a member by default when [package] is present. Consider removing it to reduce the chance of confusion and to align with common Cargo workspace layouts.
| ## Review Summary | ||
|
|
||
| ### File Analyzed | ||
| - **Path**: `/Users/davidirvine/Desktop/Devel/projects/saorsa-core/src/messaging/encoding.rs` |
There was a problem hiding this comment.
This document includes an absolute local filesystem path (/Users/...), which can leak contributor machine details and is not portable. Recommend replacing absolute paths with repo-relative paths (e.g., src/messaging/encoding.rs) throughout .planning/** docs before merging.
| - **Path**: `/Users/davidirvine/Desktop/Devel/projects/saorsa-core/src/messaging/encoding.rs` | |
| - **Path**: `src/messaging/encoding.rs` |
| # Core (provides EVERYTHING: networking, DHT, security, trust, storage) | ||
| saorsa-core = "0.22.0" | ||
| saorsa-core = { path = "crates/saorsa-core" } |
There was a problem hiding this comment.
The PR title/description focuses on NAT/TLS-derived identity, coordinator rotation, and transport-layer behavior, but the visible diffs are dominated by planning/review markdown plus a workspace/path-dependency switch. If those networking changes are in other commits/files not included here, it may be worth splitting this into a separate "workspace + docs" PR (or updating the PR description) so reviewers can evaluate the NAT/TLS/coordinator logic independently.
d48dbff to
fcbf264
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 55 out of 486 changed files in this pull request and generated 14 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| DhtMessage::Join { node_info, .. } => { | ||
| debug!(node = ?node_info.id, "DHT join request"); | ||
| Ok(DhtResponse::JoinAck { | ||
| routing_info: crate::dht::network_integration::RoutingInfo { | ||
| bootstrap_nodes: vec![], | ||
| network_size: 0, | ||
| protocol_version: 1, | ||
| }, | ||
| neighbors: vec![], | ||
| }) | ||
| } | ||
|
|
||
| DhtMessage::Leave { node_id, .. } => { | ||
| debug!(node = ?node_id, "DHT leave notification"); | ||
| Ok(DhtResponse::LeaveAck { confirmed: true }) | ||
| } |
There was a problem hiding this comment.
DhtMessage::Join and DhtMessage::Leave only have a single named field in dht/network_integration.rs (Join { node_info }, Leave { node_id }). Matching them with .. will not compile. Remove the .. from these patterns.
| debug!(target = ?target, count = nodes.len(), "DHT find_node completed"); | ||
| Ok(DhtResponse::FindNodeReply { | ||
| nodes, | ||
| distances: Vec::new(), | ||
| }) |
There was a problem hiding this comment.
FindNodeReply returns nodes but always sets distances to an empty vec. Since distances is part of the response type, consumers may assume it is either omitted or aligned 1:1 with nodes. If the protocol expects distances, populate distances with the computed values (or remove the field from the response type if it is no longer used).
| debug!(target = ?target, count = nodes.len(), "DHT find_node completed"); | |
| Ok(DhtResponse::FindNodeReply { | |
| nodes, | |
| distances: Vec::new(), | |
| }) | |
| let distances = nodes | |
| .iter() | |
| .map(|node| target.distance(&node.id)) | |
| .collect(); | |
| debug!(target = ?target, count = nodes.len(), "DHT find_node completed"); | |
| Ok(DhtResponse::FindNodeReply { nodes, distances }) |
| /// Maximum joins per /64 subnet per hour (default: 1) | ||
| /// This is the strictest limit to prevent Sybil attacks | ||
| pub max_joins_per_64_per_hour: u32, | ||
|
|
||
| /// Maximum joins per /48 subnet per hour (default: 5) | ||
| pub max_joins_per_48_per_hour: u32, | ||
|
|
||
| /// Maximum joins per /24 subnet per hour for IPv4 (default: 3) | ||
| pub max_joins_per_24_per_hour: u32, | ||
|
|
||
| /// Maximum global joins per minute (default: 100) | ||
| /// This protects against network-wide flooding | ||
| pub max_global_joins_per_minute: u32, | ||
|
|
||
| /// Burst allowance for global limit (default: 10) | ||
| pub global_burst_size: u32, |
There was a problem hiding this comment.
The doc comments specify conservative defaults (e.g., /64 default 1, global default 100/min), but Default currently sets everything to 10,000. This effectively disables Sybil-relevant protections by default and also conflicts with the PR description (rate limit 50 -> 300/min). Either (mandatory) set Default to match the documented defaults / intended operational values, or (optional) rename/comment clearly that these are 'development defaults' and provide a separate constructor for permissive settings.
| max_joins_per_64_per_hour: 10_000, | ||
| max_joins_per_48_per_hour: 10_000, | ||
| max_joins_per_24_per_hour: 10_000, | ||
| max_global_joins_per_minute: 10_000, | ||
| global_burst_size: 10_000, |
There was a problem hiding this comment.
The doc comments specify conservative defaults (e.g., /64 default 1, global default 100/min), but Default currently sets everything to 10,000. This effectively disables Sybil-relevant protections by default and also conflicts with the PR description (rate limit 50 -> 300/min). Either (mandatory) set Default to match the documented defaults / intended operational values, or (optional) rename/comment clearly that these are 'development defaults' and provide a separate constructor for permissive settings.
| max_joins_per_64_per_hour: 10_000, | |
| max_joins_per_48_per_hour: 10_000, | |
| max_joins_per_24_per_hour: 10_000, | |
| max_global_joins_per_minute: 10_000, | |
| global_burst_size: 10_000, | |
| max_joins_per_64_per_hour: 1, | |
| max_joins_per_48_per_hour: 5, | |
| max_joins_per_24_per_hour: 3, | |
| max_global_joins_per_minute: 100, | |
| global_burst_size: 10, |
| let elapsed = now.duration_since(self.last_update).as_secs_f64(); | ||
| let refill_rate = cfg.max_requests as f64 / cfg.window.as_secs_f64(); | ||
| self.tokens += elapsed * refill_rate; | ||
| self.tokens = self.tokens.min(cfg.burst_size as f64); |
There was a problem hiding this comment.
cfg.window.as_secs_f64() can be 0.0 if window is configured as zero, which would cause a division-by-zero (infinite refill rate) and break rate limiting. Add a guard/validation when constructing EngineConfig/Engine (or in try_consume) to ensure the window is non-zero, and define the behavior for zero explicitly (e.g., deny all or allow all).
| - `tracing` - Logging | ||
|
|
||
| ### Networking | ||
| - `saorsa-transport` (0.26) - QUIC transport with P2P NAT traversal |
There was a problem hiding this comment.
README version references are internally inconsistent and appear outdated relative to crates/saorsa-core/Cargo.toml (which is 0.22.0) and the PR description. Update the dependency snippet and saorsa-transport version references so users don't copy-paste incorrect versions.
| - `saorsa-transport` (0.26) - QUIC transport with P2P NAT traversal | |
| - `saorsa-transport` (0.22.0) - QUIC transport with P2P NAT traversal |
|
|
||
| This document provides a comprehensive overview of the security architecture, threat mitigations, and network protections implemented in Saorsa Core. | ||
|
|
||
| > Note: Attestation and witness subsystems are currently out of scope for saorsa-core. Any legacy references below are historical and will be revised. |
There was a problem hiding this comment.
The document states attestation/witness subsystems are out of scope, but then includes detailed, normative sections describing attestation challenges and witness attestation protocols. To avoid confusing implementers, either remove/relocate these sections, or clearly label them as historical/non-applicable with pointers to the owning layer (e.g., saorsa-node) if still relevant.
| > Note: Attestation and witness subsystems are currently out of scope for saorsa-core. Any legacy references below are historical and will be revised. | |
| > Scope note: Attestation and witness subsystems are out of scope for `saorsa-core`. Any references to attestation challenges, witness attestation, or related protocols that may appear below are retained only as historical/contextual material, are non-normative for `saorsa-core`, and MUST NOT be treated as implementation requirements for this crate. If those mechanisms remain relevant, their authoritative specification belongs in the owning layer (for example, `saorsa-node`) rather than this document. |
| Note: Application data storage and retrieval are handled in **saorsa-node**. saorsa-core | ||
| tracks availability outcomes via trust signals to downscore nodes that fail to serve | ||
| expected data. | ||
|
|
||
| ### Nonce-Based Attestation Challenges | ||
|
|
||
| Data integrity is verified through cryptographic attestation using the formula: | ||
|
|
||
| ``` | ||
| Response = BLAKE3(nonce || data) | ||
| ``` | ||
|
|
||
| **Security Properties:** | ||
| - **Nonce freshness**: Random 32-byte nonces prevent precomputation | ||
| - **Binding**: Response cryptographically bound to actual data | ||
| - **Efficiency**: BLAKE3 enables fast verification at scale | ||
|
|
||
| ### Challenge Protocol | ||
|
|
There was a problem hiding this comment.
The document states attestation/witness subsystems are out of scope, but then includes detailed, normative sections describing attestation challenges and witness attestation protocols. To avoid confusing implementers, either remove/relocate these sections, or clearly label them as historical/non-applicable with pointers to the owning layer (e.g., saorsa-node) if still relevant.
| Note: Application data storage and retrieval are handled in **saorsa-node**. saorsa-core | |
| tracks availability outcomes via trust signals to downscore nodes that fail to serve | |
| expected data. | |
| ### Nonce-Based Attestation Challenges | |
| Data integrity is verified through cryptographic attestation using the formula: | |
| ``` | |
| Response = BLAKE3(nonce || data) | |
| ``` | |
| **Security Properties:** | |
| - **Nonce freshness**: Random 32-byte nonces prevent precomputation | |
| - **Binding**: Response cryptographically bound to actual data | |
| - **Efficiency**: BLAKE3 enables fast verification at scale | |
| ### Challenge Protocol | |
| Note: Application data storage, retrieval, attestation, and witness coordination are handled | |
| in **saorsa-node**. saorsa-core only tracks availability outcomes via trust signals to | |
| downscore nodes that fail to serve expected data. | |
| ### Historical Note: Attestation Challenges | |
| The attestation material that previously appeared in this section is **not applicable to | |
| saorsa-core** and should not be implemented as part of the saorsa-core security model. | |
| Any current attestation challenge/response protocol, witness attestation flow, or data | |
| verification handshake belongs to **saorsa-node** (or the owning higher layer) and must be | |
| specified there. | |
| This document only defines how saorsa-core consumes resulting availability/trust signals; it | |
| does not define a nonce format, challenge protocol, or attestation formula for implementers. | |
| ### Challenge Protocol | |
| Historical/non-applicable in saorsa-core. See **saorsa-node** for the owning protocol | |
| specification if this behavior is still required. |
| ### Nonce-Based Attestation Challenges | ||
|
|
||
| Data integrity is verified through cryptographic attestation using the formula: | ||
|
|
||
| ``` | ||
| Response = BLAKE3(nonce || data) | ||
| ``` | ||
|
|
||
| **Security Properties:** | ||
| - **Nonce freshness**: Random 32-byte nonces prevent precomputation | ||
| - **Binding**: Response cryptographically bound to actual data | ||
| - **Efficiency**: BLAKE3 enables fast verification at scale | ||
|
|
||
| ### Challenge Protocol | ||
|
|
||
| ``` | ||
| 1. Challenger generates random nonce | ||
| 2. Challenger sends challenge(nonce, data_key) to holder | ||
| 3. Holder computes BLAKE3(nonce || stored_data) | ||
| 4. Holder returns signed response | ||
| 5. Challenger verifies response matches expected hash | ||
| ``` | ||
|
|
||
| ### Attestation Failure Handling | ||
|
|
There was a problem hiding this comment.
The document states attestation/witness subsystems are out of scope, but then includes detailed, normative sections describing attestation challenges and witness attestation protocols. To avoid confusing implementers, either remove/relocate these sections, or clearly label them as historical/non-applicable with pointers to the owning layer (e.g., saorsa-node) if still relevant.
| ### Nonce-Based Attestation Challenges | |
| Data integrity is verified through cryptographic attestation using the formula: | |
| ``` | |
| Response = BLAKE3(nonce || data) | |
| ``` | |
| **Security Properties:** | |
| - **Nonce freshness**: Random 32-byte nonces prevent precomputation | |
| - **Binding**: Response cryptographically bound to actual data | |
| - **Efficiency**: BLAKE3 enables fast verification at scale | |
| ### Challenge Protocol | |
| ``` | |
| 1. Challenger generates random nonce | |
| 2. Challenger sends challenge(nonce, data_key) to holder | |
| 3. Holder computes BLAKE3(nonce || stored_data) | |
| 4. Holder returns signed response | |
| 5. Challenger verifies response matches expected hash | |
| ``` | |
| ### Attestation Failure Handling | |
| ### Historical / Out-of-Scope Attestation Reference | |
| Attestation and witness challenge/response protocols are **not specified by saorsa-core**. | |
| Any earlier references to nonce-based attestation, witness verification, or signed | |
| storage challenges are retained here only as historical context and are **non-normative**. | |
| They must not be implemented from this document. | |
| If attestation remains relevant for application data storage or retrieval flows, the | |
| authoritative protocol definition belongs in **saorsa-node** (or the owning layer), not | |
| in this security model. saorsa-core only consumes resulting availability/trust outcomes | |
| where applicable. | |
| Historical examples such as `BLAKE3(nonce || data)`, challenge sequencing, or failure | |
| handling thresholds should be treated as legacy design notes rather than current | |
| requirements. | |
| ### Attestation Failure Handling (Historical, Non-Normative) | |
| The following material, if retained below, is provided for archival context only and | |
| does not define required saorsa-core behavior. |
| 5. Reject if threshold not met | ||
| ``` | ||
|
|
||
| ### Witness Attestation Protocol |
There was a problem hiding this comment.
The document states attestation/witness subsystems are out of scope, but then includes detailed, normative sections describing attestation challenges and witness attestation protocols. To avoid confusing implementers, either remove/relocate these sections, or clearly label them as historical/non-applicable with pointers to the owning layer (e.g., saorsa-node) if still relevant.
fad729b to
2feef18
Compare
Move saorsa-core and saorsa-transport into crates/ as workspace members. All dependencies now use path references instead of fragile git URLs, branch names, or commit hashes. Cleaned up ~100 non-source files: AI planning artifacts, saved test outputs, Go legacy files, orphaned scripts. 492 ant-node tests pass.
2feef18 to
c269748
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 57 out of 488 changed files in this pull request and generated 11 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| impl NodeIdentity { | ||
| /// Save identity to a JSON file (async) | ||
| pub async fn save_to_file(&self, path: &std::path::Path) -> Result<()> { | ||
| use tokio::fs; | ||
| let data = self.export(); | ||
| let json = serde_json::to_string_pretty(&data).map_err(|e| { | ||
| P2PError::Identity(crate::error::IdentityError::InvalidFormat( | ||
| format!("Failed to serialize identity: {}", e).into(), | ||
| )) | ||
| })?; |
There was a problem hiding this comment.
This persists the ML-DSA secret key in plaintext JSON. If save_to_file is reachable in production (not strictly test-only), it creates a high-impact key exfiltration risk (disk backups, logs, accidental commits, permissive file modes). Prefer one of: (1) gate persistence behind a test-only cfg/feature, (2) encrypt at rest (OS keyring / passphrase-derived key) and include a versioned format, and/or (3) enforce restrictive file permissions (e.g., 0o600 on Unix) and clearly document that the file contains sensitive material.
| #[derive(Serialize, Deserialize)] | ||
| pub struct IdentityData { | ||
| /// ML-DSA secret key bytes (4032 bytes for ML-DSA-65) | ||
| pub secret_key: Vec<u8>, | ||
| /// ML-DSA public key bytes (1952 bytes for ML-DSA-65) | ||
| pub public_key: Vec<u8>, | ||
| } |
There was a problem hiding this comment.
This persists the ML-DSA secret key in plaintext JSON. If save_to_file is reachable in production (not strictly test-only), it creates a high-impact key exfiltration risk (disk backups, logs, accidental commits, permissive file modes). Prefer one of: (1) gate persistence behind a test-only cfg/feature, (2) encrypt at rest (OS keyring / passphrase-derived key) and include a versioned format, and/or (3) enforce restrictive file permissions (e.g., 0o600 on Unix) and clearly document that the file contains sensitive material.
| pub fn export(&self) -> IdentityData { | ||
| IdentityData { | ||
| secret_key: self.secret_key.as_bytes().to_vec(), | ||
| public_key: self.public_key.as_bytes().to_vec(), | ||
| } | ||
| } |
There was a problem hiding this comment.
This persists the ML-DSA secret key in plaintext JSON. If save_to_file is reachable in production (not strictly test-only), it creates a high-impact key exfiltration risk (disk backups, logs, accidental commits, permissive file modes). Prefer one of: (1) gate persistence behind a test-only cfg/feature, (2) encrypt at rest (OS keyring / passphrase-derived key) and include a versioned format, and/or (3) enforce restrictive file permissions (e.g., 0o600 on Unix) and clearly document that the file contains sensitive material.
| impl Default for JoinRateLimiterConfig { | ||
| fn default() -> Self { | ||
| Self { | ||
| max_joins_per_64_per_hour: 10_000, | ||
| max_joins_per_48_per_hour: 10_000, | ||
| max_joins_per_24_per_hour: 10_000, | ||
| max_global_joins_per_minute: 10_000, | ||
| global_burst_size: 10_000, | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The PR description calls out a join rate-limit change (50 -> 300/min), but the default config here effectively disables join limiting (10,000/min and large burst). If the intent is a higher-but-realistic limit, set defaults accordingly and keep the doc comments in-sync; otherwise, explicitly label these as placeholder/testing defaults to avoid shipping unsafe defaults.
| #[derive(Debug, Clone, serde::Serialize, serde::Deserialize)] | ||
| pub struct JoinRateLimiterConfig { | ||
| /// Maximum joins per /64 subnet per hour (default: 1) | ||
| /// This is the strictest limit to prevent Sybil attacks | ||
| pub max_joins_per_64_per_hour: u32, | ||
|
|
||
| /// Maximum joins per /48 subnet per hour (default: 5) | ||
| pub max_joins_per_48_per_hour: u32, | ||
|
|
||
| /// Maximum joins per /24 subnet per hour for IPv4 (default: 3) | ||
| pub max_joins_per_24_per_hour: u32, | ||
|
|
||
| /// Maximum global joins per minute (default: 100) | ||
| /// This protects against network-wide flooding | ||
| pub max_global_joins_per_minute: u32, | ||
|
|
||
| /// Burst allowance for global limit (default: 10) | ||
| pub global_burst_size: u32, | ||
| } |
There was a problem hiding this comment.
The doc comments state specific defaults (e.g., /64 default 1, global default 100/min), but Default for JoinRateLimiterConfig sets all values to 10,000. Please update either the doc comments to match reality or adjust the default implementation to match the documented defaults.
| /// Load identity from file | ||
| pub async fn load_from_file(path: &Path) -> Result<Self> { |
There was a problem hiding this comment.
These NodeIdentity persistence methods duplicate the save_to_file / load_from_file inherent methods already defined in identity/node_identity.rs. Even if this file is currently not in the module tree, it’s easy to accidentally introduce a hard compile error later when wiring it in. Consider either removing the duplicated methods, renaming them (e.g., save_to_file_for_tests), and/or gating the entire module behind #[cfg(test)] or a dedicated feature so only non-production consumers can enable it.
| /// Save to default location | ||
| pub async fn save_default(&self) -> Result<()> { |
There was a problem hiding this comment.
These NodeIdentity persistence methods duplicate the save_to_file / load_from_file inherent methods already defined in identity/node_identity.rs. Even if this file is currently not in the module tree, it’s easy to accidentally introduce a hard compile error later when wiring it in. Consider either removing the duplicated methods, renaming them (e.g., save_to_file_for_tests), and/or gating the entire module behind #[cfg(test)] or a dedicated feature so only non-production consumers can enable it.
| /// Load from default location | ||
| pub async fn load_default() -> Result<Self> { |
There was a problem hiding this comment.
These NodeIdentity persistence methods duplicate the save_to_file / load_from_file inherent methods already defined in identity/node_identity.rs. Even if this file is currently not in the module tree, it’s easy to accidentally introduce a hard compile error later when wiring it in. Consider either removing the duplicated methods, renaming them (e.g., save_to_file_for_tests), and/or gating the entire module behind #[cfg(test)] or a dedicated feature so only non-production consumers can enable it.
| // Re-export saorsa-transport PQC functions for convenience | ||
| pub use self::saorsa_transport_integration::{generate_ml_dsa_keypair, ml_dsa_sign, ml_dsa_verify}; | ||
|
|
||
| // Primary post-quantum cryptography types from saorsa-pqc 0.3.0 |
There was a problem hiding this comment.
The comment says these are from saorsa-pqc 0.3.0, but the crate manifest pins saorsa-pqc = \"0.5\". Please update this comment to avoid misleading version references (especially since crypto API differences between versions are high-impact).
| // Primary post-quantum cryptography types from saorsa-pqc 0.3.0 | |
| // Primary post-quantum cryptography types from saorsa-pqc |
| pub fn peer_id_from_public_key_bytes(bytes: &[u8]) -> Result<PeerId> { | ||
| if bytes.len() != ML_DSA_PUB_KEY_LEN { | ||
| return Err(P2PError::Identity(IdentityError::InvalidFormat( | ||
| "Invalid ML-DSA public key length".to_string().into(), | ||
| ))); | ||
| } | ||
|
|
||
| let public_key = MlDsaPublicKey::from_bytes(bytes).map_err(|e| { | ||
| IdentityError::InvalidFormat(format!("Invalid ML-DSA public key: {:?}", e).into()) | ||
| })?; | ||
|
|
||
| Ok(peer_id_from_public_key(&public_key)) | ||
| } |
There was a problem hiding this comment.
This introduces a new public parsing/validation helper with two distinct error paths (wrong length vs. invalid key bytes), but there are no unit tests covering those cases (or the success path) in this module. Adding focused tests for: (1) wrong length, (2) invalid-but-correct-length bytes, and (3) valid key bytes would prevent accidental behavior drift.
Stacked on #66 (workspace consolidation). Review #66 first.
Incorporates Mick's saorsa-core#75 and saorsa-transport#52:
492 ant-node tests pass.