
release: v0.5.0 — Raft clustering, TLS, streaming snapshots#75

Merged
ApiliumDevTeam merged 15 commits into main from dev
Mar 13, 2026

Conversation

@ApiliumDevTeam
Contributor

Summary

  • Raft consensus clustering with automatic leader election, log replication, and membership changes
  • Write-Ahead Log (WAL) for crash-safe durability with segment rotation and CRC32 checksums
  • Streaming snapshots with 512KB chunked transfer, per-chunk ACK, and blake3 integrity verification
  • TLS encryption for inter-node communication (self-signed or custom PEM certs)
  • CRDT conflict resolution (LWW registers + OR-Set) for concurrent writes
  • Quorum reads via the `X-Consistency: quorum` header
  • Ineru memory replication (LTM via Raft, snapshot transfer)
  • Production hardening: constant-time secret auth, Raft shutdown timeout, config validation, join retry with exponential backoff, learner rollback, snapshot buffer TTL eviction
  • New crates: `aingle_wal`, `aingle_raft`
  • Cluster endpoints: status, members, join, leave, WAL stats/verify
  • README: clustering docs, architecture diagram update, Mayros AI badge, Rust 1.83
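
The "constant-time secret auth" item in the list above can be illustrated with a minimal sketch. The function name `constant_time_eq` is hypothetical and is not taken from this PR; the point is only that the comparison visits every byte instead of returning at the first mismatch. It assumes the secret length itself is not sensitive, and it makes no guarantee against compiler optimizations, so a vetted crate is the safer choice in production.

```rust
/// Compare two byte strings without exiting early on the first differing byte.
/// Hypothetical helper for illustration; not the function used in this PR.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // Assumption: the length is not secret, so a length mismatch may return at once.
    if a.len() != b.len() {
        return false;
    }
    // Accumulate the XOR of every byte pair so the loop always runs to the end.
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

A handler would call this with the shared secret from configuration and the value presented by the peer, rejecting the request when it returns `false`.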

Test plan

  • `cargo check --workspace`
  • `cargo test -p aingle_raft`: 33/33
  • `cargo test -p aingle_cortex --features cluster --lib`: 144/144
  • `cargo test -p aingle_cortex --features cluster --test cluster_integration_test`: 3/3 (single-node, 3-node replication, WAL verify)
  • All workspace crate versions bumped to 0.5.0

ApiliumDevTeam and others added 15 commits March 11, 2026 22:44
New crate `aingle_wal` with segment-based WAL, hash chain integrity,
thread-safe writer, reader with replay/verification, and segment rotation.
WAL integrated into AppState and mutation paths (triples, memory) behind
`#[cfg(feature = "cluster")]`. All 20 WAL tests pass.
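
As a rough illustration of hash chain integrity, the sketch below links each entry to the checksum of the previous one, so changing any entry invalidates verification from that point onward. The `Entry` layout, the `append`/`verify` helpers, and the use of CRC-32 as the chaining function are assumptions made for this example; the actual record format and hash used by `aingle_wal` are not shown on this page.

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0xEDB8_8320 } else { crc >> 1 };
        }
    }
    !crc
}

/// Hypothetical WAL entry: `prev` is the checksum of the preceding entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub prev: u32,
    pub payload: Vec<u8>,
    pub checksum: u32,
}

/// Checksum over the previous link followed by the payload.
fn link(prev: u32, payload: &[u8]) -> u32 {
    let mut buf = prev.to_le_bytes().to_vec();
    buf.extend_from_slice(payload);
    crc32(&buf)
}

/// Append a payload, chaining it to the last entry (or to 0 for the first one).
pub fn append(log: &mut Vec<Entry>, payload: &[u8]) {
    let prev = log.last().map_or(0, |e| e.checksum);
    log.push(Entry { prev, payload: payload.to_vec(), checksum: link(prev, payload) });
}

/// Return the index of the first entry that fails verification, if any.
pub fn verify(log: &[Entry]) -> Option<usize> {
    let mut expected_prev = 0u32;
    for (i, e) in log.iter().enumerate() {
        if e.prev != expected_prev || e.checksum != link(e.prev, &e.payload) {
            return Some(i);
        }
        expected_prev = e.checksum;
    }
    None
}
```

CRC-32 detects accidental corruption but is not tamper-resistant; a cryptographic hash would be needed for the chain to resist deliberate modification.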

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New crate `aingle_raft` with openraft TypeConfig, log store, state machine,
network layer, and consistency levels. Cluster REST endpoints added
(status, join, leave, members, WAL stats/verify). P2pMessage extended
with Raft + cluster variants. CLI flags for cluster mode. All 15 raft
tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ConsistencyLevel enum (Local/Quorum/Linearizable) with header parsing.
Read endpoints (get_triple, list_triples) now accept X-Consistency header
and route through appropriate consistency logic when cluster feature is
enabled.
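
A minimal sketch of what the header parsing might look like. The variant names come from the commit message; the `from_header` method, the fallback to `Local` when the header is absent, the case-insensitive matching, and the error type are assumptions for illustration. Only the value `quorum` is confirmed by the PR summary.

```rust
/// Consistency levels named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConsistencyLevel {
    Local,
    Quorum,
    Linearizable,
}

impl ConsistencyLevel {
    /// Parse the optional `X-Consistency` header value.
    /// Assumption: a missing or empty header falls back to `Local`.
    pub fn from_header(value: Option<&str>) -> Result<Self, String> {
        match value.map(|v| v.trim().to_ascii_lowercase()).as_deref() {
            None | Some("") | Some("local") => Ok(Self::Local),
            Some("quorum") => Ok(Self::Quorum),
            Some("linearizable") => Ok(Self::Linearizable),
            Some(other) => Err(format!("unknown consistency level: {other}")),
        }
    }
}
```

Rejecting unknown values instead of silently downgrading them keeps a typo in the header from turning a linearizable read into a local one.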

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
LwwTriple (Last-Writer-Wins with deterministic tie-break by node_id)
and OrSet (Observed-Remove Set for triple existence) implemented in
aingle_graph behind `#[cfg(feature = "crdt")]`. Merge is commutative,
associative, and idempotent. All 9 CRDT tests pass.
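
The merge rule for a last-writer-wins register can be sketched as follows. The `Lww` struct and its field types are hypothetical stand-ins for `LwwTriple`, and letting the higher `node_id` win a timestamp tie is an assumption; the commit only states that the tie-break is deterministic. The merge is commutative as long as no two distinct writes share the same `(timestamp, node_id)` pair.

```rust
/// Hypothetical last-writer-wins register; a stand-in for `LwwTriple`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Lww<T> {
    pub value: T,
    pub timestamp: u64,
    pub node_id: u64,
}

impl<T: Clone> Lww<T> {
    /// Keep the write with the larger (timestamp, node_id) pair.
    /// Assumption: on equal timestamps the higher node_id wins.
    pub fn merge(&self, other: &Self) -> Self {
        if (other.timestamp, other.node_id) > (self.timestamp, self.node_id) {
            other.clone()
        } else {
            self.clone()
        }
    }
}
```

Because the winner is chosen by a total order on `(timestamp, node_id)`, replicas that see the same writes in different orders converge on the same value.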

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sfer)

ClusterSnapshot with TripleSnapshot wire format for full state transfer.
STM explicitly excluded (node-local). HNSW index rebuilt locally from
replicated LTM. LTM WAL entry kinds (LtmEntityCreate, LtmLinkCreate,
LtmEntityDelete) already present from Phase 1. Snapshot serialization
roundtrip tested. All 18 raft tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bump all 10 product crates from 0.4.2 → 0.5.0 and update internal
dependency version ranges from "0.4" → "0.5" to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement RaftLogReader and RaftLogStorage for CortexLogStore with
WAL-backed persistence. Vote and committed state persisted to JSON files.
Recovery on restart reads WAL segments to rebuild the in-memory BTreeMap.
Add RaftEntry and Noop variants to WalEntryKind.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Connect CortexStateMachine to real GraphDB and IneruMemory so Raft-committed
mutations are applied: TripleInsert/Delete to graph, MemoryStore/Forget to
Ineru LTM. Add CortexSnapshotBuilder for full-state snapshots.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement RaftNetworkFactory and RaftNetworkV2 for CortexNetworkConnection.
Add RaftRpcSender trait to decouple from QUIC transport, enabling stub
senders during bootstrap and real P2P transport at runtime.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bootstrap openraft::Raft in main.rs with CortexLogStore, CortexStateMachine,
and CortexNetworkFactory. Add raft and cluster_node_id fields to AppState.
Single-node cluster auto-initializes when no peers are configured.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Route triple and memory writes through Raft in cluster mode. Add
ensure_linearizable guards to GET handlers honoring X-Consistency header
(linearizable via ReadIndex, quorum via LeaseRead, local passthrough).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rewrite cluster status/join/leave/members endpoints to use real Raft
metrics (role, term, leader, membership). Join adds learner then promotes
to voter; leave removes node from voter set via change_membership.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d Mayros badge

- Add Clustering section with 3-node quickstart, TLS, and endpoint reference
- Add Consensus Layer (Raft, WAL, Streaming Snapshots, TLS) to architecture diagram
- Add aingle_raft and aingle_wal to platform components table
- Add "Powers Mayros AI" badge linking to ApiliumCode/mayros
- Update Rust version badge and prerequisites from 1.70 to 1.83
- Add cluster build command to quickstart

Refactors cluster initialization into a dedicated, reusable module.
Implements robust HTTP-based Raft RPC with TLS encryption, shared secret
authentication, and exponential backoff for inter-node communication.
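
Exponential backoff for retries can be sketched as a pure delay calculation. The function name `backoff_delay`, the base and cap parameters, and the absence of jitter are assumptions for this example; the PR does not state the actual values or policy.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`, capped at `max`.
/// Hypothetical helper; the real retry policy is not shown in this PR.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing when the attempt count is very large.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}
```

A caller would sleep for `backoff_delay(n, base, max)` after the n-th failed attempt. Adding random jitter on top is a common refinement when many nodes retry at once.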

Adds automatic leader redirection (HTTP 307) for client requests and
cluster management operations to improve client routing and cluster availability.

Introduces chunked snapshot transfers with Blake3 integrity checksums
for efficient and reliable state replication, especially for large datasets.
Improves WAL durability by persisting purge and truncation boundaries.
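
The chunked transfer with per-chunk acknowledgement can be sketched as below. The 512KB chunk size comes from the PR summary; the `Chunk` and `Receiver` types, their fields, and the strictly in-order acceptance rule are assumptions for illustration. The Blake3 integrity check is left out to keep the sketch dependency-free; a real receiver would verify the checksum before applying the snapshot.

```rust
/// Chunk size stated in the PR summary.
pub const CHUNK_SIZE: usize = 512 * 1024;

/// Hypothetical wire unit for one piece of a snapshot.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Chunk {
    pub index: u32,
    pub last: bool,
    pub data: Vec<u8>,
}

/// Split a snapshot into ordered chunks; an empty snapshot yields one empty final chunk.
pub fn split_snapshot(snapshot: &[u8]) -> Vec<Chunk> {
    if snapshot.is_empty() {
        return vec![Chunk { index: 0, last: true, data: Vec::new() }];
    }
    let total = snapshot.chunks(CHUNK_SIZE).count();
    snapshot
        .chunks(CHUNK_SIZE)
        .enumerate()
        .map(|(i, part)| Chunk { index: i as u32, last: i + 1 == total, data: part.to_vec() })
        .collect()
}

/// Hypothetical receiver that reassembles chunks strictly in order.
#[derive(Default)]
pub struct Receiver {
    next: u32,
    buf: Vec<u8>,
    done: bool,
}

impl Receiver {
    /// Accept the next expected chunk and return the index to acknowledge.
    pub fn accept(&mut self, chunk: &Chunk) -> Result<u32, String> {
        if self.done || chunk.index != self.next {
            return Err(format!("unexpected chunk {} (wanted {})", chunk.index, self.next));
        }
        self.buf.extend_from_slice(&chunk.data);
        self.next += 1;
        self.done = chunk.last;
        Ok(chunk.index)
    }

    /// Return the reassembled snapshot once the final chunk has arrived.
    pub fn finish(self) -> Option<Vec<u8>> {
        if self.done { Some(self.buf) } else { None }
    }
}
```

In this model the sender waits for the returned index before sending the next chunk, so a lost or reordered chunk is detected right away instead of after the whole transfer.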

Ensures data consistency by routing all write operations through Raft when
clustering is enabled, preventing direct writes and potential split-brain.

Includes comprehensive integration tests for cluster functionality.
feat: Raft clustering with TLS, streaming snapshots, and production hardening
@ApiliumDevTeam ApiliumDevTeam merged commit cce066e into main Mar 13, 2026
21 of 22 checks passed