Skip to content

Claude/review esp32 thread code es kdn#45

Merged
GuvHas merged 2 commits intomainfrom
claude/review-esp32-thread-code-EsKdn
Mar 20, 2026
Merged

Claude/review esp32 thread code es kdn#45
GuvHas merged 2 commits intomainfrom
claude/review-esp32-thread-code-EsKdn

Conversation

@GuvHas
Copy link
Copy Markdown
Owner

@GuvHas GuvHas commented Mar 20, 2026

No description provided.

claude added 2 commits March 20, 2026 07:27
…dNoc

During commissioning the controller's "Reconnect" step (step 18) needs to
resolve the device's NEW operational address via mDNS before it can open a
CASE session.  The mDNS record is sourced from the OTBR's SRP proxy.

Root cause: trySrpServiceAdd() ran when Thread first attached (via the
onThreadStateChanged callback) and built the SRP instance name from the
FIRST fabric in the table at that time (either an old committed fabric from
a previous session, or nothing if no fabric existed yet).  When addNoc
subsequently committed a new fabric with a different CompressedFabricId /
NodeId, the OTBR SRP record still carried the old instance name.  DNS-SD
resolution for the new <CFID>-<NodeId>._matter._tcp name timed out (~35 s)
and commissioning failed with "something went wrong".

Fix: add a FabricTable::Delegate (SrpFabricDelegate) that overrides
OnFabricCommitted.  It fires synchronously on the CHIP/OpenThread task
immediately after addNoc persists the new fabric, giving enough time to
update the SRP record before the CASE reconnect DNS-SD lookup is issued.

The delegate:
  1. Clears any stale SRP service (otSrpClientClearHostAndServices — avoids
     the OnSrpClientNotification crash fixed in the previous commit).
  2. Calls trySrpServiceAdd() with the specific newly committed FabricIndex
     so FindFabricWithIndex() picks the right fabric even when an older
     fabric still occupies index 1.

trySrpServiceAdd() gains an optional preferIndex parameter for this path;
the onThreadStateChanged / startup paths continue to use the first-fabric
fallback as before.

The delegate is registered in the startup ScheduleWork after chip::Server
is initialised.

https://claude.ai/code/session_01SxqAiQApiRTwXefk9aQ6bR
otSrpClientClearHostAndServices() resets the SRP client's host state
(hostname pointer → null, auto-host-address mode disabled) in addition
to removing services.  The previous commit called it in
SrpFabricDelegate::OnFabricCommitted but then only called
trySrpServiceAdd(), which goes straight to otSrpClientAddService().
Without a configured host, the SRP client rejects AddService or silently
fails to send an update, so the new <CFID>-<NodeId>._matter._tcp record
never reaches OTBR and CASE reconnect still times out.

Fix:
- Extract setupSrpHost(otInstance*): sets hostname and re-enables
  auto-host-address mode using the buffer in s_srp.
- Move the hostname buffer from a lambda-local static into SrpCtx
  (s_srp.hostname) so setupSrpHost() and the delegate share one buffer.
- Call setupSrpHost() in OnFabricCommitted immediately after Clear and
  before trySrpServiceAdd().
- Startup lambda uses setupSrpHost() too, eliminating the duplicate
  snprintf/SetHostName/EnableAutoHostAddress block.

https://claude.ai/code/session_01SxqAiQApiRTwXefk9aQ6bR
@GuvHas GuvHas merged commit 39837a0 into main Mar 20, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants