
feat(telemetry): implement app-extended-heartbeat event #301

Merged
khanayan123 merged 17 commits into main from ayan.khan/add-extended-heartbeat on Apr 1, 2026
Conversation

Contributor

@khanayan123 khanayan123 commented Mar 31, 2026

Summary

Implement the `app-extended-heartbeat` telemetry event for the C++ tracer, scoped to the `configuration` payload only.

Motivation

Long-running services (24h+) currently only report their configuration state via the initial `app-started` event. If the backend misses or loses that event, there's no way to recover visibility into the SDK's configuration. The `app-extended-heartbeat` event solves this by re-sending the full configuration payload every 24h, ensuring reliable state reporting for long-running instances.

Per the spec, `integrations` and `dependencies` are optional — this PR scopes to `configuration` only, which is the focus of Config Visibility reliability.

Implementation

  • The event fires periodically (default 24h) and includes the full `configuration` payload
  • Configuration is tracked via `all_configurations_`, a map updated whenever `generate_configuration_field` is called, so the payload reflects runtime changes (e.g., remote config), not just static startup state
  • Seq-ids are read from the existing `config_seq_ids_` map without incrementing, so the extended heartbeat does not corrupt sequence numbering
  • The interval is configurable via `DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL` (decimal seconds, default 86400) so that system tests can validate the event with shorter intervals
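
For readers who want the tracking scheme in miniature, here is a rough sketch of the idea described above. It is not the tracer's actual code: `ConfigTracker`, `ConfigEntry`, `SerializedField`, and `report` are hypothetical stand-ins, and starting seq-ids at 1 is an assumption. Only the `all_configurations_` and `extended_heartbeat_payload` names come from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the tracer's real configuration types.
struct ConfigEntry {
  std::string value;
  std::string origin;  // e.g. "default", "env_var", "remote_config"
};

struct SerializedField {
  std::string name;
  std::string value;
  std::string origin;
  std::uint64_t seq_id;
};

class ConfigTracker {
 public:
  // Called whenever a configuration value is reported (at startup or after a
  // runtime change). Bumps the seq-id and remembers the latest value.
  SerializedField report(const std::string& name, ConfigEntry entry) {
    const std::uint64_t seq = ++seq_ids_[name];  // assumption: ids start at 1
    all_configurations_[name] = entry;
    return {name, entry.value, entry.origin, seq};
  }

  // Builds the full configuration snapshot. Seq-ids are only read here,
  // never incremented, so repeated calls leave the numbering untouched.
  std::vector<SerializedField> extended_heartbeat_payload() const {
    std::vector<SerializedField> out;
    out.reserve(all_configurations_.size());
    for (const auto& [name, entry] : all_configurations_) {
      const auto it = seq_ids_.find(name);
      const std::uint64_t seq = (it != seq_ids_.end()) ? it->second : 0;
      out.push_back({name, entry.value, entry.origin, seq});
    }
    return out;
  }

 private:
  std::map<std::string, ConfigEntry> all_configurations_;
  std::map<std::string, std::uint64_t> seq_ids_;
};
```

Because the snapshot method is `const` and only looks up seq-ids, calling it on every extended heartbeat cannot advance the counters that regular configuration-change events rely on.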

Changes

  • `include/datadog/environment.h`: Declare `DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL` env var (DECIMAL, default 86400.0)
  • `include/datadog/telemetry/configuration.h`: Add `extended_heartbeat_interval_seconds` (`Optional`) to `Configuration` and `extended_heartbeat_interval` to `FinalizedConfiguration`
  • `src/datadog/telemetry/configuration.cpp`: Parse env var and finalize interval with validation
  • `src/datadog/telemetry/telemetry_impl.h`: Add `all_configurations_` member; declare `extended_heartbeat_payload()`
  • `src/datadog/telemetry/telemetry_impl.cpp`: Populate `all_configurations_` in `generate_configuration_field`; schedule recurring extended heartbeat task; build payload from `all_configurations_` with current seq-ids (no increment)
  • `supported-configurations.json`: Updated supported configurations manifest
  • `test/telemetry/test_configuration.cpp`: Test for env var parsing and default value
  • `test/telemetry/test_telemetry.cpp`: Test that extended heartbeat includes configuration payload
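
As an illustration of the "parse env var and finalize interval with validation" step, a minimal sketch follows. The helper names are hypothetical, and the fallback policy (invalid or non-positive input reverts to the 86400-second default) is an assumption; the PR description does not say how the real code treats invalid values.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstdlib>
#include <optional>
#include <string>

constexpr double kDefaultExtendedHeartbeatSeconds = 86400.0;

// Parses decimal seconds. Returns nullopt unless the whole string is a
// finite number (hypothetical helper, not part of the tracer).
std::optional<double> parse_decimal_seconds(const std::string& text) {
  if (text.empty()) return std::nullopt;
  char* end = nullptr;
  const double value = std::strtod(text.c_str(), &end);
  if (end == text.c_str() || *end != '\0' || !std::isfinite(value)) {
    return std::nullopt;
  }
  return value;
}

// Turns the raw env var value (nullptr when unset) into a finalized interval.
// Assumption: unparsable or non-positive values fall back to the default.
std::chrono::duration<double> finalize_extended_heartbeat_interval(
    const char* raw) {
  double seconds = kDefaultExtendedHeartbeatSeconds;
  if (raw != nullptr) {
    if (auto parsed = parse_decimal_seconds(raw); parsed && *parsed > 0.0) {
      seconds = *parsed;
    }
  }
  return std::chrono::duration<double>(seconds);
}
```

A caller would pass `std::getenv("DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL")` as `raw`. Keeping the value as `duration<double>` is what lets a system test request a sub-second interval such as `0.5`.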

Related

Add support for the app-extended-heartbeat telemetry event per the
telemetry v2 API spec. The event fires periodically (default 24h) and
includes the full configuration payload, matching app-started.

The interval is configurable via DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL
(integer seconds) to enable system testing with shorter intervals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

pr-commenter bot commented Mar 31, 2026

Benchmarks

Benchmark execution time: 2026-03-31 05:04:20

Comparing candidate commit 6766649 in PR branch ayan.khan/add-extended-heartbeat with baseline commit 910e3d5 in branch main.

Found 0 performance improvements and 1 performance regression. Performance is the same for 0 metrics; 0 metrics are unstable.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
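
The significance rule illustrated by the two diagrams can be sketched as a small function. This is only a reading of the explanation above, not the benchmarking platform's code: `classify` and `Verdict` are hypothetical names, the threshold is assumed to apply symmetrically around 0%, and the metric is assumed to be lower-is-better (such as execution time).

```cpp
#include <cassert>

enum class Verdict { NotSignificant, SignificantlyBetter, SignificantlyWorse };

// ci_lower_pct / ci_upper_pct: bounds of the CI over the relative difference
// of means (candidate vs. baseline), in percent.
// threshold_pct: positive significant-impact threshold, in percent.
// Assumes a lower-is-better metric and a symmetric threshold.
Verdict classify(double ci_lower_pct, double ci_upper_pct,
                 double threshold_pct) {
  // Whole interval lies above +threshold: the candidate is slower.
  if (ci_lower_pct > threshold_pct) return Verdict::SignificantlyWorse;
  // Whole interval lies below -threshold: the candidate is faster.
  if (ci_upper_pct < -threshold_pct) return Verdict::SignificantlyBetter;
  // The interval overlaps the [-threshold, +threshold] band.
  return Verdict::NotSignificant;
}
```

With the 1% threshold from the second diagram, the interval [1.3%; 3.1%] classifies as significantly worse, while the first diagram's [-0.6%; +1.2%] does not.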

scenario:BM_TraceTinyCCSource

  • 🟥 execution_time [+3.079ms; +3.510ms] or [+4.050%; +4.618%]

khanayan123 and others added 2 commits March 31, 2026 00:19
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eartbeat default

Move extended heartbeat scheduling after metrics to preserve the
positional task order expected by FakeEventScheduler in tests
(heartbeat=0, metrics=1). Add default value check for
extended_heartbeat_interval in test_configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The FakeEventScheduler used positional indexing to identify callbacks,
which broke when the extended heartbeat task was added. Use interval
duration to distinguish metrics (<=60s) from extended heartbeat
(>60s) callbacks instead.
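
A rough sketch of the interval-based classification this commit message describes. The class below is a simplified, hypothetical test double, not the repository's `FakeEventScheduler`; in particular, treating the first registered callback as the regular heartbeat is an assumption made only to keep the example short.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <utility>

// Simplified, hypothetical test double: callbacks are identified by their
// scheduling interval instead of by registration order.
struct FakeScheduler {
  using Callback = std::function<void()>;

  Callback heartbeat;
  Callback metrics;
  Callback extended_heartbeat;

  void schedule_recurring_event(std::chrono::seconds interval, Callback cb) {
    if (!heartbeat) {
      // Assumption: the regular heartbeat is always registered first.
      heartbeat = std::move(cb);
    } else if (interval <= std::chrono::seconds(60)) {
      metrics = std::move(cb);  // short interval: metrics flush
    } else {
      extended_heartbeat = std::move(cb);  // long interval: extended heartbeat
    }
  }
};
```

With this approach the metrics and extended heartbeat tasks can be registered in either order without the test picking up the wrong callback, which is the failure the positional indexing ran into.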

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

datadog-prod-us1-6 bot commented Mar 31, 2026

🎯 Code Coverage (details)
Patch Coverage: 91.89%
Overall Coverage: 90.89% (+0.06%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: c888b84

…load

Add test that creates a telemetry instance with configuration, triggers
the extended heartbeat, and verifies the payload contains the expected
configuration entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
khanayan123 and others added 2 commits April 1, 2026 11:08
…anges

Add a test that simulates a remote config override after startup and
confirms the extended heartbeat reports the updated value and origin.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deduplicate the origin switch + error serialization logic shared by
generate_configuration_field and extended_heartbeat_payload into a
pure serialize_configuration_field(metadata, seq_id) helper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@khanayan123 khanayan123 requested review from anna-git and dmehala April 1, 2026 15:14
Update DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL type from INT to DECIMAL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor

@anna-git anna-git left a comment

Thanks for addressing the comments, nice work!

…erval

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…heartbeat interval

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@khanayan123 khanayan123 merged commit 1603dce into main Apr 1, 2026
39 checks passed
@khanayan123 khanayan123 deleted the ayan.khan/add-extended-heartbeat branch April 1, 2026 18:18
3 participants