
ip,ip6: add rate limiting for ICMP errors, ARP and ICMP input #576

Open
rjarry wants to merge 2 commits into DPDK:main from rjarry:rate-limiting

Conversation

@rjarry rjarry (Collaborator) commented Mar 26, 2026

Under a flood of bad packets (TTL=1, no-route, etc.), every packet triggers expensive ICMP error construction: address lookups, header prepend with potential memmove, and checksum computation. Similarly, ARP and ICMP/ICMPv6 input processing can saturate the control plane thread when flooded. None of this is rate limited today.

Add a per-node token bucket to all ICMP error nodes, ARP input, and ICMP/ICMPv6 input nodes. The bucket refills once per second and the check runs before any per-packet work. When the bucket is empty, the whole vector is moved to a dedicated drop node via rte_node_next_stream_move(), avoiding any unnecessary processing.

Three separate rates are configurable via the graph config API: icmp-error-rate, arp-rate and icmp-rate. All default to 1000 packets/sec per node per worker. Setting a rate to 0 disables the limit. Rate changes take effect immediately without graph reload.

Rate limiting for ICMP errors, ARP, and ICMP/ICMPv6 input

Behavior

  • Per-node token-bucket limiter applied at node entry for ARP input, ICMP input (IPv4), ICMPv6 input, and IP/IPv6 ICMP error nodes.
  • Limiter check runs before any per-packet work; when rate_limited(...) returns true the node moves the entire incoming vector with rte_node_next_stream_move(graph, node, RATE_LIMITED) and returns nb_objs, skipping address lookups, header prepends/memmoves, checksum computation, and enqueueing.
  • Token refill is TSC-based and performed once per second (rte_rdtsc()/rte_get_tsc_hz()).
  • A max_rate of 0 disables limiting.
  • When limited, streams are routed to the shared drop node "error_rate_limited" (GR_DROP_REGISTER(error_rate_limited) in ip_error.c).

Implementation

  • Added struct rate_limit_ctx { uint16_t tokens; uint64_t last_refill; } and static inline bool rate_limited(struct rate_limit_ctx *ctx, const uint16_t max_rate, const uint16_t nb_pkts) in modules/infra/datapath/datapath.h.
    • rate_limited(): asserts ctx, returns false if max_rate==0, reads rte_rdtsc(); if now - last_refill >= rte_get_tsc_hz() then tokens = max_rate and last_refill = now; if tokens == 0 returns true; otherwise decrements tokens by min(tokens, nb_pkts) and returns false.
  • Each affected node stores a node-local context containing struct rate_limit_ctx (GR_NODE_CTX_TYPE(...)) and registers an .init hook that initializes ctx->limit.tokens = graph_conf.*_rate and ctx->limit.last_refill = rte_rdtsc().
  • Nodes updated to perform the early check and add RATE_LIMITED next-node mapped to "error_rate_limited":
    • modules/ip/datapath/arp_input.c
    • modules/ip/datapath/icmp_input.c
    • modules/ip6/datapath/icmp6_input.c
    • modules/ip/datapath/ip_error.c (ttl_exceeded, no_route, frag_needed initializers)
    • modules/ip6/datapath/ip6_error.c (ttl_exceeded, no_route initializers)
  • "error_rate_limited" drop node is registered via GR_DROP_REGISTER(error_rate_limited) and referenced by RATE_LIMITED mappings.
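The rate_limited() semantics described above can be reproduced in a standalone sketch. A mocked clock (fake_tsc, clock_now, clock_hz — illustrative names, not from the PR) stands in for rte_rdtsc()/rte_get_tsc_hz() so the logic compiles without DPDK:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for rte_rdtsc()/rte_get_tsc_hz(); the real nodes read the TSC. */
static uint64_t fake_tsc;
static uint64_t clock_now(void) { return fake_tsc; }
static uint64_t clock_hz(void) { return 1000000000ULL; }

struct rate_limit_ctx {
	uint16_t tokens;
	uint64_t last_refill;
};

/* Mirrors the described semantics: refill once per second, reject the whole
 * vector only when the bucket is empty, decrement by at most the remaining
 * tokens, and max_rate == 0 disables limiting. */
static inline bool
rate_limited(struct rate_limit_ctx *ctx, uint16_t max_rate, uint16_t nb_pkts)
{
	uint64_t now;

	if (max_rate == 0)
		return false;

	now = clock_now();
	if (now - ctx->last_refill >= clock_hz()) {
		ctx->tokens = max_rate;
		ctx->last_refill = now;
	} else if (ctx->tokens == 0) {
		return true;
	}
	ctx->tokens -= ctx->tokens < nb_pkts ? ctx->tokens : nb_pkts;
	return false;
}
```

Note that a vector larger than the remaining tokens still passes while draining the bucket to zero; only subsequent vectors in the same one-second window are rejected.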

Control-plane and configuration

  • modules/infra/api/gr_infra.h:
    • struct gr_graph_conf extended with uint16_t icmp_error_rate, arp_rate, icmp_rate (comments indicate defaults 1000, 0 disables).
    • Added gr_graph_conf_set_attr_t bitmask flags including GR_GRAPH_SET_ICMP_ERROR, GR_GRAPH_SET_ARP, GR_GRAPH_SET_ICMP; existing RX_BURST/VECTOR flags preserved.
    • New request type struct gr_graph_conf_set_req embeds struct gr_graph_conf plus set_attrs bitmask for selective updates.
  • modules/infra/control/graph.c / graph.h:
    • graph_conf promoted to an external global (extern struct gr_graph_conf graph_conf) so datapath reads rates directly.
    • graph_conf defaults set: icmp_error_rate = 1000, arp_rate = 1000, icmp_rate = 1000.
    • graph_conf_set() changed to accept gr_graph_conf_set_req and apply only fields indicated by set_attrs; rx_burst/vector validated per-field; rate fields assigned directly when set_attrs bits present.
    • The early "no-change" reload decision excludes rate fields so updating rates does not trigger the graph reload path; rate changes take effect immediately without reloading the graph.
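The selective-update scheme can be sketched as follows; the struct and flag names are condensed stand-ins modeled on gr_graph_conf_set_req and the GR_GRAPH_SET_* bits, not the actual definitions:

```c
#include <stdint.h>

#define SET_ICMP_ERROR (1u << 2)
#define SET_ARP        (1u << 3)
#define SET_ICMP       (1u << 4)

struct conf {
	uint16_t icmp_error_rate, arp_rate, icmp_rate;
};

struct conf_set_req {
	struct conf conf;
	uint16_t set_attrs; /* which fields the client explicitly provided */
};

/* Apply only the fields flagged in set_attrs, leaving others untouched,
 * so 0 remains a valid value ("disable") rather than meaning "unchanged". */
static void conf_set(struct conf *dst, const struct conf_set_req *req)
{
	if (req->set_attrs & SET_ICMP_ERROR)
		dst->icmp_error_rate = req->conf.icmp_error_rate;
	if (req->set_attrs & SET_ARP)
		dst->arp_rate = req->conf.arp_rate;
	if (req->set_attrs & SET_ICMP)
		dst->icmp_rate = req->conf.icmp_rate;
}
```

This is why the request embeds the full struct plus a bitmask instead of relying on zero meaning "no change".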

CLI

  • modules/infra/cli/graph.c:
    • graph config set uses gr_graph_conf_set_req and accepts optional ICMP_ERROR_RATE, ARP_RATE, ICMP_RATE parameters; corresponding set_attrs bits set only when parameters parse successfully.
    • graph config show prints icmp_error_rate, arp_rate, icmp_rate.

Precise semantics from code

  • Token budget resets to max_rate once per second (using rte_get_tsc_hz()).
  • The limiter decrements by the vector size (nb_objs) in one operation using min(tokens, nb_objs).
  • Rate limits are configured per node and per worker via graph_conf.*_rate fields; setting a rate to 0 disables that limiter.

@coderabbitai coderabbitai bot commented Mar 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds per-path rate limiting and selective graph config updates. Introduces a gr_graph_conf_set_attr_t bitmask and gr_graph_conf_set_req to set new gr_graph_conf fields icmp_error_rate, arp_rate, and icmp_rate. Exposes a global graph_conf. Adds struct rate_limit_ctx and an inline rate_limited() helper. ARP, ICMP, ICMPv6, IP error, and IPv6 error nodes gain per-node rate-limit state, init hooks, and early rate-limit checks that route limited traffic to a new RATE_LIMITED edge targeting error_rate_limited. CLI and control plumbing are updated for selective config setting.



@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/infra/api/gr_infra.h`:
- Around line 391-393: The three rate fields (icmp_error_rate, arp_rate,
icmp_rate) in struct gr_graph_conf cannot use 0 both as “disable” and
“unchanged” because GR_GRAPH_CONF_SET treats zeroed members as “unchanged”;
modify the API to add explicit presence flags (e.g., a bitmask or bools like
has_icmp_error_rate/has_arp_rate/has_icmp_rate) to the struct and update the
setter/serialization logic (the code path used by GR_GRAPH_CONF_SET and callers
such as modules/infra/cli/graph.c) to consult these presence bits when encoding
requests so a caller can send has_*=true with value 0 to mean “disable” while
has_*=false means “leave unchanged.”

In `@modules/infra/datapath/gr_datapath.h`:
- Around line 18-31: The rate limiter currently allows a whole vector when
tokens<nb_pkts and doesn't apply a runtime decrease of max_rate; in rate_limited
(and struct rate_limit_ctx usage) change the logic so: if max_rate == 0 return
true (reject all); on refill (now - last_refill >= rte_get_tsc_hz()) set
ctx->tokens = max_rate and update last_refill; otherwise clamp ctx->tokens =
RTE_MIN(ctx->tokens, max_rate) to handle runtime decreases; then if ctx->tokens
< nb_pkts return true (reject the batch) else subtract nb_pkts from ctx->tokens
(ctx->tokens -= nb_pkts); this ensures batches larger than remaining tokens are
rejected and rate decreases take effect immediately.
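A minimal sketch of the behavior this comment asks for, assuming a mocked clock; rate_limited_strict and its helpers are hypothetical names, not code from the PR:

```c
#include <stdbool.h>
#include <stdint.h>

static uint64_t fake_now; /* stand-in for rte_rdtsc() */
static uint64_t hz(void) { return 1000000000ULL; }

struct rl_ctx { uint16_t tokens; uint64_t last_refill; };

/* Implements the suggested variant: max_rate == 0 rejects everything,
 * tokens are clamped to max_rate so runtime decreases apply immediately,
 * and a batch larger than the remaining budget is rejected whole instead
 * of draining the bucket. */
static bool
rate_limited_strict(struct rl_ctx *ctx, uint16_t max_rate, uint16_t nb_pkts)
{
	uint64_t now = fake_now;

	if (max_rate == 0)
		return true;
	if (now - ctx->last_refill >= hz()) {
		ctx->tokens = max_rate;
		ctx->last_refill = now;
	} else if (ctx->tokens > max_rate) {
		ctx->tokens = max_rate; /* rate was lowered at runtime */
	}
	if (ctx->tokens < nb_pkts)
		return true; /* reject the whole batch */
	ctx->tokens -= nb_pkts;
	return false;
}
```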

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6713d6cb-850c-47c1-a541-4e2c5fcf04e0

📥 Commits

Reviewing files that changed from the base of the PR and between a5520bf and d4bf7df.

📒 Files selected for processing (10)
  • modules/infra/api/gr_infra.h
  • modules/infra/cli/graph.c
  • modules/infra/control/gr_graph.h
  • modules/infra/control/graph.c
  • modules/infra/datapath/gr_datapath.h
  • modules/ip/datapath/arp_input.c
  • modules/ip/datapath/icmp_input.c
  • modules/ip/datapath/ip_error.c
  • modules/ip6/datapath/icmp6_input.c
  • modules/ip6/datapath/ip6_error.c

@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/infra/control/graph.h`:
- Line 40: The global mutable declaration extern struct gr_graph_conf graph_conf
exposes graph_conf (type gr_graph_conf) to concurrent reads of graph_conf.*_rate
from datapath workers while modules/infra/control/graph.c mutates it, causing
data races; fix by removing the shared mutable global and either (a) change the
*_rate fields to atomic types and update them via atomic_store/atomic_load, or
(b) stop in-place updates and publish an immutable snapshot (allocate a new
gr_graph_conf, populate it in graph.c, then publish via an atomic pointer swap
and provide an accessor like gr_get_graph_conf_snapshot()) so datapath readers
use the snapshot or atomic loads instead of reading a mutable graph_conf
directly.

In `@modules/infra/datapath/datapath.h`:
- Around line 22-31: The limiter currently ignores upward changes to max_rate
because the code returns early when max_rate==0 and only refills tokens on a
periodic timer; update the logic so that when max_rate increases (including 0 ->
N) you immediately reseed the token bucket by setting ctx->tokens = max_rate and
ctx->last_refill = now before packet-drop decisions. Concretely, detect an
upward change by comparing the new max_rate to the previously configured value
stored in the ctx (e.g., ctx->max_rate or a saved prev_rate), perform the
immediate reseed, then update ctx->max_rate to the new value; keep the periodic
refill logic for normal operation and preserve the existing behavior when
max_rate is decreased or set to 0.


📥 Commits

Reviewing files that changed from the base of the PR and between 0afe4da and 5830640.

📒 Files selected for processing (10)
  • modules/infra/api/gr_infra.h
  • modules/infra/cli/graph.c
  • modules/infra/control/graph.c
  • modules/infra/control/graph.h
  • modules/infra/datapath/datapath.h
  • modules/ip/datapath/arp_input.c
  • modules/ip/datapath/icmp_input.c
  • modules/ip/datapath/ip_error.c
  • modules/ip6/datapath/icmp6_input.c
  • modules/ip6/datapath/ip6_error.c
🚧 Files skipped from review as they are similar to previous changes (3)
  • modules/ip/datapath/icmp_input.c
  • modules/infra/control/graph.c
  • modules/infra/api/gr_infra.h


rte_edge_t gr_node_attach_parent(const char *parent, const char *node);

extern struct gr_graph_conf graph_conf;

⚠️ Potential issue | 🔴 Critical

graph_conf is now a live cross-thread data race.

modules/infra/control/graph.c updates this struct in place, while the new datapath fast paths read graph_conf.*_rate on every vector. Exporting it as a plain global here makes those control-plane writes race with worker-lcore reads. Please publish the rate fields through atomics or swap a read-only snapshot instead of sharing a mutable struct.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/infra/control/graph.h` at line 40, The global mutable declaration
extern struct gr_graph_conf graph_conf exposes graph_conf (type gr_graph_conf)
to concurrent reads of graph_conf.*_rate from datapath workers while
modules/infra/control/graph.c mutates it, causing data races; fix by removing
the shared mutable global and either (a) change the *_rate fields to atomic
types and update them via atomic_store/atomic_load, or (b) stop in-place updates
and publish an immutable snapshot (allocate a new gr_graph_conf, populate it in
graph.c, then publish via an atomic pointer swap and provide an accessor like
gr_get_graph_conf_snapshot()) so datapath readers use the snapshot or atomic
loads instead of reading a mutable graph_conf directly.
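Option (a) could look roughly like the following with C11 atomics; the names are illustrative and this is not the code in the PR:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Rate fields stored as atomics: control-plane writes and per-vector
 * datapath reads are then race-free without locks. Relaxed ordering is
 * enough because each rate is an independent scalar. */
struct rates {
	_Atomic uint16_t icmp_error_rate;
	_Atomic uint16_t arp_rate;
	_Atomic uint16_t icmp_rate;
};

static struct rates graph_rates;

/* Control plane: publish a new value. */
static void set_arp_rate(uint16_t rate)
{
	atomic_store_explicit(&graph_rates.arp_rate, rate, memory_order_relaxed);
}

/* Datapath worker: read the current value once per vector. */
static uint16_t get_arp_rate(void)
{
	return atomic_load_explicit(&graph_rates.arp_rate, memory_order_relaxed);
}
```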

Comment on lines +22 to +31

	if (max_rate == 0)
		return false;

	if (now - ctx->last_refill >= rte_get_tsc_hz()) {
		ctx->tokens = max_rate;
		ctx->last_refill = now;
	} else if (ctx->tokens == 0) {
		return true;
	}
	ctx->tokens -= RTE_MIN(ctx->tokens, nb_pkts);

⚠️ Potential issue | 🟠 Major

Upward rate changes are not applied immediately.

Line 22 leaves the bucket state untouched when the limiter is disabled, and Line 25 is the only place that replenishes tokens. If a node has already drained its bucket and the operator later increases max_rate — especially 0 -> N — Line 28 will keep dropping whole vectors until the stale 1-second refill window expires. That contradicts the stated “take effect immediately” behavior and can blackhole traffic right after reconfiguration. Please reseed the bucket on upward changes instead of waiting for the next periodic refill.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/infra/datapath/datapath.h` around lines 22 - 31, The limiter
currently ignores upward changes to max_rate because the code returns early when
max_rate==0 and only refills tokens on a periodic timer; update the logic so
that when max_rate increases (including 0 -> N) you immediately reseed the token
bucket by setting ctx->tokens = max_rate and ctx->last_refill = now before
packet-drop decisions. Concretely, detect an upward change by comparing the new
max_rate to the previously configured value stored in the ctx (e.g.,
ctx->max_rate or a saved prev_rate), perform the immediate reseed, then update
ctx->max_rate to the new value; keep the periodic refill logic for normal
operation and preserve the existing behavior when max_rate is decreased or set
to 0.
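The reseed-on-increase idea can be sketched like this (mocked clock, hypothetical names; not the PR's code):

```c
#include <stdbool.h>
#include <stdint.h>

static uint64_t now_tsc; /* stand-in for rte_rdtsc() */
static uint64_t tsc_hz(void) { return 1000000000ULL; }

/* The context remembers the last configured rate so an increase
 * (including 0 -> N) reseeds the bucket immediately instead of waiting
 * out a stale one-second refill window. */
struct rl_ctx2 {
	uint16_t tokens;
	uint16_t prev_rate;
	uint64_t last_refill;
};

static bool
rate_limited2(struct rl_ctx2 *ctx, uint16_t max_rate, uint16_t nb_pkts)
{
	uint64_t now = now_tsc;

	if (max_rate > ctx->prev_rate) {
		ctx->tokens = max_rate; /* reseed on upward change */
		ctx->last_refill = now;
	}
	ctx->prev_rate = max_rate;

	if (max_rate == 0)
		return false;
	if (now - ctx->last_refill >= tsc_hz()) {
		ctx->tokens = max_rate;
		ctx->last_refill = now;
	} else if (ctx->tokens == 0) {
		return true;
	}
	ctx->tokens -= ctx->tokens < nb_pkts ? ctx->tokens : nb_pkts;
	return false;
}
```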

@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/infra/api/gr_infra.h`:
- Around line 388-394: The enum gr_graph_conf_set_attr_t is declared as uint16_t
but only GR_GRAPH_SET_RX_BURST uses GR_BIT16 while the other flags use GR_BIT64;
change GR_BIT64 to GR_BIT16 for GR_GRAPH_SET_VECTOR, GR_GRAPH_SET_ICMP_ERROR,
GR_GRAPH_SET_ARP and GR_GRAPH_SET_ICMP so all bit macros match the uint16_t base
type and preserve unique bit positions within the 16-bit range.
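The fix amounts to keeping every flag within the 16-bit base type, e.g. (illustrative macro and typedef modeled on the names cited above, not the repository's definitions):

```c
#include <stdint.h>

/* GR_BIT16-style helper: all flags must fit the uint16_t base type. */
#define GR_BIT16(n) ((uint16_t)(1u << (n)))

typedef uint16_t gr_graph_conf_set_attr_t;

#define GR_GRAPH_SET_RX_BURST   GR_BIT16(0)
#define GR_GRAPH_SET_VECTOR     GR_BIT16(1)
#define GR_GRAPH_SET_ICMP_ERROR GR_BIT16(2)
#define GR_GRAPH_SET_ARP        GR_BIT16(3)
#define GR_GRAPH_SET_ICMP       GR_BIT16(4)
```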


📥 Commits

Reviewing files that changed from the base of the PR and between 5830640 and 7ffb7a3.

📒 Files selected for processing (10)
  • modules/infra/api/gr_infra.h
  • modules/infra/cli/graph.c
  • modules/infra/control/graph.c
  • modules/infra/control/graph.h
  • modules/infra/datapath/datapath.h
  • modules/ip/datapath/arp_input.c
  • modules/ip/datapath/icmp_input.c
  • modules/ip/datapath/ip_error.c
  • modules/ip6/datapath/icmp6_input.c
  • modules/ip6/datapath/ip6_error.c
🚧 Files skipped from review as they are similar to previous changes (3)
  • modules/infra/control/graph.h
  • modules/infra/datapath/datapath.h
  • modules/infra/control/graph.c

rjarry added 2 commits April 7, 2026 15:54
The graph config API uses zero to mean "don't change this field" for
rx_burst_max and vector_max. Future fields may need zero as a valid
value, so the convention does not scale.

Add a set_attrs bitmask to the config set request so that the handler
knows which fields the client explicitly provided. Move validation
(zero and overflow checks) inside the per-field blocks so they only
apply to fields being changed.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Under a flood of bad packets (TTL=1, no-route, etc.), every packet
triggers expensive ICMP error construction: address lookups, header
prepend with potential memmove, and checksum computation. Similarly, ARP
and ICMP/ICMPv6 input processing can saturate the control plane thread
when flooded. None of this is rate limited today.

Add a per-node token bucket to all ICMP error nodes, ARP input, and
ICMP/ICMPv6 input nodes. The bucket refills once per second and the
check runs before any per-packet work. When the bucket is empty, the
whole vector is moved to a dedicated drop node via
rte_node_next_stream_move(), avoiding any unnecessary processing.

Three separate rates are configurable via the graph config API:
icmp-error-rate, arp-rate and icmp-rate. All default to 1000 packets/sec
per node per worker. Setting a rate to 0 disables the limit. Rate
changes take effect immediately without graph reload.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
