Add operation normalizer v2#97

Merged
swalkinshaw merged 12 commits into main from add-operation-normalizer-v2 on Mar 27, 2026
Conversation

@swalkinshaw swalkinshaw (Contributor) commented Mar 25, 2026

Successor to #92; this implements a more "aggressive" static normalization algorithm for producing stable observability signatures.

The normalizer now produces canonical structural representations by:

  • expanding named fragments into inline fragments (merging spread + definition directives)
  • removing all aliases
  • dropping all argument values
  • flattening bare inline fragments (no type condition, no directives)
  • deep merging inline fragments with matching type condition and directive set
  • dropping variable definitions and operation names
  • sorting fields first (alphabetically), then inline fragments (by type condition, then directives)

note: the normalized operation document is not required to be valid GraphQL
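As a sketch of the ordering rule only (a toy selection type with illustrative names; the crate's real IR types and comparator differ): fields sort before inline fragments, and each group sorts alphabetically.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq, Eq)]
enum Sel {
    Field(&'static str),          // field name
    InlineFragment(&'static str), // type condition
}

// Fields first (alphabetical), then inline fragments (by type condition).
fn sort_selections(sels: &mut [Sel]) {
    sels.sort_by(|a, b| match (a, b) {
        (Sel::Field(_), Sel::InlineFragment(_)) => Ordering::Less,
        (Sel::InlineFragment(_), Sel::Field(_)) => Ordering::Greater,
        (Sel::Field(x), Sel::Field(y)) => x.cmp(y),
        (Sel::InlineFragment(x), Sel::InlineFragment(y)) => x.cmp(y),
    });
}

fn main() {
    let mut sels = vec![
        Sel::InlineFragment("B"),
        Sel::Field("z"),
        Sel::InlineFragment("A"),
        Sel::Field("a"),
    ];
    sort_selections(&mut sels);
    println!("{:?}", sels);
}
```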

The architecture changes from single-pass visit-and-write to a 3-phase pipeline: build IR → normalize → serialize. The IR holds borrowed references into the original AST with owned structural nesting.
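A toy illustration of the 3-phase shape, under heavy simplifying assumptions (single flat selection level, invented helper names — this is not the crate's API): the IR borrows `&str` slices from the original document while owning the `Vec` nesting.

```rust
// Hypothetical miniature of build IR -> normalize -> serialize.
struct Node<'a> {
    name: &'a str,           // borrowed from the original document text
    children: Vec<Node<'a>>, // owned structural nesting
}

// Phase 1: build a borrowed IR (toy parser: one flat level of fields).
fn build(src: &str) -> Node<'_> {
    let inner = src.trim_start_matches("query{").trim_end_matches('}');
    Node {
        name: "query",
        children: inner
            .split_whitespace()
            .map(|f| Node { name: f, children: Vec::new() })
            .collect(),
    }
}

// Phase 2: normalize in place (here: just the alphabetical field sort).
fn normalize(node: &mut Node<'_>) {
    node.children.sort_unstable_by(|a, b| a.name.cmp(b.name));
}

// Phase 3: serialize the canonical form.
fn serialize(node: &Node<'_>) -> String {
    let fields: Vec<&str> = node.children.iter().map(|c| c.name).collect();
    format!("{}{{{}}}", node.name, fields.join(" "))
}

fn main() {
    let doc = String::from("query{c a b}");
    let mut ir = build(&doc);
    normalize(&mut ir);
    println!("{}", serialize(&ir));
}
```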

Since the normalization design is a little complex, I've purposely included a lot of comments and inline examples within the code.

Performance: I've already done an /autoresearch pass on this and included the performance optimizations here. See 6843650 for details.

Running benches/normalize.rs

normalize_small                time:   [97.066 ns 97.297 ns 97.599 ns]
signature_small                time:   [212.54 ns 213.34 ns 214.14 ns]
normalize_medium               time:   [443.75 ns 445.53 ns 447.49 ns]
signature_medium               time:   [676.01 ns 677.86 ns 679.72 ns]
normalize_complex              time:   [1.9462 µs 1.9524 µs 1.9595 µs]
signature_complex              time:   [2.5373 µs 2.5479 µs 2.5625 µs]
normalize_fragment_colocation  time:   [2.9908 µs 3.0127 µs 3.0365 µs]
signature_fragment_colocation  time:   [4.0905 µs 4.1078 µs 4.1279 µs]
normalize_wide_reverse         time:   [1.0499 µs 1.0536 µs 1.0577 µs]
signature_wide_reverse         time:   [1.4475 µs 1.4502 µs 1.4542 µs]

@ravangen ravangen (Contributor) left a comment


My Rust skills are nonexistent, but leaving some comments around GraphQL aspects

swalkinshaw and others added 6 commits March 25, 2026 17:29
Replace Apollo-style document signatures with a more aggressive
normalization algorithm for observability signatures. The normalizer
now produces canonical structural representations by:

- Expanding named fragments into inline fragments (merging spread +
  definition directives)
- Removing all aliases
- Replacing all argument values with a single placeholder ($_)
- Flattening bare inline fragments (no type condition, no directives)
- Deep merging inline fragments with matching type condition and
  directive set
- Dropping variable definitions and operation names
- Sorting fields first (alphabetically), then inline fragments
  (by type condition, then directives)

Architecture changes from single-pass visit-and-write to a 3-phase
pipeline: build IR → normalize → serialize. The IR holds borrowed
references into the original AST with owned structural nesting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Key changes:
- Add bumpalo arena allocator for all IR allocations
- Replace HashMap/HashSet with Vec for fragment lookup and cycle detection
- Single-pass build+normalize (merge flatten/sort into build, remove normalize_ir.rs)
- Use swap_remove instead of remove in inline fragment merge (O(1) vs O(n))
- Skip re-sort of merged IF children when no merge occurred
- Replace write! macro with direct push_str in serializer
Add crate-level doc comment to lib.rs describing the full normalization
algorithm in numbered steps (1-4). Each module and key function references
back to the specific algorithm step it implements (e.g. 'step 2a', 'step 2d-2e')
so the high-level algorithm can be traced through the code.
Each algorithm step now shows a concrete input → normalized example
so the transformations are immediately obvious without reading the code.
@swalkinshaw swalkinshaw force-pushed the add-operation-normalizer-v2 branch from ea89296 to 9f6c997 on March 25, 2026 22:42
@swalkinshaw swalkinshaw (Contributor, Author) commented:

@ravangen updated those tests and did another pass on them to add ~7 more edge cases

@swalkinshaw swalkinshaw force-pushed the add-operation-normalizer-v2 branch from 9f6c997 to a6f790b on March 25, 2026 22:47
@ravangen ravangen (Contributor) left a comment


Couple more thoughts. I'm excited for this!

#[test]
fn duplicate_fields_preserved() {
    let doc = parse("{ a a a }");
    assert_eq!(normalize(&doc, None).unwrap(), "query{a a a}");
}
Contributor:

This one is interesting in that we could decide to further merge them into "query{a}" due to same name/arguments/directives, but I imagine it isn't worth the compute time.

swalkinshaw (Contributor, Author):

Yeah, it would get much more complex due to child selection merging, so I think this is fine and stable enough.

@Ravinlathigra:

Some AI feedback Mr Claude shared:

Correctness Issues

  1. swap_remove in merge breaks determinism (bug)

In build.rs lines ~781-789, normalize_in_place uses swap_remove to merge
inline fragments. swap_remove moves the last element into the removed
position, which changes the relative order of unprocessed elements. While the
final sort_unstable_by at the end of the function rescues the top-level order,
the merged children from swap_removed fragments are appended in a
non-deterministic order (depending on which fragments were at the end of the
vec). After merging, the children are re-sorted, so this is actually safe for
the final output. However, swap_remove could cause the loop to skip an
element: after swapping index j with the last element, the new element at j is
never checked for merging with i. This is a correctness bug — it could miss a
merge if there are 3+ fragments with the same (type_condition, directives).

  2. Transitive fragments don't merge — this may be intentional but worth considering

The test transitive_fragments_expanded shows:
query{...on T{a ...on T{b ...on T{c}}}}
Fragment A on T contains ...B (also on T), which contains ...C (also on T).
The result nests them rather than merging into a single ...on T{a b c}. This
happens because merging only occurs at a single level. Since all three share
the same type condition, a fully-normalized form could merge them. Whether
this matters depends on your use case — if two queries use different fragment
structures but resolve to the same fields on the same type, they'd get
different signatures. If the goal is purely "same structure = same signature",
this is a gap. If the goal is "same query text modulo cosmetics = same
signature", this is fine.

  3. Duplicate fields are preserved (intentional but worth noting)

{ a a a } normalizes to query{a a a}. This means a query selecting a once vs.
three times gets different signatures. For most GraphQL servers this
distinction is meaningless (field resolution is idempotent), but for your
"same footprint" goal this seems correct — the request structure IS different.

@swalkinshaw any risks you see here?

@swalkinshaw swalkinshaw (Contributor, Author) replied:

@Ravinlathigra

  1. Valid concern, but not a bug in practice. Added a test case (three_same_type_fragments_with_interleaved_other) to confirm this explicitly.
  2. We could theoretically flatten in that case, but it would require cross-level merging, which gets very complex. It's hard to draw a hard boundary for which normalizations we will and won't do; in some cases it's just too complex and not worth it, imo. Basically: normalize away the most common sources of differences with the least implementation complexity.
  3. Replied to @ravangen above with the same question; the answer above applies here too ^
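The swap_remove concern above comes down to loop-index bookkeeping. A minimal sketch of the pattern on toy data (invented names, not the crate's actual build.rs code): after `swap_remove(j)`, the last element is moved into slot j, so j only advances when nothing was removed — otherwise the swapped-in element would be skipped and a merge among 3+ matching entries could be missed.

```rust
// Merge entries that share a key, using O(1) swap_remove safely.
fn merge_by_key(items: &mut Vec<(u32, Vec<u32>)>) {
    let mut i = 0;
    while i < items.len() {
        let mut j = i + 1;
        while j < items.len() {
            if items[j].0 == items[i].0 {
                // O(1) removal: the last element is swapped into slot j,
                // so do NOT advance j here — re-examine slot j next pass.
                let (_, mut vals) = items.swap_remove(j);
                items[i].1.append(&mut vals);
            } else {
                j += 1;
            }
        }
        i += 1;
    }
}

fn main() {
    // Three entries with key 1, interleaved with key 2, all merge into one.
    let mut items = vec![(1u32, vec![1u32]), (2, vec![2]), (1, vec![10]), (1, vec![100])];
    merge_by_key(&mut items);
    println!("{:?}", items);
}
```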

swalkinshaw and others added 2 commits March 26, 2026 10:20
- Add arguments to more DashboardQuery benchmark fields for realism
- Add tests for 3+ inline fragment merging with swap_remove
- Add test for repeated directives on fields
- Extend variable renaming test with different types/defaults
- Replace assert! with assert_eq! throughout
- Inline let bindings into assert_eq! for directness

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace "semantically equivalent" framing with explicit description of
what produces the same signature and what intentionally does not (field
dedup, cross-level fragment merging).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@swalkinshaw swalkinshaw force-pushed the add-operation-normalizer-v2 branch from 66319e1 to c535fc7 on March 26, 2026 15:20
Argument values are now omitted entirely, keeping only `name:` per
argument. Aligns with the GraphQL ArgumentCoordinate spec format,
saves bytes, and removes a fake variable reference that added no
information.

Before: query{field(a:$_,b:$_)@dir(x:$_)}
After:  query{field(a:,b:)@dir(x:)}

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@swalkinshaw swalkinshaw force-pushed the add-operation-normalizer-v2 branch from c8f53cc to 7f39e23 on March 26, 2026 16:52
@ravangen ravangen (Contributor) left a comment


The GraphQL aspects look good. Would appreciate a review from a Rust developer.

@Ravinlathigra Ravinlathigra left a comment


Thanks for tackling this Scott!

@adampetro adampetro (Contributor) left a comment


Very cool!

Did you want to add a line in the README with this new crate?

pub(crate) fn build_selections<'a, 'bump, E: ExecutableDocument + 'a>(
    selection_set: &'a E::SelectionSet,
    fragment_defs: &[(&'a str, &'a E::FragmentDefinition)],
    expanding: &mut Vec<&'a str>,
Contributor:


I'm guessing you benchmarked and vec is faster than HashSet given the list is usually small?

Contributor:


And similarly is that why this doesn't go through the bump allocator?

swalkinshaw (Contributor, Author):


Yep, exactly: the stack is tiny here, plus Vec matches the push/pop semantics better. That's also why it doesn't use the bump allocator: the stack is re-used and grows/shrinks, and more technically, the bump allocator can't free individual items.
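For context, a minimal sketch of the Vec-as-stack approach described here (hypothetical names and toy dependency data, not the crate's actual code): a linear `contains`-style check on the small expansion stack beats hashing, and push/pop mirrors the recursion depth exactly.

```rust
// Expand fragment names depth-first, skipping cycles via a Vec stack.
fn expand(
    name: &str,
    deps: &[(&str, Vec<&str>)], // fragment -> fragments it spreads
    expanding: &mut Vec<String>, // stack of fragments currently being expanded
    out: &mut Vec<String>,       // expansion order
) {
    if expanding.iter().any(|n| n == name) {
        return; // cycle detected: this fragment is already on the stack
    }
    expanding.push(name.to_string());
    out.push(name.to_string());
    if let Some((_, children)) = deps.iter().find(|(n, _)| *n == name) {
        for child in children {
            expand(child, deps, expanding, out);
        }
    }
    expanding.pop(); // done with this fragment; pop mirrors recursion exit
}

fn main() {
    // A spreads B, B spreads A: the cycle is broken instead of recursing forever.
    let deps = vec![("A", vec!["B"]), ("B", vec!["A"])];
    let mut expanding = Vec::new();
    let mut out = Vec::new();
    expand("A", &deps, &mut expanding, &mut out);
    println!("{:?}", out);
}
```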

// Step 2b: expand fragment spreads into inline fragments
SelectionReference::FragmentSpread(spread) => {
    let name = spread.name();
    // Cycle detection: skip if this fragment is already being expanded
Contributor:


is this needed for spec-compliant queries? or do we just have this for safety in case the query wasn't validated in advance?

swalkinshaw (Contributor, Author):


Yeah, this would be caught by static validation rather than parsing, so keeping it does make the normalizer a little more flexible. I'm not 100% sure how this will be implemented in our application yet. It might be bundled into the full "analyze" workflow, which would likely run static validation first, but it might be beneficial to compute and log the signature as early as possible (even for invalid documents), so I'm inclined to keep it.

Contributor:


Ideally we can derive this pre-static-validation and gracefully fail, or even get a hash without the document being fully compliant.

expanding.push(name);

let mut directives = build_directives::<false, E>(spread.directives(), bump);
directives.extend(build_directives::<false, E>(frag_def.directives(), bump));
Contributor:


is this because we have no way of knowing if the directives on the fragment definition have an impact on what gets spread so we're being conservative and including them?

swalkinshaw (Contributor, Author):


Being conservative here, yeah, which helps distinguish between ...F @skip and ...F, for example.

Comment on lines +141 to +143
if let NormalizedSelection::InlineFragment(inf) = removed {
if let NormalizedSelection::InlineFragment(ref mut target) =
selections[i]
Contributor:


just to confirm my understanding, we expect this to always be the case given how should_merge was evaluated?

swalkinshaw (Contributor, Author):


Good point, in fact we might as well encode this to make it unreachable!?

swalkinshaw (Contributor, Author):


Added. I think it simplifies things and makes the invariant clearer.
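For illustration, the invariant-encoding being discussed might look like this let-else shape (type and function names are hypothetical stand-ins, not the crate's actual definitions): since `should_merge` only pairs inline fragments, any other variant is genuinely unreachable.

```rust
#[derive(Debug, PartialEq)]
enum NormalizedSelection {
    Field(String),
    InlineFragment(Vec<String>), // children, simplified to plain strings
}

// Merge `removed`'s children into `target`; both are known to be
// inline fragments because should_merge already matched them.
fn merge_inline_fragments(target: &mut NormalizedSelection, removed: NormalizedSelection) {
    let NormalizedSelection::InlineFragment(children) = removed else {
        unreachable!("should_merge only pairs inline fragments");
    };
    let NormalizedSelection::InlineFragment(target_children) = target else {
        unreachable!("should_merge only pairs inline fragments");
    };
    target_children.extend(children);
}

fn main() {
    let mut target = NormalizedSelection::InlineFragment(vec!["a".to_string()]);
    let removed = NormalizedSelection::InlineFragment(vec!["b".to_string()]);
    merge_inline_fragments(&mut target, removed);
    println!("{:?}", target);
}
```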

Comment on lines +116 to +128
impl std::fmt::Display for SignatureError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::OperationNotFound(name) => write!(f, "operation not found: {name}"),
            Self::AmbiguousOperation => {
                write!(f, "multiple operations found; specify operation name")
            }
            Self::NoOperations => write!(f, "no operations in document"),
        }
    }
}

impl std::error::Error for SignatureError {}
Contributor:


Maybe using the thiserror crate would simplify this via its derive macro?

swalkinshaw (Contributor, Author):


I would if we already had the dependency 🤷‍♂️ It looks cool, but there are only 3 variants right now.

@swalkinshaw swalkinshaw merged commit 6e950c8 into main Mar 27, 2026
4 checks passed
@swalkinshaw swalkinshaw deleted the add-operation-normalizer-v2 branch March 27, 2026 17:21

4 participants