@HighW4y2H3ll
Member

#66164 changed the hashing in SampleContextFrame from std::hash to MD5 in a very hot function (ContextTrieNode::getOrCreateChildContext()) in llvm-profgen. This causes a run-time regression of more than 2x when running llvm-profgen with the CSSPGO pre-inliner enabled, since the MD5 computation is about three times as expensive as the Murmur hash in the standard library. An llvm-profgen run-time comparison follows:

$ time llvm-profgen -binary $BINARY --perfscript $SAMPLES --populate-profile-symbol-list --show-density --output=XXX

# MD5 hash
real    105m31.644s
user    104m51.334s
sys     0m35.033s

# std::hash
real    46m0.340s
user    45m17.998s
sys     0m38.420s

I can confirm that this patch fixes the run-time regression in llvm-profgen, and perf testing on our internal services shows neutral results.
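As a quick sanity check on the "over 2x" figure, the two wall-clock ("real") readings above can be converted to seconds and divided. The helper names below are made up for illustration and are not part of llvm-profgen; this is only arithmetic on the numbers quoted in this comment.

```cpp
// Convert a `time`-style "XmY.ZZZs" reading into seconds.
// (Hypothetical helper, for illustration only.)
constexpr double toSeconds(double Minutes, double Seconds) {
  return Minutes * 60.0 + Seconds;
}

// Wall-clock ("real") readings copied from the comparison above.
constexpr double MD5Seconds = toSeconds(105, 31.644);    // 105m31.644s
constexpr double StdHashSeconds = toSeconds(46, 0.340);  // 46m0.340s

// Ratio of the MD5 run time to the std::hash run time.
constexpr double slowdown() { return MD5Seconds / StdHashSeconds; }
```

With these readings the ratio comes out at roughly 2.29, which matches the regression described above.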

@llvmbot llvmbot added the PGO Profile Guided Optimizations label Feb 9, 2026
@llvmbot
Member

llvmbot commented Feb 9, 2026

@llvm/pr-subscribers-pgo

Author: None (HighW4y2H3ll)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/180581.diff

1 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/SampleProf.h (+3-1)
diff --git a/llvm/include/llvm/ProfileData/SampleProf.h b/llvm/include/llvm/ProfileData/SampleProf.h
index b75dffaff19f7..8766ab23ac1da 100644
--- a/llvm/include/llvm/ProfileData/SampleProf.h
+++ b/llvm/include/llvm/ProfileData/SampleProf.h
@@ -522,7 +522,9 @@ struct SampleContextFrame {
   }
 
   uint64_t getHashCode() const {
-    uint64_t NameHash = Func.getHashCode();
+    // Context frame hash is heavily used in llvm-profgen context-sensitive
+    // pre-inliner. Use a lightweight hashing here to avoid speed regression.
+    uint64_t NameHash = std::hash<std::string>{}(Func.str());
     uint64_t LocId = Location.getHashCode();
     return NameHash + (LocId << 5) + LocId;
   }
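To make the hash combination in the diff easier to follow outside of LLVM, here is a minimal standalone sketch. `Location` and `Frame` are hypothetical stand-ins for the real `LineLocation` and `SampleContextFrame`, and the way `Location` packs its fields into 64 bits is an assumption for illustration; only the `NameHash + (LocId << 5) + LocId` combination and the use of `std::hash<std::string>` are taken from the patch.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for a source location. The field names and the
// packing below are assumptions, not the actual LLVM definition.
struct Location {
  uint32_t LineOffset;
  uint32_t Discriminator;

  uint64_t getHashCode() const {
    return (static_cast<uint64_t>(Discriminator) << 32) | LineOffset;
  }
};

// Hypothetical stand-in for a context frame: a function name plus a location.
struct Frame {
  std::string Func;
  Location Loc;

  uint64_t getHashCode() const {
    // Lightweight name hash, as in the patch.
    uint64_t NameHash = std::hash<std::string>{}(Func);
    uint64_t LocId = Loc.getHashCode();
    // Same combination as in the diff; equivalent to NameHash + LocId * 33
    // with unsigned wrap-around.
    return NameHash + (LocId << 5) + LocId;
  }
};
```

One thing to keep in mind with this approach: the C++ standard only requires `std::hash` to be consistent within a single execution of a program, so a value computed this way is suitable as an in-memory lookup key but should not be assumed stable across runs or standard library implementations.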

@HighW4y2H3ll HighW4y2H3ll changed the title from "[SPGO] Use std::hash instead of MD5 to avoid run time regression" to "[SPGO] Use std::hash instead of MD5 to avoid run time regression in llvm-profgen" Feb 9, 2026
@github-actions

github-actions bot commented Feb 9, 2026

🐧 Linux x64 Test Results

  • 189867 tests passed
  • 5060 tests skipped

✅ The build succeeded and all tests passed.

-    uint64_t NameHash = Func.getHashCode();
+    // Context frame hash is heavily used in llvm-profgen context-sensitive
+    // pre-inliner. Use a lightweight hashing here to avoid speed regression.
+    uint64_t NameHash = std::hash<std::string>{}(Func.str());
Contributor

I think we only need to recompute hash when the FunctionId is a string.

Member Author

ah, good catch.. updated! thx
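The suggestion in this thread, hashing the name only when the identifier actually holds a string, could look roughly like the sketch below. `FuncId` is a hypothetical stand-in built on `std::variant`; it is not the real LLVM `FunctionId` class, and the method names are made up for illustration. The updated patch itself is not shown in this conversation, so this is not a reproduction of the merged code.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <variant>

// Hypothetical identifier that holds either a function name or an
// already-computed 64-bit hash (an assumption for illustration).
class FuncId {
  std::variant<std::string, uint64_t> Value;

public:
  explicit FuncId(std::string Name) : Value(std::move(Name)) {}
  explicit FuncId(uint64_t Hash) : Value(Hash) {}

  bool isString() const { return std::holds_alternative<std::string>(Value); }

  // Hash the name only when one is stored; otherwise reuse the stored
  // hash value directly and skip the string hashing entirely.
  uint64_t getFastHash() const {
    if (isString())
      return std::hash<std::string>{}(std::get<std::string>(Value));
    return std::get<uint64_t>(Value);
  }
};
```

The point of the branch is that an identifier that already carries a hash value needs no string conversion or rehashing on the hot path.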

Contributor

@apolloww apolloww left a comment

LGTM

@HighW4y2H3ll HighW4y2H3ll merged commit 37c3241 into llvm:main Feb 9, 2026
10 checks passed