Handle BadImageFormatException in OpenAssociatedSymbolFile#124188
Handle BadImageFormatException in OpenAssociatedSymbolFile#124188steveisok wants to merge 1 commit intodotnet:mainfrom
Conversation
Catch BadImageFormatException from PEReader.ReadCodeViewDebugDirectoryData and skip malformed CodeView entries instead of crashing. This is consistent with how other PDB loading failures are handled in the same method. Fixes https://github.com/dotnet/dotnet/issues/4733
|
Swallowing the exception seems to be in line with the spirit of best efforting the pdb reading. Let me know if you want something else. |
There was a problem hiding this comment.
Pull request overview
This PR improves robustness of symbol (PDB) association in CoreCLR tooling by handling malformed CodeView debug directory entries that can throw BadImageFormatException, preventing dotnet-pgo / compiler tooling from crashing while scanning PE debug metadata.
Changes:
- Wrap
PEReader.ReadCodeViewDebugDirectoryDatain atry/catch (BadImageFormatException)withinOpenAssociatedSymbolFile. - Skip malformed CodeView entries and continue searching for usable debug entries, aligning with existing “best-effort” PDB loading behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/coreclr/tools/dotnet-pgo/TraceTypeSystemContext.cs | Avoids crashing when reading malformed CodeView entries during PDB discovery for dotnet-pgo. |
| src/coreclr/tools/Common/Compiler/CompilerTypeSystemContext.cs | Applies the same defensive handling to the shared compiler type system context’s PDB discovery logic. |
| { | ||
| debugDirectoryData = peReader.ReadCodeViewDebugDirectoryData(debugEntry); | ||
| } | ||
| catch (BadImageFormatException) |
There was a problem hiding this comment.
BadImageFormatException here refers to System.BadImageFormatException, but this file also deals with Internal.TypeSystem.TypeSystemException.BadImageFormatException (see TODO earlier). Qualifying the caught exception as System.BadImageFormatException would avoid ambiguity for future readers/maintainers.
| catch (BadImageFormatException) | |
| catch (System.BadImageFormatException) |
| continue; | ||
| } | ||
|
|
||
| string candidatePath = debugDirectoryData.Path; |
There was a problem hiding this comment.
There are two spaces before '=' in string candidatePath = ...; please remove the extra whitespace to match typical formatting in this file.
| string candidatePath = debugDirectoryData.Path; | |
| string candidatePath = debugDirectoryData.Path; |
| { | ||
| debugDirectoryData = peReader.ReadCodeViewDebugDirectoryData(debugEntry); | ||
| } | ||
| catch (BadImageFormatException) |
There was a problem hiding this comment.
To make it explicit that this is handling malformed PE debug data from PEReader.ReadCodeViewDebugDirectoryData, consider qualifying the caught exception as System.BadImageFormatException (this codebase also has Internal.TypeSystem.TypeSystemException.BadImageFormatException, which can be easy to confuse with).
| catch (BadImageFormatException) | |
| catch (System.BadImageFormatException) |
stephentoub
left a comment
There was a problem hiding this comment.
🤖 Copilot Code Review — PR #124188
Holistic Assessment
Motivation: The PR addresses a real operational issue (#124181) where Crossgen2 crashes on PE files with malformed CodeView debug directory data. The crash blocks SDK builds, making this a legitimate reliability fix.
Approach: Catching BadImageFormatException and continuing to the next debug entry is the correct approach. This aligns with the established "best-effort PDB" philosophy in this codebase—SafeReadDebugDirectory() (added in #73824) already handles malformed debug directory sizes, and PortablePdbSymbolReader.TryOpen returns null on failure rather than throwing.
Net positive: Yes. Small, targeted fix that prevents tool crashes without changing behavior for valid inputs.
Detailed Findings
✅ Correctness — Exception handling is appropriate
The fix correctly catches BadImageFormatException (thrown for "Unexpected CodeView data signature value") and skips the malformed entry instead of crashing. This is consistent with how the rest of the method treats missing or unreadable PDB files—it continues searching or returns null.
⚠️ Completeness — Another caller in dotnet-pgo/Program.cs may need the same fix
There is an additional call to ReadCodeViewDebugDirectoryData in src/coreclr/tools/dotnet-pgo/Program.cs around line 1306:
var codeViewData = ecmaModule.PEReader.ReadCodeViewDebugDirectoryData(debugEntry);This is in the module match/mismatch detection loop. If a malformed CodeView entry is encountered here, dotnet-pgo will still crash. Consider whether this call should also be wrapped with the same exception handling—if the CodeView data can't be read, the code could simply continue and skip that entry (treating it as "unable to validate match"), which aligns with the fallback message already printed when no match is found.
However, I note this is in a different context (module validation rather than symbol file discovery), so the author/maintainers may reasonably decide this is out of scope for this PR.
✅ Conventions — Code follows repository patterns
The exception handling style matches existing repo usage. The code is syntactically correct and follows the established try-catch-continue pattern inside loops.
💡 Suggestion — Consider a helper method (follow-up)
If ReadCodeViewDebugDirectoryData needs protection in multiple places, a future follow-up could add a SafeReadCodeViewDebugDirectoryData extension method (similar to SafeReadDebugDirectory) to centralize this pattern. Not a blocker for this PR.
Summary
LGTM with optional follow-up. The changes to the two OpenAssociatedSymbolFile implementations are correct and solve the reported crash. The unguarded call in dotnet-pgo/Program.cs is a potential secondary crash site for the same malformed CodeView scenario—consider whether it should be addressed in this PR or tracked as a follow-up.
There was a problem hiding this comment.
Why do we want to mask this error? It does not look right to me. Our general strategy in AOT compilers is to fail the compilation when the input is malformed. It is expected that the malformed input can and will cause the compiler to crash or fail
The error in the issue is either a race condition in the build where the build is overwriting the compiler inputs, or it is a memory corruption (e.g. GC hole) leading to intermittent crash.
We need to get a crash dump of the failure to tell. It is a long outstanding problem that the builds are not producing crash dumps that made investigation of many issues difficult/impossible.
|
@jkotas being able to get crashdumps when crossgen is in use is something we should put in place regardless. I'll see what I can do. Is there value in having a more graceful error here or close? |
Catch BadImageFormatException from PEReader.ReadCodeViewDebugDirectoryData and skip malformed CodeView entries instead of crashing. This is consistent with how other PDB loading failures are handled in the same method.
#124181