fix: Handle errors from event stream callbacks#1302
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1302 +/- ##
==========================================
+ Coverage 95.17% 95.19% +0.01%
==========================================
Files 43 43
Lines 3111 3119 +8
==========================================
+ Hits 2961 2969 +8
Misses 150 150 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
6eda306 to
73206be
Compare
|
The main symptom of this was that if a scan failed due to eg a stomp connection error, even if it reconnected, any subsequent scans would fail as the |
73206be to
67c37d1
Compare
|
As it stands this will continue a scan running, even if rabbitmq is down. So we will loose data from the nexus file. Looking instead at a different way to get the errors to shut things down correctly. |
deed376 to
7d6cd80
Compare
7d6cd80 to
3d95c43
Compare
abbiemery
left a comment
There was a problem hiding this comment.
I have played with this and it does make the error state server side nicer. Client side instead of a 500 error you now get this:
Not sure how much information this should contain or if this is better than the latter but I'm happy to deal with niceness discussions later.
My only complaint is, and this is a general complaint I guess rather than for this PR, it is confusing having two levels of callbacks. We should probably think about how we want to deal with people wanting callbacks, run_engine or otherwise. As this won't help the same issue if it occurs within the run_engine.
If a callback raises an exception, it shouldn't prevent other callbacks receiving the same event. It should also not raise the exception at the call site that published the event.
3d95c43 to
9b996ef
Compare
If a callback raises an exception, it shouldn't prevent other callbacks
receiving the same event. Instead, catch any exceptions and re-raise
them after all callbacks have been called as a single ExceptionGroup.
This allows the task to be aborted if any of the callbacks fail but
still allows callbacks that require the "plan failed" events to run.