6 changes: 5 additions & 1 deletion components/support/error/src/error_tracing.rs
@@ -9,6 +9,8 @@ use std::{

use parking_lot::Mutex;

use crate::breadcrumb;

static GLOBALS: Mutex<Globals> = Mutex::new(Globals::new());

pub fn report_error_to_app(type_name: String, message: String) {
@@ -139,7 +141,9 @@ impl RateLimiter {
// with NTP, if users manually adjust their clocks, etc. Letting an extra event
// through seems okay in this case. We should get back into a good state soon
// after.
_ => (),
_ => {
Member

I'm a little skeptical of this - I can see how extra single events might get through as the clock changes, but not how that could cause many reports per minute. What's your theory about how that actually happens in practice?
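
To make the question concrete, here is a minimal sketch of the behavior under discussion. It is not the code from this PR: `SketchRateLimiter`, `should_send_report`, `simulate_clock_jump`, and the 60-second interval are hypothetical names and values chosen for illustration, and the timestamps are constructed by hand to imitate a clock that appears to move backwards. It assumes the limiter stores the last report time per component and lets the event through when `checked_duration_since` returns `None`.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the rate limiter; not the PR's actual type.
struct SketchRateLimiter {
    last_report: HashMap<String, Instant>,
    min_interval: Duration, // assumed value, set by the caller
}

impl SketchRateLimiter {
    fn new(min_interval: Duration) -> Self {
        Self {
            last_report: HashMap::new(),
            min_interval,
        }
    }

    /// Returns true if a report for `component` should be sent at `now`.
    /// `now` is a parameter so that clock jumps can be simulated.
    fn should_send_report(&mut self, component: &str, now: Instant) -> bool {
        if let Some(last) = self.last_report.get(component) {
            match now.checked_duration_since(*last) {
                // Too soon after the previous report: suppress it.
                Some(elapsed) if elapsed < self.min_interval => return false,
                // Enough time has passed: fall through and send.
                Some(_) => (),
                // `now` is earlier than `last`: let the event through.
                None => (),
            }
        }
        self.last_report.insert(component.to_string(), now);
        true
    }
}

/// Replays a sequence of timestamps in which the clock jumps backwards once.
fn simulate_clock_jump() -> Vec<bool> {
    let t0 = Instant::now();
    let mut limiter = SketchRateLimiter::new(Duration::from_secs(60));
    // Offsets in seconds from t0; 50 follows 110, imitating a backwards jump.
    [100u64, 110, 50, 55, 120]
        .iter()
        .map(|secs| limiter.should_send_report("places", t0 + Duration::from_secs(*secs)))
        .collect()
}
```

In this sketch the backwards jump lets exactly one extra report through, because the stored timestamp is reset to the earlier time and the following call is rate limited again. Under these assumptions, a sustained stream of reports would need the clock to keep moving backwards on nearly every call.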

Contributor Author

It does seem really weird.

Here's what I'm looking at in Grafana. One client is generating multiple error pings per second (I need to add a client filter, but for now I just added enough other filters that I'm pretty sure I'm only capturing one person).

I don't really understand what's going on. This is one wild theory: maybe something is very wrong at the system level. My other theory was that FF was restarting and clearing out the rate limiting data, but it seems impossible for FF to restart that often in practice. A third theory was that it's part of some automation. This change was really just a shot in the dark. Do you have any ideas on what could be happening?

(BTW, the unit in that graph is "errors / day", which is not right for unique clients. I think it's just one unique client, but since the interval is 2 hours, it's being multiplied up to 12 unique clients / day. I'm going to try to fix that today.)

breadcrumb!("error_support: checked_duration_since failed");
}
}
}
last_report.insert(component.to_string(), now);