Add scheduler instrumentation (timing & system resource usage) #101
AlexJones0 merged 8 commits into lowRISC:master on Feb 18, 2026
rswarbrick (Contributor) reviewed on Feb 16, 2026:
Some nitty comments, but I really like this!
machshev (Collaborator) reviewed on Feb 17, 2026:
Thanks @AlexJones0! This is looking really good as a concept.
I'm not so keen on the way the instrumentation object is passed through the flow objects. It's another thing that has to be passed through all the main DVSim objects. If there is a way of avoiding this then that would be better.
I'd suggest using a global singleton for the instrumentation object which can be configured in the cli, and then a reference retrieved via a getter.
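To make the suggestion above concrete, here is a minimal sketch of a module-level singleton with a getter. All names (`set_instrumentation`, `get_instrumentation`, `NoOpInstrumentation`) are hypothetical and are not taken from the DVSim code base; the sketch only illustrates the pattern being proposed.

```python
"""Hypothetical module holding one process-wide instrumentation object."""
from typing import Optional


class NoOpInstrumentation:
    """Stand-in used when no instrumentation has been configured."""

    def on_scheduler_start(self) -> None:
        pass


# Process-wide instance; stays None until something configures it.
_instrumentation: Optional[object] = None


def set_instrumentation(instance: object) -> None:
    """Install the instance chosen by the CLI (assumed call site)."""
    global _instrumentation
    _instrumentation = instance


def get_instrumentation() -> object:
    """Return the configured instance, creating a no-op fallback if unset."""
    global _instrumentation
    if _instrumentation is None:
        _instrumentation = NoOpInstrumentation()
    return _instrumentation
```

With this shape the CLI would call `set_instrumentation(...)` once, and the scheduler would call `get_instrumentation()` where it needs it, so nothing has to be threaded through the flow objects. The trade-off is hidden global state, which tests would need to reset between cases.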
Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Implement the base class functionality for scheduler instrumentation, where new instrumentation options can be added by subclassing `SchedulerInstrumentation` and registering the subclass with the `InstrumentationFactory` registry. This allows DVSim to be extended with custom scheduler instrumentation logic. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
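The subclass-and-register pattern described in this commit might look roughly like the sketch below. The class names `SchedulerInstrumentation` and `InstrumentationFactory` come from the commit message; every method name, the decorator-based registration, and the `JobCounter` example are assumptions made for illustration and may differ from the real code.

```python
from typing import Any, Callable, Dict, List, Type


class SchedulerInstrumentation:
    """Base class: subclasses override only the hooks they need."""

    def start(self) -> None: ...

    def stop(self) -> None: ...

    def on_job_status_change(self, job_name: str, status: str) -> None: ...

    def build_report(self) -> Dict[str, Any]:
        return {}


class InstrumentationFactory:
    """Maps option names to instrumentation classes (hypothetical API)."""

    _registry: Dict[str, Type[SchedulerInstrumentation]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[type], type]:
        def decorator(klass: type) -> type:
            cls._registry[name] = klass
            return klass
        return decorator

    @classmethod
    def options(cls) -> List[str]:
        return sorted(cls._registry)

    @classmethod
    def create(cls, names: List[str]) -> List[SchedulerInstrumentation]:
        unknown = [n for n in names if n not in cls._registry]
        if unknown:
            raise ValueError(f"unknown instrumentation: {unknown}")
        return [cls._registry[n]() for n in names]


@InstrumentationFactory.register("job_counter")
class JobCounter(SchedulerInstrumentation):
    """Toy extension: counts job status changes."""

    def __init__(self) -> None:
        self.events = 0

    def on_job_status_change(self, job_name: str, status: str) -> None:
        self.events += 1

    def build_report(self) -> Dict[str, Any]:
        return {"status_changes": self.events}
```

A custom instrumentation only has to subclass the base and register a name; the factory can then construct whatever the user asked for by name.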
Implement timing instrumentation for the scheduler and register it with the instrumentation registry/factory. This enables instrumentation for reporting when the scheduler itself started/ended, as well as when each job that was dispatched started/ended. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
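As a rough illustration of what timing instrumentation records, the sketch below tracks start, end and duration for the scheduler and for each job. The class name `TimingSketch`, its hook names, and the report layout are all assumptions, not the PR's actual implementation. The clock is injectable so the behaviour can be checked deterministically.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Span:
    """A start/end pair; duration is None until both ends are known."""

    start: Optional[float] = None
    end: Optional[float] = None

    @property
    def duration(self) -> Optional[float]:
        if self.start is None or self.end is None:
            return None
        return self.end - self.start

    def as_dict(self) -> Dict[str, Optional[float]]:
        return {"start": self.start, "end": self.end, "duration": self.duration}


class TimingSketch:
    """Hypothetical timing instrumentation driven by scheduler events."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self.scheduler = Span()
        self.jobs: Dict[str, Span] = {}

    def on_scheduler_start(self) -> None:
        self.scheduler.start = self._clock()

    def on_scheduler_end(self) -> None:
        self.scheduler.end = self._clock()

    def on_job_start(self, name: str) -> None:
        self.jobs.setdefault(name, Span()).start = self._clock()

    def on_job_end(self, name: str) -> None:
        self.jobs.setdefault(name, Span()).end = self._clock()

    def report(self) -> Dict[str, Any]:
        return {
            "scheduler": self.scheduler.as_dict(),
            "jobs": {name: span.as_dict() for name, span in self.jobs.items()},
        }
```

A monotonic clock is used here so durations are not affected by wall-clock adjustments; a real report would likely also want wall-clock timestamps, which this sketch omits.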
Hook the instrumentation implementation into the scheduler. The scheduler now takes an instrumentation object as input, starts and stops it so that it wraps the scheduler's lifetime, and notifies it of certain events (scheduler start, stop, job status change). On stopping instrumentation, the scheduler will also dump the generated metrics as a JSON file at a report path, if specified. In the future we may wish to modify this abstraction so that the scheduler itself does not handle the report writing (and instead have either some parent abstraction or the instrumentation itself handle it), but the current scheduler architecture (without any significant refactoring) means that the status is not encapsulated with the job, and thus must be injected separately into the report by the scheduler. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
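The report-writing step described here, where the scheduler injects each job's status into the metrics before dumping them, could be sketched as follows. The function name `write_metrics`, the report layout, and the `"unknown"` fallback are assumptions for illustration only.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional


def write_metrics(
    report: Dict[str, Any],
    job_status: Dict[str, str],
    report_path: Optional[Path],
) -> Optional[Path]:
    """Merge scheduler-held job statuses into the report and dump it as JSON.

    Returns None without writing anything when no report path is given.
    """
    if report_path is None:
        return None

    # Copy rather than mutate: the instrumentation's own report stays intact.
    merged = dict(report)
    merged["jobs"] = {
        name: {**metrics, "status": job_status.get(name, "unknown")}
        for name, metrics in report.get("jobs", {}).items()
    }

    report_path.parent.mkdir(parents=True, exist_ok=True)
    report_path.write_text(json.dumps(merged, indent=2))
    return report_path
```

Keeping the merge in one small function makes it easier to move the responsibility out of the scheduler later, as the commit message anticipates.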
Adds the `--instrument` option to the main DVSim CLI, which can be used to specify anywhere from zero (default) to multiple instrumentations to use with DVSim's scheduler. This lets users of DVSim customize the level of instrumentation that they wish to use and select which information is important to measure, allowing those who need data to capture more data and those who need performance to leave all instrumentation disabled, which is the default. This instrumentation is constructed on the command line and passed through the flow objects to the scheduler, where the instrumentation report (where one exists) is currently written as a single metrics.json file in the `reports` directory, next to the existing generated HTML reports (where these exist). Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
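A zero-to-many option like this can be expressed with `argparse` as in the sketch below. Only the flag name `--instrument` comes from the commit message; the accepted values `timing` and `resources`, the program name, and the help text are placeholders, since the real option values are not listed here.

```python
import argparse
from typing import Tuple

# Placeholder option names; the real CLI's accepted values may differ.
KNOWN_INSTRUMENTATION: Tuple[str, ...] = ("timing", "resources")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dvsim-sketch")
    parser.add_argument(
        "--instrument",
        nargs="*",            # zero or more names after the flag
        default=[],           # no instrumentation unless requested
        choices=KNOWN_INSTRUMENTATION,
        metavar="NAME",
        help="Scheduler instrumentation to enable (default: none).",
    )
    return parser
```

Defaulting to an empty list keeps the no-instrumentation path free of extra work, which matches the stated goal of letting performance-sensitive users opt out entirely.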
For instrumentation of the scheduler, this allows us to get more detailed information about system compute resource usage, including e.g. memory (RSS, VMS, swap) and per-core CPU utilization. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Adds an additional instrumentation mechanism to DVSim's scheduler and registers it with the instrumentation factory/registry to allow users to optionally enable instrumentation of system resources. This captures a variety of useful system resource metrics, including memory utilization (RSS, VMS, swap) and CPU utilization (percentage and time, plus per-core percentage), for both the system as a whole and specifically for the DVSim / scheduler process overhead. Since there is no "per-dispatched-job" process that is transparently available to the scheduler, the resource metrics currently reported for each job are instead an aggregate of the system resource metric samples taken over the time period for which that job was running. Currently, based on existing tools and polling frequencies, system resources are polled every 0.5 seconds (twice a second). In the future it would be nice to consider how best to make this customizable (perhaps through an additional CLI argument, or through a config file option?). Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
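The per-job aggregation described above, where each job is assigned the system-wide samples that fall inside its run window, can be sketched in plain Python. The `Sample` fields and the choice of aggregates (mean and max) are assumptions; the commit message does not say which statistics are actually reported, and the sampling itself is left out.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional

POLL_INTERVAL_S = 0.5  # polling period stated in the commit message


@dataclass(frozen=True)
class Sample:
    """One system-wide measurement (fields chosen for illustration)."""

    timestamp: float
    rss_bytes: int
    cpu_percent: float


def aggregate_for_job(
    samples: List[Sample], start: float, end: float
) -> Optional[Dict[str, float]]:
    """Summarise the samples taken while a job was running.

    Returns None if the job was too short to overlap any sample.
    """
    window = [s for s in samples if start <= s.timestamp <= end]
    if not window:
        return None
    return {
        "num_samples": len(window),
        "avg_rss_bytes": mean(s.rss_bytes for s in window),
        "max_rss_bytes": max(s.rss_bytes for s in window),
        "avg_cpu_percent": mean(s.cpu_percent for s in window),
        "max_cpu_percent": max(s.cpu_percent for s in window),
    }
```

One consequence of this approach is that jobs running concurrently share the same samples, so per-job figures describe the system while the job ran rather than the job's own consumption. Jobs shorter than the polling interval may also have no samples at all, which the sketch reports as `None`.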
machshev approved these changes on Feb 18, 2026
This PR introduces instrumentation to DVSim's scheduler, allowing us to collect metrics about the scheduler's operation in each run if desired. This can be useful for:
This PR introduces two types of instrumentation: timing (start, end and duration) and system resource usage (RSS/VMS memory, swap, CPU utilization and user/system time). This makes the PR quite large, but the goal is to motivate the designed abstractions. If it makes it easier to review I can split out the last 2 commits with the resource instrumentation into a separate PR.
You can now give the `--instrument` option on the command line. Currently this just defaults to generating the instrumentation report in `scratch/<branch>/reports/metrics.json`, right next to the generated HTML reports for sim flows. In the future it might be nice to make this kind of thing more customizable.

A few more thoughts to consider (maybe for future PRs?):