-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Offline entity usage metrics for 2024.2 #1198
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good so far, just leaving some initial thoughts!
// submission create loggedAt timestamp (when sub was created) to when | ||
// the event was processed, which will be after the entity version was | ||
// created. | ||
const measureEntityProcessingTime = () => ({ one }) => one(sql` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this exclude offline updates that are force-processed from the backlog? I think those would skew the metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is another place where getodk/central#720 would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the discussion below, I think this isn't an issue after all. Even if an offline update ends up in the backlog, that's different from how quickly the submission.create
event is processed.
// submission create loggedAt timestamp (when sub was created) to when | ||
// the event was processed, which will be after the entity version was | ||
// created. | ||
const measureEntityProcessingTime = () => ({ one }) => one(sql` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the same reason, it'd be great if we could exclude submission.create
events that were processed when the approvalRequired
flag was toggled, as we expect a delay in that case. I'm not really sure how to identify that though. 🤔 If it helped, I think it'd be OK to exclude entity creations (entity defs whose version number is 1), since those are the only entity defs that the approvalRequired
flag will affect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This query is kinda indirect in a way that I think is convenient for the metric we are getting at.
It's looking at the timestamps on the submission.create
audit event which only gets updated by the jobs worker, the one that runs the first time the event comes in. The other things that process submission events to make entities (like the batch processing when toggling the approvalRequired flag and batch-processing the backlog) use the event to look up information, but they don't update anything about the underlying event.
So I think it's safe to to look at this time difference like this. It's not every submission create event, it's only the ones that ultimately turn into entity versions. It doesn't capture how long it takes overall for a submission to become an entity (potentially factoring in these other sources of delay), but it does capture how long a submission would be sitting around waiting for any initial entity processing (even if the result of that was it getting skipped to be reexamined later).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe what you were expecting was a more explicit (entity def creation date - source submission def creation date). I could add a query for that, too, that could include these other ways in which entity processing can get delayed. Then we could see if there's a noticeable difference... which would mean these delay-related mechanism are activated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking at the timestamps on the
submission.create
audit event which only gets updated by the jobs worker, the one that runs the first time the event comes in.
Ah that's interesting, I hadn't put that together. This sounds like a good way to approach the metric then. 👍 I don't feel like we need anything more explicit: it makes sense to me to start with this.
I think metric 5 is also one that would benefit from getodk/central#720. |
Closes getodk/central#653
This covers these metrics:
submission.reprocess events
have there been?It doesn't include
TODO:
5. How long does it take for an entire branch to be processed? Specifically: what is the max amount of time between when the first submission for a branch is received and when the last submission is processed?
What has been done to verify that this works as intended?
Why is this the best possible solution? Were any other approaches considered?
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
Does this change require updates to the API documentation? If so, please update docs/api.yaml as part of this PR.
Before submitting this PR, please make sure you have:
make test
and confirmed all checks still pass OR confirm CircleCI build passes