Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check the correct container during writeback #236

Merged
merged 1 commit into from
Aug 3, 2023
Merged

Conversation

kthui
Copy link
Contributor

@kthui kthui commented Aug 1, 2023

When a model is deleted, it is deleted from both current set and update set. The existing check during writeback is on the current set, which is largely correct, but another model load thread may load the exact same model during delete, which will write the model identifier into the current set. If the delete thread sees the "deleted" model on the current set, it can cause undefined behavior. Thus, this PR modifies the writeback to check the update set, which cannot be modified by any other threads, and resolves the undefined behavior.

Different cases, if the model is:

  1. On both current and update set
    • The model is being updated, and other threads will be blocked from updating it.
  2. On current set, but not update set
    • The model is being updated by another thread (or completed update), so do not touch it.
  3. Not on current set, but on update set
    • Invalid. The model on update set implies it is being updated, so it must also be on current set to block other threads from updating it.
  4. Not on current or update set
    • Unknown model.

In other words, this PR fixes the issue from number 2. Given number 3 is invalid, so there is no further behavior change aside from the fix.

@kthui kthui marked this pull request as ready for review August 2, 2023 01:46
@kthui kthui requested review from rmccorm4 and GuanLuo August 2, 2023 01:48
Copy link
Contributor

@rmccorm4 rmccorm4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's get this merged even if test isn't ready yet. The python backend test is still intermittently failing and may be resolved by this.

@kthui kthui merged commit 4215fdc into main Aug 3, 2023
1 check passed
@kthui kthui deleted the jacky-repo-manager branch August 7, 2023 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants