Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid index errors for deleted tenants #2781

Merged
merged 7 commits into from
Aug 14, 2023

Conversation

zalegrala
Copy link
Contributor

@zalegrala zalegrala commented Aug 10, 2023

What this PR does:

Here we ensure that when Tempo tries to delete a tenant index that has already been deleted, that this is not an error condition, and doesn't not log or count against the metrics. This can happen when an index was deleted because no blocks have been found, but for block paths that still contain spurious objects.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@zalegrala
Copy link
Contributor Author

It does not appear to be an error in s3 when you delete an object that is not present, so the s3 implementation has not been updated.

tempodb/backend/raw.go Show resolved Hide resolved
tempodb/blocklist/poller.go Outdated Show resolved Hide resolved
@@ -8,6 +8,7 @@
* [BUGFIX] Fix panic in metrics summary api [#2738](https://github.com/grafana/tempo/pull/2738) (@mdisibio)
* [BUGFIX] Fix node role auth IDMSv1 [#2760](https://github.com/grafana/tempo/pull/2760) (@coufalja)
* [BUGFIX] Only search ingester blocks that fall within the request time range. [#2783](https://github.com/grafana/tempo/pull/2783) (@joe-elliott)
* [BUGFIX] Fix incorrect metrics for index failures [#2781](https://github.com/grafana/tempo/pull/2781) (@zalegrala)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did this have any impact besides metrics/extra logging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I'm thinking of. We still leave the spurious objects in the backend, so perhaps we can have a conversation about cleaning that up also.

@zalegrala zalegrala merged commit e960fd0 into grafana:main Aug 14, 2023
14 checks passed
@zalegrala zalegrala deleted the deleteNotFoundNoErr branch August 14, 2023 22:30
@github-actions
Copy link
Contributor

Hello @joe-elliott!
Backport pull requests need to be either:

  • Pull requests which address bugs,
  • Urgent fixes which need product approval, in order to get merged,
  • Docs changes.

Please, if the current pull request addresses a bug fix, label it with the type/bug label.
If it already has the product approval, please add the product-approved label. For docs changes, please add the type/docs label.
If the pull request modifies CI behaviour, please add the type/ci label.
If none of the above applies, please consider removing the backport label and target the next major/minor release.
Thanks!

@github-actions
Copy link
Contributor

The backport to release-v2.2 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-2781-to-release-v2.2 origin/release-v2.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x e960fd03a8ee4a1208dbf4fe449a7700e61ed225
# When the conflicts are resolved, stage and commit the changes
git add . && git cherry-pick --continue

If you have the GitHub CLI installed:

# Create the PR body template
PR_BODY=$(gh pr view 2781 --json body --template 'Backport e960fd03a8ee4a1208dbf4fe449a7700e61ed225 from #2781{{ "\n\n---\n\n" }}{{ index . "body" }}')
# Push the branch to GitHub and a PR
echo "${PR_BODY}" | gh pr create --title "[release-v2.2] Avoid index errors for deleted tenants" --body-file - --label "type/bug" --label "backport" --base release-v2.2 --milestone release-v2.2 --web

Or, if you don't have the GitHub CLI installed (we recommend you install it!):

# If you don't have the GitHub CLI installed: Push the branch to GitHub and manually create a PR:
git push --set-upstream origin backport-2781-to-release-v2.2
# Remove the local backport branch
git switch main
git branch -D backport-2781-to-release-v2.2

Unless you've used the GitHub CLI above, now create a pull request where the base branch is release-v2.2 and the compare/head branch is backport-2781-to-release-v2.2.

joe-elliott pushed a commit that referenced this pull request Aug 18, 2023
* Catch backend.ErrDoesNotExist from WriteTenantIndex

* Handle tenant index deletion when already deleted

* Capture notfound from azure.Delete

* Capture notfound from gcs.Delete

* Update changelog

* Add note for why we skip ErrDoesNotExist

* Avoid handling the error twice

(cherry picked from commit e960fd0)
joe-elliott pushed a commit that referenced this pull request Aug 18, 2023
* Catch backend.ErrDoesNotExist from WriteTenantIndex

* Handle tenant index deletion when already deleted

* Capture notfound from azure.Delete

* Capture notfound from gcs.Delete

* Update changelog

* Add note for why we skip ErrDoesNotExist

* Avoid handling the error twice

(cherry picked from commit e960fd0)
joe-elliott added a commit that referenced this pull request Aug 18, 2023
* Catch backend.ErrDoesNotExist from WriteTenantIndex

* Handle tenant index deletion when already deleted

* Capture notfound from azure.Delete

* Capture notfound from gcs.Delete

* Update changelog

* Add note for why we skip ErrDoesNotExist

* Avoid handling the error twice

(cherry picked from commit e960fd0)

Co-authored-by: Zach Leslie <zach.leslie@grafana.com>
zalegrala added a commit to zalegrala/tempo that referenced this pull request Apr 10, 2024
The tenant index deletion was originally put in as TCO win, but did not
have the desired effect and surfaced other issues in the system.

Related grafana#2678
Related grafana#2754
Related grafana#2781
Related grafana#2878
Related grafana#3115
Related grafana#3223

Due to the number of issues here, and causing considerable noise on the
pager, perhaps the right thing to do is back out the tenant deletion.

Raising here for discussion.
@zalegrala zalegrala mentioned this pull request Apr 10, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants