Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch compactor to new BasicLifecycler with AutoForgetDelegate #1178

Merged
merged 6 commits into from
Dec 16, 2021

Conversation

mdisibio
Copy link
Contributor

@mdisibio mdisibio commented Dec 15, 2021

What this PR does:
This PR switches the compactor ring from Lifecycler to the new pluggable BasicLifeCycler, and adds AutoForgetDelegate to automatically forget unhealthy entries.

Notes:

  • Autoforget defaults to 2xHeartbeatTimeout=2M, which is a middle ground between other components with auto-forget loki ingester (1x) and cortex storegateway (10x). Would like feedback on if this should be exposed as a config option, and preferences on different default behavior, if any.

  • There is slightly more leg-work required to use BasicLifecycler such creating the KV client, and the new OnRingInstance* callbacks. Considered locating parts of this in tempo/pkg/ring for reuse, but the numerous small differences in how each component interacts with the ring (ingester vs compactor, for example), make it seem not very beneficial. Can revisit later if/when more components are switched to BasicLifecycler.

  • There are a few other changes to make the compactor startup leverage existing ring functionality, for waiting for ring activation and topology stabilization.

Which issue(s) this PR fixes:
Fixes #1081

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Member

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree some of this boilerplate is awkward. Will follow your lead on making any of this shared 👍

A few comments/questions.

modules/compactor/compactor.go Show resolved Hide resolved
modules/compactor/compactor.go Show resolved Hide resolved
modules/compactor/compactor.go Show resolved Hide resolved
modules/compactor/compactor.go Show resolved Hide resolved
modules/compactor/compactor.go Outdated Show resolved Hide resolved
@joe-elliott
Copy link
Member

Oh, one more question, if a compactor is auto-forgot and then reappears what happens? Does it cleanly reinsert itself into the ring?

@mdisibio
Copy link
Contributor Author

Oh, one more question, if a compactor is auto-forgot and then reappears what happens? Does it cleanly reinsert itself into the ring?

It depends on what you mean by cleanly. If the compactor was restarted since being auto-forget (i.e. a new pod), since there is no token persistence, it will rejoin with new tokens (in this case there is no difference between a restarted compactor or a new compactor). If the compactor remained running (but temporarily lost ability to heartbeat to the ring), I think it should rejoin with the same tokens.

@mdisibio mdisibio merged commit 9f0a021 into grafana:main Dec 16, 2021
@mdisibio mdisibio deleted the compactor-auto-forget branch January 6, 2022 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unhealthy compactors do not leave the ring after rollout
2 participants