kafka: Usage manager bugfixes #9917

Merged (3 commits) on Apr 11, 2023

Conversation

graphcareful
Contributor

@graphcareful graphcareful commented Apr 7, 2023

Fixes a bug where cloud storage metrics were only reported correctly when queried on the leader node.

Fixes a bug where multiple configuration updates would cause an inconsistency on the next reload of usage state.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Bug Fixes

  • Fixes a bug where usage_manager would report correct cloud storage metrics only when queried on the leader node.
  • Fixes another bug where multiple configuration updates would cause an inconsistency on the next reload of usage state.

@graphcareful
Contributor Author

/ci-repeat 2 skip-unit dt-repeat=50 tests/rptest/tests/usage_test.py::UsageTestCloudStorageMetrics.test_usage_manager_cloud_storage

@graphcareful
Contributor Author

For those reviewing: this test was flaky because the admin.py class randomly chooses a node to query for usage_manager stats, and only the controller leader was guaranteed to have the correct results. The test has 30 seconds to complete, so it had a moderate chance of querying the controller leader and getting the correct results. When it didn't, the test failed.
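To illustrate the flakiness described above, here is a minimal sketch in plain Python. It does not use the real admin.py API: `NODES`, `LEADER`, `EXPECTED_BYTES`, `query_usage`, and the other helpers are hypothetical stand-ins, and the three-node cluster is an assumption. It only models the idea that a randomly chosen node returns the right value solely when it happens to be the leader.

```python
import random

# Hypothetical three-node cluster; only the controller leader holds the
# up-to-date cloud storage byte count (the behaviour before the fix).
NODES = [0, 1, 2]
LEADER = 0
EXPECTED_BYTES = 4096


def query_usage(node_id: int) -> int:
    """Return the cloud storage bytes as seen by the given node."""
    return EXPECTED_BYTES if node_id == LEADER else 0


def poll_random_node(attempts: int, rng: random.Random) -> bool:
    """Query a randomly chosen node per attempt; pass once any answer matches."""
    return any(
        query_usage(rng.choice(NODES)) == EXPECTED_BYTES for _ in range(attempts)
    )


def failure_probability(attempts: int) -> float:
    """Chance that every attempt lands on a follower and the poll never passes."""
    followers = len(NODES) - 1
    return (followers / len(NODES)) ** attempts
```

Under these assumptions, the failure chance shrinks with each retry but never reaches zero, which matches a test that passes most of the time and fails occasionally. Once followers also report the correct value, the choice of node no longer matters.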

@graphcareful
Contributor Author

/ci-repeat 2 skip-unit dt-repeat=50 tests/rptest/tests/usage_test.py

@graphcareful graphcareful changed the title from "cluster: Update cloud storage bytes on followers" to "kafka: Usage manager bugfixes" Apr 7, 2023
- Cloud storage metrics were only reported correctly when queried on the
leader node.

- This change ensures that the metric, when queried, is saved in memory in
the health_monitor on followers.

- Fixes: redpanda-data#9702
- Not calling `co_await` on mutex::get_units means no lock is actually
held.

- Fixes: redpanda-data#9647
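
The missing-`co_await` bug from the commit message has a close analogue in Python's asyncio, sketched below. This is not the Redpanda or Seastar code. The function names are hypothetical, and the comparison between `mutex::get_units` and `asyncio.Lock.acquire` is an assumption made for illustration. The shared idea is that creating the awaitable without awaiting it never takes the lock.

```python
import asyncio


async def update_without_await(lock: asyncio.Lock) -> bool:
    # BUG (analogous to omitting co_await): acquire() only creates a
    # coroutine object; since it is never awaited, the lock is never taken.
    pending = lock.acquire()
    held = lock.locked()
    pending.close()  # discard the un-awaited coroutine to avoid a RuntimeWarning
    return held


async def update_with_await(lock: asyncio.Lock) -> bool:
    # Correct: the lock is actually held inside this block.
    async with lock:
        return lock.locked()


async def main() -> tuple[bool, bool]:
    lock = asyncio.Lock()
    return await update_without_await(lock), await update_with_await(lock)
```

Running `asyncio.run(main())` reports whether the lock was held in each variant. In the buggy variant the critical section runs unprotected, so concurrent configuration updates could interleave.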
@graphcareful
Contributor Author

/ci-repeat 2 skip-unit dt-repeat=50 tests/rptest/tests/usage_test.py

@graphcareful
Contributor Author

/ci-repeat 2 skip-unit dt-repeat=50 tests/rptest/tests/usage_test.py

@graphcareful graphcareful merged commit c1cc4f1 into redpanda-data:dev Apr 11, 2023
@graphcareful
Contributor Author

/backport v23.1.x
