kafka: Usage manager bugfixes #9917

graphcareful · 2023-04-07T16:25:33Z

Fixes a bug where cloud storage metrics were only reported correctly when queried on the leader node.

Fixes a bug where multiple configuration updates would cause an inconsistency on the next reload of usage state.

Backports Required

Release Notes

Bug Fixes

Fixes a bug where usage_manager would report incorrect cloud storage metrics when queried on a non-leader node.
Fixes a bug where multiple configuration updates would cause an inconsistency on the next reload of usage state.

graphcareful · 2023-04-07T16:53:40Z

/ci-repeat 2 skip-unit dt-repeat=50 tests/rptest/tests/usage_test.py::UsageTestCloudStorageMetrics.test_usage_manager_cloud_storage

graphcareful · 2023-04-07T16:55:25Z

For those reviewing: this test was flaky because the admin.py class randomly chooses a node to query for usage_manager stats, and only the controller leader always had the correct results. The test has 30 seconds to complete, so it has a moderate chance of querying the controller leader and getting the correct results. When it never did, the test failed.
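To put a rough number on "moderate chance", here is a minimal sketch of the arithmetic. It assumes each query picks a node uniformly at random and independently of earlier picks; the function name, the node count, and the number of polls are hypothetical, since the cluster size and polling cadence of the test are not stated here.

```python
def chance_of_hitting_leader(nodes: int, polls: int) -> float:
    """Probability that at least one of `polls` random picks lands on the leader.

    Assumes uniform, independent picks among `nodes` nodes (an assumption,
    not something taken from admin.py).
    """
    if nodes < 1 or polls < 0:
        raise ValueError("nodes must be >= 1 and polls must be >= 0")
    miss_once = 1 - 1 / nodes      # chance a single pick misses the leader
    return 1 - miss_once ** polls  # complement of missing on every pick


if __name__ == "__main__":
    # Hypothetical example: a 3-node cluster polled 3 times within the timeout.
    print(f"{chance_of_hitting_leader(nodes=3, polls=3):.3f}")
```

Under these assumptions the chance of success grows with the number of polls but never reaches 1 on a multi-node cluster, which is consistent with a test that passes most of the time and fails occasionally.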

graphcareful · 2023-04-07T19:57:03Z

/ci-repeat 2 skip-unit dt-repeat=50 tests/rptest/tests/usage_test.py

graphcareful · 2023-04-07T22:48:34Z

/ci-repeat 2 skip-unit dt-repeat=50 tests/rptest/tests/usage_test.py

graphcareful · 2023-04-10T23:23:35Z

/ci-repeat 2 skip-unit dt-repeat=50 tests/rptest/tests/usage_test.py

graphcareful · 2023-04-11T13:58:11Z

/backport v23.1.x

graphcareful requested review from dotnwat, alenkacz and VladLazar April 7, 2023 16:25

github-actions bot added the area/redpanda label Apr 7, 2023

graphcareful changed the title ~~cluster: Update cloud storage bytes on followers~~ kafka: Usage manager bugfixes Apr 7, 2023

graphcareful added 2 commits April 7, 2023 18:48

cluster: Update cloud storage bytes on followers

5a37b5d

- Cloud storage metrics were only reported correctly when queried on the leader node.
- This change ensures that the metric, when queried, is saved in memory in the health_monitor on followers.
- Fixes: redpanda-data#9702

kafka/s: Fix unclean restart between reset calls

9ded5ed

- Not calling `co_await` on mutex::get_units means no lock is actually held.
- Fixes: redpanda-data#9647
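The same class of bug can be reproduced outside Seastar. The sketch below is a Python asyncio analogy, not the Redpanda code: `reset`, `run`, and the event names are hypothetical, and `asyncio.Lock` stands in for the mutex. Creating the acquire coroutine without awaiting it takes no lock, so two concurrent resets interleave.

```python
import asyncio


async def reset(lock: asyncio.Lock, events: list, name: str, *, await_lock: bool) -> None:
    acquire = lock.acquire()   # creates a coroutine object; nothing is locked yet
    if await_lock:
        await acquire          # the lock is taken only once the coroutine is awaited
    else:
        acquire.close()        # buggy path: the acquire is dropped without being awaited
    events.append(f"{name} start")
    await asyncio.sleep(0)     # yield so the other reset gets a chance to run
    events.append(f"{name} end")
    if await_lock:
        lock.release()


async def run(await_lock: bool) -> list:
    lock, events = asyncio.Lock(), []
    await asyncio.gather(
        reset(lock, events, "a", await_lock=await_lock),
        reset(lock, events, "b", await_lock=await_lock),
    )
    return events


if __name__ == "__main__":
    print("awaited:    ", asyncio.run(run(True)))
    print("not awaited:", asyncio.run(run(False)))
```

With the await in place, the second reset starts only after the first has finished. Without it, both resets run their critical sections interleaved.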

graphcareful force-pushed the fix-cloud-usage-test branch from 4ab12ae to 9ded5ed Compare April 7, 2023 22:48

rptest: Usage pass when reported cloud usage seen

0879019

graphcareful force-pushed the fix-cloud-usage-test branch from 579b69a to 0879019 Compare April 10, 2023 18:30

dotnwat approved these changes Apr 10, 2023

graphcareful merged commit c1cc4f1 into redpanda-data:dev Apr 11, 2023

vbotbuildovich mentioned this pull request Apr 11, 2023

[v23.1.x] kafka: Usage manager bugfixes #9965

Merged
