tests/scale: test basic auth with many users #6781
Conversation
After #6953 merged, I'm starting to see crashes and Sanitizer failures in tests with small numbers of concurrent users (e.g., 10, 20, 30, 40, or 50 unique users).
@BenPope any ideas on the above? And then I'm still dealing with a …
I hope this will fix it.
I'll take another look after my tests have finished running.
Oddly enough, I'm getting that gate closed exception in CDT but not locally. The tests with 1k and 2k users take approximately 2 min and 5 min to complete, respectively. Why don't I add this test to CI instead of CDT?
Yeah, something like this in CI would be good. Also, you might try increasing the number of cores used in CI. I think it is 2 by default, and you might need more to make the problematic condition more likely.
This commit reduces cross-shard communication by moving the garbage collection timer into the individual sharded instances. Otherwise we risk a seastar assert failure on the shared timer. The assert failure happens when two or more sharded instances evict and then trigger garbage collection at the same time.
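A minimal, single-threaded Python sketch of the idea (not Redpanda's actual C++/Seastar code; the `Shard` class and its methods are illustrative): each shard owns its own GC timer state, so simultaneous evictions on different shards never touch a shared timer, which is the condition that triggered the assert.

```python
# Each shard carries its own GC timer state instead of sharing one.
class Shard:
    def __init__(self, name):
        self.name = name
        self.clients = {}
        self.gc_armed = False          # per-shard timer state, not shared

    def insert(self, user, client):
        self.clients[user] = client

    def evict(self, user):
        self.clients.pop(user, None)
        self.arm_gc()                  # arming only touches this shard

    def arm_gc(self):
        # stand-in for arming this instance's own GC timer
        self.gc_armed = True

shards = [Shard(f"shard-{i}") for i in range(2)]
shards[0].insert("alice", object())
shards[1].insert("bob", object())
# two shards evicting "at the same time" each arm their own timer
shards[0].evict("alice")
shards[1].evict("bob")
print([s.gc_armed for s in shards])   # [True, True]
```

With a single shared timer, both evictions would have raced to arm and fire the same timer across cores; per-instance timers remove that contention entirely.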
Prior to this commit, all sharded state was handled in the sharded_client_cache wrapper. That wrapper existed on a single core, which led to concurrency failures since there was cross-core communication between the wrapper and the sharded services. This commit reduces cross-core communication by moving concurrency mechanisms into the sharded service. This commit also serves as the prerequisite for removing the sharded_client_cache wrapper. Finally, the auth_ctx_server is the new "frontend" for the kafka client cache. The ctx server can pass function handles to the sharded instance directly, so invoke_on_cache is no longer needed.
The wrapper is now obsolete, since the auth_ctx_server calls the sharded client cache directly.
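A hypothetical, simplified Python model of the new call path (class and method names here are illustrative, not the PR's C++ API): the frontend picks the owning shard for a principal and hands it a function to run against that shard's local cache, with no single-core wrapper in between.

```python
# The frontend submits work directly to the owning shard's cache.
class ShardedClientCache:
    def __init__(self, n_shards):
        self.shards = [{} for _ in range(n_shards)]

    def shard_for(self, principal):
        # pick the shard that owns this principal's client
        return hash(principal) % len(self.shards)

    def invoke_on_shard(self, principal, fn):
        # run fn directly against the owning shard's local cache
        return fn(self.shards[self.shard_for(principal)])

cache = ShardedClientCache(4)
cache.invoke_on_shard("alice", lambda c: c.setdefault("alice", "client-A"))
found = cache.invoke_on_shard("alice", lambda c: c.get("alice"))
print(found)   # client-A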
Previously, the client (and thus the shared broker) was evicted from the Proxy client cache while do_connect was still running. This manifested as an assertion failure on the ss::output_stream. The solution is to extend the broker's lifetime.
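A simplified Python stand-in for the lifetime fix (the C++ version would hold shared ownership, e.g. via a shared_ptr; `Broker` and `do_connect` here are illustrative): the connect path takes its own strong reference before any suspension point, so eviction from the cache cannot destroy the broker mid-connect.

```python
# do_connect keeps the broker alive even if the cache evicts it mid-call.
class Broker:
    def __init__(self):
        self.connected = False

cache = {"alice": Broker()}

def do_connect(principal):
    broker = cache[principal]   # strong reference held for the whole call
    cache.pop(principal)        # eviction happens while we are "connecting"
    broker.connected = True     # still safe: our reference keeps it alive
    return broker

b = do_connect("alice")
print(b.connected, "alice" in cache)   # True False
```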
Previously, there was a chance of numerous gate-closed exceptions because the client was evicted and client::stop was called before the first request could complete. This scenario often occurs when there is heavy load on the cache with many unique users. This commit addresses the problem by adding a mutex to the cached items. The GC process and client dispatch both take the mutex, so GC must wait for the current request to finish before stopping the client.
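A simplified Python model of the per-item mutex (a `threading.Lock` standing in for the C++ primitive; the class and helper names are illustrative): dispatch holds the item's lock for the duration of a request, so GC's stop blocks until the in-flight request drains instead of closing the gate underneath it.

```python
import threading
import time

class CachedClient:
    def __init__(self):
        self.lock = threading.Lock()   # per-item mutex from the commit
        self.stopped = False

    def dispatch(self, work):
        with self.lock:                # request holds the item's mutex
            if self.stopped:
                raise RuntimeError("gate closed")
            work()

    def gc_stop(self):
        with self.lock:                # GC waits for the request to drain
            self.stopped = True

item = CachedClient()
in_flight = threading.Event()
done = []

def slow_request():
    in_flight.set()                    # signal: we hold the lock now
    time.sleep(0.05)                   # simulate a slow request
    done.append("ok")

t = threading.Thread(target=item.dispatch, args=(slow_request,))
t.start()
in_flight.wait()                       # ensure the request owns the lock
item.gc_stop()                         # blocks until the request completes
t.join()
print(done, item.stopped)              # ['ok'] True
```

Without the lock, `gc_stop` could run between eviction and the request's first I/O, producing exactly the gate-closed exceptions described above.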
Thanks
LGTM, nice work
@dotnwat are you around? GH says you requested changes but I don't see them. Or is this a glitch?
Seems like a glitch
should've been dismissed automatically when I force pushed
Cover letter
This PR adds a ducktape test that issues many concurrent REST requests with unique principals. The intent is to verify that the kafka client cache can handle multiple authenticated connections. This test may also serve as a baseline so stakeholders can gauge how much load the Pandaproxy can handle with respect to the number of users.
Closes: #6764
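The shape of the load test can be sketched as follows (this is not the actual ducktape test; `issue_request` and its return value are stand-ins for an authenticated REST call to the Pandaproxy): N unique principals each issue a request concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

def issue_request(principal):
    # stand-in for: authenticate as `principal`, hit a proxy endpoint
    return (principal, 200)

# one unique principal per request, issued concurrently
users = [f"user-{i}" for i in range(50)]
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(issue_request, users))

ok = sum(1 for _, status in results if status == 200)
print(ok)   # 50
```

Scaling the user count (1k, 2k, ...) while watching completion time and error rate gives the baseline the cover letter describes.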
Changes from force-push 7810f91:
Changes from force-push 4d8312d:
- invoke_on_cache: capture the function handle by forwarding instead of by reference
- remove_client_if
Changes from force-push ecf90f1:
Changes from force-push 1244e18:
Changes from force-push ba92cc3:
- fetch_or_insert_impl
Changes from force-push 44a8184:
- fetch_or_insert: made protected so it is not a leaky abstraction
- fetch_or_insert: so the client mutex is always used with the client.
Backport Required
UX changes
Release notes