[fix][broker] fix delete_when_subscriptions_caught_up doesn't work while have active consumers #18283


codelipenghui
Contributor

Motivation

The current behavior of the delete_when_subscriptions_caught_up strategy is not as expected: active consumers are not closed even when users enable delete_when_subscriptions_caught_up and the topic has no backlog.

This appears to be the part that #6077 missed, as Sijie mentioned in the comment #6077 (review).

This change corrects the behavior of delete_when_subscriptions_caught_up.

Modifications

Close active consumers if delete_when_subscriptions_caught_up is applied and the topic has no backlog, so that the topic GC thread can clean up the topic properly.
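
A minimal sketch of the intended behavior, using a simplified in-memory model rather than the broker's real classes. The names `CaughtUpGcSketch`, `Topic`, `DeleteMode`, and `gc` are hypothetical and exist only for illustration; the producer check is an assumption based on the review discussion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a topic; these are not the broker's real classes.
class CaughtUpGcSketch {
    enum DeleteMode { WHEN_NO_SUBSCRIPTIONS, WHEN_SUBSCRIPTIONS_CAUGHT_UP }

    static final class Topic {
        final List<String> consumers = new ArrayList<>();
        int producers;
        long backlog;
        boolean deleted;
    }

    /** Runs one GC pass and returns true if the topic was deleted. */
    static boolean gc(Topic topic, DeleteMode mode) {
        if (mode != DeleteMode.WHEN_SUBSCRIPTIONS_CAUGHT_UP) {
            return false; // other strategies are out of scope for this sketch
        }
        if (topic.producers > 0 || topic.backlog > 0) {
            return false; // still in use, or subscriptions have not caught up
        }
        topic.consumers.clear(); // close the remaining active consumers
        topic.deleted = true;
        return true;
    }

    public static void main(String[] args) {
        Topic topic = new Topic();
        topic.consumers.add("consumer-1");
        System.out.println(gc(topic, DeleteMode.WHEN_SUBSCRIPTIONS_CAUGHT_UP));
    }
}
```

Before this change, the equivalent of the `consumers.clear()` step was missing, so a caught-up topic with a connected consumer was never collected.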

Verifying this change

Updated the existing test.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: codelipenghui#17

@codelipenghui codelipenghui added this to the 2.12.0 milestone Nov 1, 2022
@codelipenghui codelipenghui self-assigned this Nov 1, 2022
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 1, 2022
@Technoboy- Technoboy- closed this Nov 2, 2022
@Technoboy- Technoboy- reopened this Nov 2, 2022

codecov-commenter commented Nov 2, 2022

Codecov Report

Merging #18283 (f8169c9) into master (0866c3a) will increase coverage by 12.94%.
The diff coverage is 64.84%.


@@              Coverage Diff              @@
##             master   #18283       +/-   ##
=============================================
+ Coverage     38.97%   51.91%   +12.94%     
+ Complexity     8311     7411      -900     
=============================================
  Files           683      400      -283     
  Lines         67325    43679    -23646     
  Branches       7217     4483     -2734     
=============================================
- Hits          26239    22677     -3562     
+ Misses        38079    18562    -19517     
+ Partials       3007     2440      -567     
Flag Coverage Δ
unittests 51.91% <64.84%> (+12.94%) ⬆️


Impacted Files Coverage Δ
.../main/java/org/apache/pulsar/PulsarStandalone.java 0.00% <ø> (ø)
...pulsar/broker/service/PulsarCommandSenderImpl.java 73.29% <ø> (-1.07%) ⬇️
...ransaction/buffer/impl/InMemTransactionBuffer.java 0.00% <ø> (ø)
...nsaction/buffer/impl/TransactionBufferDisable.java 52.63% <ø> (ø)
.../metadata/v2/TransactionBufferSnapshotIndexes.java 0.00% <0.00%> (ø)
...a/v2/TransactionBufferSnapshotIndexesMetadata.java 0.00% <0.00%> (ø)
...oker/transaction/buffer/metadata/v2/TxnIDData.java 0.00% <0.00%> (ø)
...ava/org/apache/pulsar/broker/service/Consumer.java 69.14% <50.00%> (+1.06%) ⬆️
...sar/broker/service/persistent/PersistentTopic.java 60.90% <53.12%> (+3.92%) ⬆️
...ransaction/buffer/impl/TopicTransactionBuffer.java 58.13% <60.71%> (+13.74%) ⬆️
... and 354 more

  // 2. We want to kick out everyone and forcefully delete the topic.
  // In this case, we shouldn't care if the usageCount is 0 or not, just proceed
- if (currentUsageCount() == 0 || (closeIfClientsConnected && !failIfHasSubscriptions)) {
+ if (currentUsageCount() == 0) {
Contributor

It seems we need to keep the original condition (closeIfClientsConnected && !failIfHasSubscriptions).

Contributor

I feel the current code is correct, because the disconnect operation has already been called at this point. If any connections still exist now, a new connection must have come in; we should treat that as a concurrency problem and give up deleting the topic.

Contributor Author

Yes, you are right @poorbarcode

closeIfClientsConnected and failIfHasSubscriptions are validated before we reach this point, so checking them here looks like duplicated validation.
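
The re-check discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not the code in PersistentTopic: `RecheckAfterDisconnectSketch`, `disconnectThenDelete`, and the `concurrentActivity` hook are hypothetical names, and the local `TopicBusyException` only stands in for the broker's exception of the same name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: after disconnecting everyone, delete only if nobody reconnected.
class RecheckAfterDisconnectSketch {
    static final class TopicBusyException extends RuntimeException {
        TopicBusyException(String message) {
            super(message);
        }
    }

    final AtomicInteger usageCount = new AtomicInteger();
    volatile boolean deleted;

    CompletableFuture<Void> disconnectThenDelete(Runnable concurrentActivity) {
        usageCount.set(0);        // stands in for disconnecting all clients
        concurrentActivity.run(); // window in which a new client may connect
        CompletableFuture<Void> result = new CompletableFuture<>();
        int remaining = usageCount.get();
        if (remaining == 0) {
            deleted = true;       // nobody came back, safe to proceed
            result.complete(null);
        } else {
            // A client connected after the disconnect: give up instead of deleting.
            result.completeExceptionally(new TopicBusyException(
                    "Topic has " + remaining + " connected producers/consumers"));
        }
        return result;
    }
}
```

Passing `() -> topic.usageCount.incrementAndGet()` as the hook simulates a client that connects inside the window; the returned future then completes exceptionally and the topic is left in place.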

Comment on lines 1182 to 1184
} else if (!closeIfClientsConnected && currentUsageCount() != 0 && !failIfHasBacklogs) {
    return FutureUtil.failedFuture(new TopicBusyException(
            "Topic has " + currentUsageCount() + " connected producers/consumers"));
Contributor

It needs to fail as long as there are connections, so it has nothing to do with !failIfHasBacklogs.

Or, if failIfHasBacklogs is what triggers the failure, the returned error message should not read like this.

Contributor

@poorbarcode poorbarcode Nov 2, 2022

Hi @315157973

When failIfHasBacklogs is true, the expected behavior is:

  • fail if any producer exists
  • if any consumer exists, close its connection

Actually, the code should look like this:

if (!closeIfClientsConnected && currentUsageCount() != 0) {
   fail...
} else if (!closeIfClientsConnected && !failIfHasBacklogs && anyProducerExists()) {
   fail...
}

This code may not be easy to understand from the context, but the logic is correct.
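
One way to read the expected behavior described in this thread is as a small decision function. The sketch below is an interpretation, not the merged code: `DeletePreconditionSketch`, `Decision`, and `decide` are hypothetical names, and the backlog check in the failIfHasBacklogs branch is an assumption.

```java
// Hypothetical decision table for the delete preconditions discussed above.
class DeletePreconditionSketch {
    enum Decision { FAIL_HAS_BACKLOG, FAIL_TOPIC_BUSY, CLOSE_CLIENTS_THEN_DELETE, DELETE }

    static Decision decide(boolean closeIfClientsConnected, boolean failIfHasBacklogs,
                           int producers, int consumers, long backlog) {
        if (failIfHasBacklogs) {
            if (backlog > 0) {
                return Decision.FAIL_HAS_BACKLOG; // assumption: backlog blocks deletion
            }
            if (producers > 0) {
                return Decision.FAIL_TOPIC_BUSY;  // fail if any producer exists
            }
            // Only consumers may remain: close them, then delete.
            return consumers > 0 ? Decision.CLOSE_CLIENTS_THEN_DELETE : Decision.DELETE;
        }
        int connections = producers + consumers;
        if (!closeIfClientsConnected && connections > 0) {
            return Decision.FAIL_TOPIC_BUSY;      // any connection blocks deletion
        }
        return connections > 0 ? Decision.CLOSE_CLIENTS_THEN_DELETE : Decision.DELETE;
    }

    public static void main(String[] args) {
        // Caught-up topic with two consumers and no producers.
        System.out.println(decide(false, true, 0, 2, 0));
    }
}
```

Under this reading, the `!failIfHasBacklogs` term in the reviewed condition keeps the generic "topic busy" failure from firing when only consumers are connected and failIfHasBacklogs is set.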

Contributor

There is also a concurrency problem that may lead to deleting a topic that still has a producer. For example:

checkGC                            | new producer registry
ensure no producer exists          |
delete topic (false, true, false)  |
ensure no backlog                  |
                                   | new producer registry
fence topic                        |
disconnect all clients             |
delete topic                       |

The flow above shows the case where a producer exists but the topic is still deleted. Would it be better to add a double-check?
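
The suggested double-check could look roughly like the sketch below: fence the topic first, then look at the producers again before deleting. Everything here is hypothetical (`DoubleCheckDeleteSketch`, `tryDelete`, and the `interleaved` hook that simulates the race); it is not the broker's implementation.

```java
// Hypothetical model of "check, fence, check again" before deleting a topic.
class DoubleCheckDeleteSketch {
    private int producers;
    private boolean fenced;
    private boolean deleted;

    /** Registers a producer unless the topic is fenced or already deleted. */
    synchronized boolean addProducer() {
        if (fenced || deleted) {
            return false;
        }
        producers++;
        return true;
    }

    /** The interleaved hook runs between the first check and the fence. */
    boolean tryDelete(Runnable interleaved) {
        synchronized (this) {
            if (producers > 0) {
                return false;   // first check: a producer is already connected
            }
        }
        interleaved.run();      // a producer may register in this window
        synchronized (this) {
            fenced = true;      // no new producers can register from here on
            if (producers > 0) {
                fenced = false; // double-check failed: unfence and give up
                return false;
            }
            deleted = true;
            return true;
        }
    }

    synchronized boolean isDeleted() {
        return deleted;
    }

    synchronized int producerCount() {
        return producers;
    }
}
```

With the second check in place, a producer that registers after the first check causes the deletion to be abandoned rather than leaving that producer attached to a deleted topic.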

Member

@mattisonchao mattisonchao left a comment


LGTM

@codelipenghui codelipenghui merged commit 67d9d63 into apache:master Nov 3, 2022
@congbobo184
Contributor

Could you please cherry-pick this PR to branch-2.9? Thanks.

@codelipenghui
Contributor Author

codelipenghui commented Nov 16, 2022

#18320 has cherry-picked this one to branch-2.11

@codelipenghui
Contributor Author

f5c9354 cherry-picked to branch-2.10

@codelipenghui codelipenghui added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Nov 16, 2022
@codelipenghui
Contributor Author

3cf167a cherry-picked to branch-2.9

congbobo184 pushed a commit that referenced this pull request Nov 27, 2022
…ile have active consumers (#18283)

(cherry picked from commit 67d9d63)
@codelipenghui codelipenghui deleted the penghui/fix-subscription-catchup-deletion branch March 26, 2024 00:58