store/tikv: keepalive with pd (#14118) #14233

Merged (3 commits), Dec 27, 2019

nolouch (Member) commented Dec 25, 2019

What problem does this PR solve?

cherry-pick #14118

After all 3 PD instances are killed on AWS (a k8s environment), TiDB server instances take a long time (15 minutes) to reconnect to the new PD instances. We found a stale TCP connection left over after all the pod IPs had changed.

The connection still shows as ESTABLISHED even though its peer IP no longer exists after the kill.
10.0.48.116 no longer exists, but the TCP connection to it is still established.

Tue Dec 17 12:54:36 UTC 2019
tcp        0      1 10.0.38.181:53424       172.20.62.78:2379       SYN_SENT    1/tidb-server
tcp        0      0 10.0.38.181:53302       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53304       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53300       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:37554       10.0.46.225:2379        ESTABLISHED 1/tidb-server
tcp        0   1140 10.0.38.181:35378       10.0.48.116:2379        ESTABLISHED 1/tidb-server
Tue Dec 17 12:54:37 UTC 2019
tcp        0      1 10.0.38.181:53424       172.20.62.78:2379       SYN_SENT    1/tidb-server
tcp        0      0 10.0.38.181:53302       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53304       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53300       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:37554       10.0.46.225:2379        ESTABLISHED 1/tidb-server
tcp        0   1153 10.0.38.181:35378       10.0.48.116:2379        ESTABLISHED 1/tidb-server
Tue Dec 17 12:54:37 UTC 2019
tcp        0      0 10.0.38.181:53302       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53304       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53488       172.20.62.78:2379       ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53300       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:37554       10.0.46.225:2379        ESTABLISHED 1/tidb-server
tcp        0   1153 10.0.38.181:35378       10.0.48.116:2379        ESTABLISHED 1/tidb-server

....

Tue Dec 17 13:07:15 UTC 2019
tcp        0      0 10.0.38.181:53302       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53304       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53488       172.20.62.78:2379       ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53300       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:37554       10.0.46.225:2379        ESTABLISHED 1/tidb-server
tcp        0  26868 10.0.38.181:35378       10.0.48.116:2379        ESTABLISHED 1/tidb-server
Tue Dec 17 13:07:16 UTC 2019
tcp        0      0 10.0.38.181:53302       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53304       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53488       172.20.62.78:2379       ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53300       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:37554       10.0.46.225:2379        ESTABLISHED 1/tidb-server
tcp        0  26868 10.0.38.181:35378       10.0.48.116:2379        ESTABLISHED 1/tidb-server
Tue Dec 17 13:07:16 UTC 2019
tcp        0      0 10.0.38.181:53302       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53304       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53488       172.20.62.78:2379       ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53300       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:37554       10.0.46.225:2379        ESTABLISHED 1/tidb-server
tcp        0  26868 10.0.38.181:35378       10.0.48.116:2379        ESTABLISHED 1/tidb-server
Tue Dec 17 13:07:17 UTC 2019
tcp        0      0 10.0.38.181:53302       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53304       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53488       172.20.62.78:2379       ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:53300       10.0.22.167:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:37554       10.0.46.225:2379        ESTABLISHED 1/tidb-server
tcp        0      0 10.0.38.181:41754       10.0.55.94:2379         ESTABLISHED 1/tidb-server

This is the same problem as #7099. The k8s CNI may be dropping all packets sent to the removed node (not confirmed), which leaves the connection stalled until the kernel's TCP retransmission times out and closes it.

What is changed and how it works?

Check List

Tests

  • Manual test (add detailed scripts or steps below)

zimulala (Contributor) left a comment

LGTM

jackysp previously approved these changes Dec 26, 2019

jackysp (Member) left a comment

LGTM

jackysp (Member) commented Dec 26, 2019

Please resolve the conflicts, @nolouch .

nolouch (Member, Author) commented Dec 26, 2019

done @jackysp

jackysp (Member) commented Dec 27, 2019

/merge

sre-bot added the status/can-merge label Dec 27, 2019
sre-bot (Contributor) commented Dec 27, 2019

/run-all-tests

jackysp merged commit 6adce23 into pingcap:release-3.0 Dec 27, 2019
nolouch deleted the keepalive-pd-3.0 branch April 8, 2020 06:26
Labels
status/can-merge Indicates a PR has been approved by a committer. type/bugfix This PR fixes a bug.
4 participants