Clean up stale OVS PID, UDS, and OVSDB lock files before starting OVS #880

Merged: 1 commit merged into antrea-io:master on Jun 29, 2020

Conversation

jianjuns
Contributor

We observed in issue #870 that ovsdb-server failed to restart with the
error "ovsdb-server: /var/run/openvswitch/ovsdb-server.pid: pidfile
check failed (No such process), aborting", and kept failing until we
deleted the stale OVS PID files.
This commit deletes all stale OVS PID files, Unix domain socket (UDS)
files, and OVSDB lock files before starting the OVS daemons.

Fixes: #870
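
For illustration, the cleanup described above might look roughly like the following shell sketch. It is not the merged change: the function name `cleanup_stale_ovs_files`, the file globs, the lock file name, and the default directory are assumptions. Only the `/var/run/openvswitch/ovsdb-server.pid` path comes from the error message quoted above.

```shell
# Hypothetical helper (assumed name and file patterns, not the actual
# start_ovs code): remove files that a previous, uncleanly terminated OVS
# instance may have left behind.
cleanup_stale_ovs_files() {
    # Assumed defaults; pass other directories as $1 and $2 if needed.
    local run_dir="${1:-/var/run/openvswitch}"
    local db_dir="${2:-$run_dir}"

    # Stale PID files make the daemon's pidfile check fail on restart.
    rm -f "$run_dir"/ovs*.pid

    # Stale Unix domain sockets (control sockets and the OVSDB socket).
    rm -f "$run_dir"/ovs*.ctl "$run_dir"/db.sock

    # Stale OVSDB lock file (assumed to be named ".conf.db.~lock~").
    rm -f "$db_dir"/.conf.db.~lock~
}
```

Such a helper would be called once before launching ovsdb-server and ovs-vswitchd, while no OVS daemon is running. Deleting the PID or socket files of a live daemon would leave it unreachable by the management tools.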


antoninbas
antoninbas previously approved these changes Jun 29, 2020
Contributor

@antoninbas antoninbas left a comment

would have been good to get to the root cause, but workaround LGTM

@jianjuns
Contributor Author

> would have been good to get to the root cause, but workaround LGTM

@antoninbas : maybe I should keep #870 open?

@jianjuns
Contributor Author

/test-all

@antoninbas
Contributor

I think we can close #870 since we addressed the issue and it shouldn't happen again with this workaround. If you want to update start_ovs with some additional comments explaining why we are deleting these files, I think it could be beneficial though.

@jianjuns jianjuns closed this Jun 29, 2020
@jianjuns
Contributor Author

> I think we can close #870 since we addressed the issue and it shouldn't happen again with this workaround. If you want to update start_ovs with some additional comments explaining why we are deleting these files, I think it could be beneficial though.

OK. Let me add the description from the commit message to start_ovs as comments.

@jianjuns jianjuns reopened this Jun 29, 2020
@jianjuns
Contributor Author

/test-all

Contributor

@antoninbas antoninbas left a comment

LGTM

@jianjuns jianjuns merged commit f195bb7 into antrea-io:master Jun 29, 2020
@jianjuns jianjuns deleted the start_ovs branch August 25, 2020 21:54
GraysonWu pushed a commit to GraysonWu/antrea that referenced this pull request Sep 22, 2020
Successfully merging this pull request may close these issues.

On a node with high memory pressure antrea-ovs container went into a constant CrashLoopBackOff loop
4 participants