Increase robustness of operational datastore #563

troglobit · 2024-08-17T09:21:25Z

Description

This PR increases the robustness of the Infix yanger helper. Instead of bailing out with sys.exit(1), which causes statd to fail and in turn leaves the operational datastore non-working, we now allow external commands to fail and return empty objects/arrays to the respective data parsers.

For example, asking the operational datastore for the status of a container before the container has been started properly should not b0rk the datastore, but rather return no information. The same applies when listing all containers.
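To illustrate the parser side of this, here is a minimal sketch of a data parser that tolerates an empty result. The function name `container_states` and the `name`/`state` keys are hypothetical and not taken from yanger; the point is only that an empty array (or no data at all) yields an empty result rather than an error.

```python
def container_states(entries):
    """Map container name to state.

    `entries` is the (possibly empty) list of dicts handed over by the
    command helper. The "name" and "state" keys are assumptions made
    for this sketch.
    """
    states = {}
    for entry in entries or []:      # None and [] both mean "no data"
        name = entry.get("name")
        if name is None:
            continue                 # skip incomplete entries
        states[name] = entry.get("state", "unknown")
    return states
```

With this shape, a container that has not started yet simply does not show up in the result, and the rest of the operational data is unaffected.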

The test framework has also been made slightly more fault tolerant by adding a retry of container commands.

Fixes #558 and #568

Checklist

Tick relevant boxes, this PR is-a or has-a:

troglobit · 2024-08-19T11:32:36Z

Ugly intermediate state of this PR, which still shows the general direction where I'd like to take this:

- make yanger more robust so that it does not break the operational ds
- add more logging to yanger in cases where user gets an empty [] back
- allow test framework to retry more, e.g., container hasn't stopped/started yet

src/statd/python/yanger/yanger.py

wkz

Yanger is not my forte, but I tried to contribute some general feedback at least 😄

src/statd/python/yanger/yanger.py

test/infamy/container.py

test/test.mk

wkz

The smaller diff to yanger allowed me to see some other potential issues. Sorry I missed these initially!

src/statd/python/yanger/yanger.py

test/env

These changes are intended to make yanger a bit more robust. Instead of bailing out on command errors, run_cmd() and run_json_cmd() now return an empty string or object to the caller. One example where yanger previously exited hard, and dragged the entire session down with it, is when inspecting container status or attempting to restart container instances which have yet to be created/stopped. Fixes #558 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
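A minimal sketch of the behavior described in this commit message, not the actual yanger code. Only the names run_cmd() and run_json_cmd() come from the commit message; the `default` parameter, the error reporting on stderr, and the exact exception handling are assumptions made for illustration.

```python
import json
import subprocess
import sys


def run_cmd(cmd, default=""):
    """Run `cmd` (a list of arguments) and return its stdout as a string.

    On failure, report the error and return `default` instead of
    calling sys.exit(1).
    """
    try:
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        return result.stdout
    except (subprocess.CalledProcessError, OSError) as err:
        print(f"command {cmd} failed: {err}", file=sys.stderr)
        return default


def run_json_cmd(cmd, default=None):
    """Run `cmd` and parse its stdout as JSON.

    On a failed command or unparsable output, return `default`
    (an empty object unless the caller asks for something else).
    """
    if default is None:
        default = {}
    try:
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        return json.loads(result.stdout)
    except (subprocess.CalledProcessError, OSError, json.JSONDecodeError) as err:
        print(f"command {cmd} failed: {err}", file=sys.stderr)
        return default
```

A caller that expects a list can pass `default=[]`, so the parser always receives a value of the type it expects.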

If a container has not yet stopped/started, we may, with the RESTCONF proto, get an "invalid URI" result back for some container actions. With this change we allow the action to be retried up to three times before passing on the error. In tests on Qemu (x86_64) this happens very rarely and needs at most one retry before succeeding. Verified by iterating the same basic test overnight (9000+ iterations). Fixes #558 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
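The retry idea can be sketched as below. This is an illustration and not the code in test/infamy/container.py: the helper name `retry`, its parameters, and the one-second delay are assumptions, and "up to three times" is read here as three attempts in total.

```python
import time


def retry(action, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call `action()` until it succeeds or `attempts` is exhausted.

    The last error is re-raised so that a persistent failure is still
    passed on to the caller.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay)    # give the container time to settle
    raise last_error
```

Narrowing `exceptions` to the specific error seen for "invalid URI" would keep unrelated failures from being retried.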

wkz

Squeaky clean! ✨

troglobit · 2024-08-22T15:07:10Z

> Squeaky clean! ✨

Thank you! Your last suggestion btw, about allowing errors to bubble up, already uncovered a few missing test files and some missing error handling in tests, which @mattiaswal is working on right now 😎

doc: update section on transport protocol

331a911

Simplify and add paragraph on how to select proto from test-sh. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>

troglobit requested review from mattiaswal and wkz August 19, 2024 11:29

mattiaswal requested changes Aug 20, 2024

src/statd/python/yanger/yanger.py (outdated)

src/statd/python/yanger/yanger.py (outdated)

src/statd/python/yanger/yanger.py (outdated)

troglobit force-pushed the test-reset-fail branch from f577952 to c770add August 21, 2024 15:49

troglobit marked this pull request as ready for review August 21, 2024 15:56

wkz approved these changes Aug 21, 2024

src/statd/python/yanger/yanger.py (outdated)

test/infamy/container.py (outdated)

test/infamy/container.py (outdated)

test/test.mk

mattiaswal self-requested a review August 22, 2024 07:29

mattiaswal approved these changes Aug 22, 2024

troglobit force-pushed the test-reset-fail branch from 180abb6 to 6a4ea6d August 22, 2024 08:57

troglobit linked an issue Aug 22, 2024 that may be closed by this pull request

CI: Unit tests fail almost every time due to race condition #568

Closed

wkz requested changes Aug 22, 2024

troglobit added 4 commits August 22, 2024 14:48

statd: add logger fallback for unit tests

1eee29f

When unit tests run in CI we may not have a syslog daemon available, in such cases we set up a dummy handler. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
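One way such a fallback could look is sketched below. It is not the code from commit 1eee29f: the function name `make_logger`, the logger name, and the check for the `/dev/log` socket path are all assumptions. The explicit path check is used because `logging.handlers.SysLogHandler` may not raise at construction time when the socket is missing.

```python
import logging
import logging.handlers
import os


def make_logger(name="yanger", address="/dev/log"):
    """Return a logger that uses syslog when available.

    If the syslog socket does not exist (e.g. in a CI container),
    a NullHandler is installed so that log calls become no-ops.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if os.path.exists(address):
        try:
            handler = logging.handlers.SysLogHandler(address=address)
        except OSError:
            handler = logging.NullHandler()
    else:
        handler = logging.NullHandler()
    logger.addHandler(handler)
    return logger
```

Code under test can then call `logger.info(...)` unconditionally, whether or not a syslog daemon is running.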

test: run all CI tests with a random container name

063c49c

Fixes #568 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>

troglobit force-pushed the test-reset-fail branch from 6a4ea6d to 063c49c August 22, 2024 12:49

troglobit requested a review from wkz August 22, 2024 12:49

wkz approved these changes Aug 22, 2024

yanger: Use correct file for system-output

251587c

troglobit merged commit 1c2a491 into main Aug 22, 2024
4 checks passed

troglobit deleted the test-reset-fail branch August 22, 2024 19:52
