fix(test): flaky fabric AIO container boot #876

Closed
petermetz opened this issue Apr 27, 2021 · 0 comments · Fixed by #1300
Comments

@petermetz

Describe the bug

The Fabric all-in-one (AIO) container image intermittently fails to boot during the test in ./packages/cactus-plugin-ledger-connector-fabric/src/test/typescript/integration/fabric-v1-4-x/deploy-cc-from-golang-source.test.ts

To Reproduce

The failure occurs randomly when the CI GitHub Action runs, roughly 1 out of 10 times.

Expected behavior

The CI suite should be consistent: for a given commit it should either always fail or always pass.

Logs/Stack traces

https://github.com/hyperledger/cactus/pull/845/checks?check_run_id=2449278886#step:5:14288

Screenshots

N/A

Cloud provider or hardware configuration:

GitHub Actions

Operating system name, version, build:

N/A

Hyperledger Cactus release version or commit (git rev-parse --short HEAD):

N/A

Hyperledger Cactus Plugins/Connectors Used

N/A

Additional context

N/A

cc: @takeutak @sfuji822 @hartm @jonathan-m-hamilton @AzaharaC @jordigiam @kikoncuo @petermetz @arnab-roy

@petermetz petermetz added bug Something isn't working Fabric dependencies Pull requests that update a dependency file labels Apr 27, 2021
@petermetz petermetz self-assigned this Sep 2, 2021
@petermetz petermetz added this to the v0.10.0 milestone Sep 2, 2021
petermetz added a commit to petermetz/cacti that referenced this issue Sep 2, 2021
Epic facepalm once again. Turns out the default restart try
count of supervisord is too low which leads to race conditions.
Increasing the retry count from 4 to 20 should do it, this way
the fabric-network process (see supervisord.conf file) should
be 5 times as "patient" waiting for the docker daemon to launch
within the AIO container.

What was happening before is that the fabric-network script
tried launching itself in parallel with the docker daemon, but
it would time out before the docker daemon could come online.

Fixes hyperledger#876

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
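
The race described in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not supervisord's implementation, the function name `simulateStart` and all timings are hypothetical, and the "retry count" is assumed to correspond to a limit on start attempts (presumably supervisord's `startretries` option). It shows why a supervised process that depends on a slow-starting daemon gives up with a limit of 4 but succeeds with a limit of 20.

```typescript
// Toy model of a supervisor restart loop. NOT supervisord's actual logic;
// every name and number here is an illustrative assumption.

interface StartResult {
  started: boolean; // true if the process found its dependency ready
  attempts: number; // number of start attempts that were made
}

/**
 * Simulates a dependent process (e.g. fabric-network) being started
 * repeatedly until its dependency (e.g. the docker daemon) is online.
 *
 * @param dependencyReadyAtMs time at which the dependency comes online
 * @param maxAttempts how many start attempts the supervisor allows
 * @param attemptDurationMs time consumed by one failed attempt
 */
function simulateStart(
  dependencyReadyAtMs: number,
  maxAttempts: number,
  attemptDurationMs: number,
): StartResult {
  let clockMs = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (clockMs >= dependencyReadyAtMs) {
      // The dependency is online, so this attempt succeeds.
      return { started: true, attempts: attempt };
    }
    // The attempt fails and the supervisor tries again later.
    clockMs += attemptDurationMs;
  }
  // The supervisor ran out of attempts before the dependency was ready.
  return { started: false, attempts: maxAttempts };
}

// Hypothetical timings: the daemon needs 10 s, a failed attempt costs 1 s.
const impatient = simulateStart(10000, 4, 1000);
const patient = simulateStart(10000, 20, 1000);
console.log(impatient); // { started: false, attempts: 4 }
console.log(patient); // { started: true, attempts: 11 }
```

In this model the higher limit does not remove the race, it only makes the waiting side patient enough to outlast the daemon's startup, which matches the "5 times as patient" description in the commit message.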
petermetz added a commit to petermetz/cacti that referenced this issue Sep 3, 2021
Epic facepalm once again. Turns out the default restart try
count of supervisord is too low which leads to race conditions.
Increasing the retry count from 4 to 20 should do it, this way
the fabric-network process (see supervisord.conf file) should
be 5 times as "patient" waiting for the docker daemon to launch
within the AIO container.

What was happening before is that the fabric-network script
tried launching itself in parallel with the docker daemon, but
it would time out before the docker daemon could come online.

Published these images as
ghcr.io/hyperledger/cactus-fabric2-all-in-one:2021-09-02--fix-876-supervisord-retries
and
ghcr.io/hyperledger/cactus-fabric-all-in-one:2021-09-02--fix-876-supervisord-retries

Fixes hyperledger#718
Fixes hyperledger#876
Fixes hyperledger#320
Fixes hyperledger#319

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
petermetz added a commit that referenced this issue Sep 7, 2021
RafaelAPB pushed a commit to RafaelAPB/blockchain-integration-framework that referenced this issue Mar 9, 2022