fix(test): flaky fabric AIO container boot #876

petermetz · 2021-04-27T17:52:07Z

Describe the bug

The Fabric all-in-one (AIO) container image intermittently fails to boot during the test in ./packages/cactus-plugin-ledger-connector-fabric/src/test/typescript/integration/fabric-v1-4-x/deploy-cc-from-golang-source.test.ts

To Reproduce

The failure occurs intermittently when the CI GitHub Action runs, in roughly 1 out of 10 runs.

Expected behavior

The CI suite should be deterministic: for a given commit it should either always pass or always fail.

Logs/Stack traces

https://github.com/hyperledger/cactus/pull/845/checks?check_run_id=2449278886#step:5:14288

Screenshots

N/A

Cloud provider or hardware configuration:

GitHub Actions

Operating system name, version, build:

N/A

Hyperledger Cactus release version or commit (git rev-parse --short HEAD):

N/A

Hyperledger Cactus Plugins/Connectors Used

N/A

Additional context

N/A

cc: @takeutak @sfuji822 @hartm @jonathan-m-hamilton @AzaharaC @jordigiam @kikoncuo @petermetz @arnab-roy

Epic facepalm once again. Turns out the default restart try count of supervisord is too low, which leads to race conditions. Increasing the retry count from 4 to 20 should do it; this way the fabric-network process (see supervisord.conf file) should be 5 times as "patient" waiting for the docker daemon to launch within the AIO container.

What was happening before is that the fabric-network script tried launching itself in parallel with the docker daemon, but it would time out before the docker daemon could come online.

Fixes hyperledger#876

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>

Epic facepalm once again. Turns out the default restart try count of supervisord is too low, which leads to race conditions. Increasing the retry count from 4 to 20 should do it; this way the fabric-network process (see supervisord.conf file) should be 5 times as "patient" waiting for the docker daemon to launch within the AIO container.

What was happening before is that the fabric-network script tried launching itself in parallel with the docker daemon, but it would time out before the docker daemon could come online.

Published these images as ghcr.io/hyperledger/cactus-fabric2-all-in-one:2021-09-02--fix-876-supervisord-retries and ghcr.io/hyperledger/cactus-fabric-all-in-one:2021-09-02--fix-876-supervisord-retries

Fixes hyperledger#718
Fixes hyperledger#876
Fixes hyperledger#320
Fixes hyperledger#319

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
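To illustrate the race described in the commit message, here is a minimal TypeScript sketch of a process that is started in parallel with a dependency and retried a bounded number of times. It is a simplified model, not supervisord's actual logic: the function name `simulateStart`, the fixed delay between attempts, and the 10-second daemon startup time are all assumptions made for illustration (supervisord's real back-off between retries may differ).

```typescript
// Hypothetical, simplified model of "start with bounded retries".
// This is NOT supervisord's implementation; names and timings are assumptions.

interface StartResult {
  started: boolean; // true if an attempt happened after the dependency was ready
  attempts: number; // number of start attempts made (initial attempt + retries)
}

/**
 * Simulates a dependent process (e.g. a network bootstrap script) that is
 * launched at t=0 alongside a dependency (e.g. a daemon) that only becomes
 * ready at `readyAtSec`. Every failed attempt is followed by a fixed wait.
 */
function simulateStart(
  readyAtSec: number,
  maxRetries: number,
  delaySec: number,
): StartResult {
  let now = 0;
  // Assumption: one initial attempt plus up to `maxRetries` retries.
  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    if (now >= readyAtSec) {
      return { started: true, attempts: attempt };
    }
    now += delaySec; // wait before the next attempt
  }
  return { started: false, attempts: maxRetries + 1 };
}

// Assumed scenario: the dependency needs 10 s to come online.
console.log(simulateStart(10, 4, 1)); // gives up before the dependency is ready
console.log(simulateStart(10, 20, 1)); // keeps retrying long enough to succeed
```

In this model a low retry budget makes the outcome depend on how quickly the dependency happens to start, which would match the intermittent CI failures reported above; a larger budget widens the window in which a slow start is still tolerated.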

petermetz added the bug, Fabric, and dependencies labels Apr 27, 2021

petermetz self-assigned this Sep 2, 2021

petermetz added this to the v0.10.0 milestone Sep 2, 2021

petermetz mentioned this issue Sep 3, 2021 in pull request #1300, "fix(test): flaky fabric AIO container boot #876" (merged)

petermetz closed this as completed in #1300 Sep 7, 2021
