Dynamic Shovels: support old (pre-3.13.0, 3.12.8) and new supervisor child ID formats #9909

Merged (2 commits), Nov 14, 2023

michaelklishin
Member

During a rolling upgrade, the cluster nodes taken together may (and usually will, because Shovels migrate as nodes restart) contain mirrored_supervisor children whose IDs use two different formats (see the commits referenced below).

The old format should not trip up node startup, so new nodes must accept it in a few places, and try to use these older values during dynamic Shovel spec cleanup.

References ccc22cb, 5f0981c, #9785.

See #9894.
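
As a rough illustration of what "accepting both formats" means for a lookup, here is a minimal Python model. It is not the Erlang code from this PR. The ID shapes are assumptions inferred from the log posted in this conversation (old ID: the bare vhost/name pair; new ID: a path-like component paired with that name), and `old_id`, `new_id` and `child_exists` are hypothetical names.

```python
from typing import Iterable, Tuple, Union

ShovelName = Tuple[str, str]                      # (vhost, shovel name)
NewChildId = Tuple[Tuple[str, str], ShovelName]   # assumed newer ID shape
ChildId = Union[ShovelName, NewChildId]


def old_id(name: ShovelName) -> ChildId:
    """Assumed older format: the (vhost, name) pair itself."""
    return name


def new_id(name: ShovelName) -> ChildId:
    """Assumed newer format: a path-like component plus the name pair."""
    vhost, shovel = name
    return ((vhost, shovel), name)


def child_exists(name: ShovelName, child_ids: Iterable[ChildId]) -> bool:
    """True if a child is registered for `name` under either ID format."""
    wanted = {old_id(name), new_id(name)}
    return any(child_id in wanted for child_id in child_ids)
```

In a mixed-version cluster the child list may hold both shapes, so a check such as `child_exists(("/", "my-dynamic"), ids)` succeeds whether the Shovel was started by an older or a newer node.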

@michaelklishin michaelklishin changed the title from "Dynamic Shovels: support old and new supervisor child ID formats" to "Dynamic Shovels: support old (pre-3.13.0, 3.12.8) and new supervisor child ID formats" Nov 13, 2023
@michaelklishin michaelklishin marked this pull request as draft November 13, 2023 03:55
@michaelklishin
Member Author

This fails with

2023-11-13 03:51:06.204508+00:00 [debug] <0.1576.0> Asked to start a dynamic Shovel named 'my-dynamic' in virtual host 'v'
2023-11-13 03:51:06.204978+00:00 [debug] <0.1566.0> Shovel 'my-dynamic' connected to destination
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>   crasher:
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>     initial call: cowboy_stream_h:request_process/3
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>     pid: <0.1576.0>
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>     registered_name: []
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>     exception error: {khepri_ex,invalid_path,
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>                                 #{path =>
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>                                       [rabbit_db_msup,
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>                                        mirrored_supervisor_childspec,
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>                                        rabbit_shovel_dyn_worker_sup_sup,
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>                                        [<<"/">>,<<"my-dynamic">>],
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>                                        {<<"/">>,<<"my-dynamic">>}],
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>                                   component => [<<"/">>,<<"my-dynamic">>]}}
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>       in function  khepri_path:ensure_is_valid/1 (khepri_path.erl, line 611)
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>       in call from khepri_machine:fold/5 (khepri_machine.erl, line 170)
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>       in call from khepri_adv:get/3 (khepri_adv.erl, line 167)
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>       in call from khepri:get/3 (khepri.erl, line 737)
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>       in call from rabbit_db_msup:find_mirror_in_khepri/2 (rabbit_db_msup.erl, line 218)
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>       in call from mirrored_supervisor:find_call/3 (mirrored_supervisor.erl, line 206)
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>       in call from rabbit_shovel_dyn_worker_sup_sup:'-cleanup_specs/0-fun-3-'/2 (rabbit_shovel_dyn_worker_sup_sup.erl, line 97)
2023-11-13 03:51:06.204779+00:00 [error] <0.1576.0>       in call from sets:fold_bucket/3 (sets.erl, line 512)

when Khepri is enabled.

So that old-format child IDs do not make it to Khepri.
@michaelklishin michaelklishin marked this pull request as ready for review November 13, 2023 04:29
@gomoripeti
Contributor

Tested with an upgrade from 3.12.6 -> 3.12.8+patch. Everything worked fine.
(rabbit_khepri is undefined on 3.12, so that needs to be taken care of when backporting.)

@michaelklishin
Member Author

Yes, rabbit_khepri does not exist in 3.12, but neither does the problem of the schema database rejecting some parts of the key. Mnesia would accept anything, so the entire condition can be dropped in the backport.
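
The reasoning in this exchange can be sketched as follows. This is a simplified Python model rather than the plugin's Erlang code: `specs_to_clean_up` and its parameters are hypothetical, and it assumes the condition under discussion is a check for whether Khepri is enabled.

```python
from typing import Iterable, Set, Tuple

ShovelName = Tuple[str, str]  # (vhost, shovel name)


def specs_to_clean_up(
    old_style: Iterable[ShovelName],
    new_style: Iterable[ShovelName],
    configured: Iterable[ShovelName],
    khepri_enabled: bool,
) -> Set[ShovelName]:
    """Names of supervisor children with no matching Shovel definition."""
    candidates = set(new_style)
    if not khepri_enabled:
        # Assumption: old-format IDs are only considered when the schema
        # database is Mnesia, which accepts them. With Khepri they are
        # skipped so they never reach a Khepri path lookup.
        candidates |= set(old_style)
    return candidates - set(configured)
```

On a release line without Khepri the flag would always be false, which is why the branch can collapse into an unconditional union there.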

@michaelklishin michaelklishin merged commit 2adec32 into main Nov 14, 2023
15 checks passed
@michaelklishin michaelklishin deleted the rabbitmq-server-9894 branch November 14, 2023 12:10
michaelklishin added a commit that referenced this pull request Nov 14, 2023
michaelklishin added a commit that referenced this pull request Nov 15, 2023
Dynamic Shovels: support old (pre-3.13.0, 3.12.8) and new supervisor child ID formats (backport #9909)
gomoripeti pushed a commit to cloudamqp/rabbitmq-server that referenced this pull request Dec 8, 2023