[BUG] Upgrading from OpenSearch 1.0.0 to OpenSearch 2.0.0-rc1 fails #3001

Closed
cliu123 opened this issue Apr 20, 2022 · 8 comments
Labels
bug (Something isn't working), untriaged, v2.0.0 (Version 2.0.0)

Comments

@cliu123
Member

cliu123 commented Apr 20, 2022

Describe the bug
Upgrading a cluster from OpenSearch 1.0.0 to OpenSearch 2.0.0-rc1 fails when the upgraded node tries to join the cluster.

To Reproduce
Steps to reproduce the behavior:
Upgrade the BWC test to 2.0.0 (Changes).

Expected behavior
Upgrade succeeds.

Actual behavior
The upgrade fails when a node joins the cluster.
Failing GitHub Action: https://github.com/cliu123/security/runs/6099205094?check_suite_focus=true
Error:

»  Caused by: java.lang.IllegalStateException: Received message from unsupported version: [2.0.0] minimal compatible version is: [6.8.0]
»  	at org.opensearch.transport.TransportHandshaker$HandshakeResponseHandler.handleResponse(TransportHandshaker.java:168) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.TransportHandshaker$HandshakeResponseHandler.handleResponse(TransportHandshaker.java:144) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:266) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.InboundHandler.handleResponse(InboundHandler.java:258) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.InboundHandler.messageReceived(InboundHandler.java:146) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:102) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:713) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:155) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:130) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:95) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
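
When triaging BWC logs like this one, it can help to pull the two versions out of the `IllegalStateException` message above programmatically. The following is a minimal sketch that assumes only the message format shown in this log; `HandshakeFailureParser` and `parse` are hypothetical names for illustration and are not part of OpenSearch.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage helper (not part of OpenSearch): extracts the versions
// from a "Received message from unsupported version" log line.
final class HandshakeFailureParser {

    // Assumes the message format seen in the log above.
    private static final Pattern UNSUPPORTED_VERSION = Pattern.compile(
        "Received message from unsupported version: \\[([^\\]]+)\\] "
            + "minimal compatible version is: \\[([^\\]]+)\\]");

    /** Returns {receivedVersion, minimalCompatibleVersion}, or empty if the line does not match. */
    static Optional<String[]> parse(String logLine) {
        Matcher m = UNSUPPORTED_VERSION.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2) });
    }

    public static void main(String[] args) {
        String line = "Caused by: java.lang.IllegalStateException: Received message from "
            + "unsupported version: [2.0.0] minimal compatible version is: [6.8.0]";
        // Prints the two versions if the line matches; prints nothing otherwise.
        parse(line).ifPresent(v ->
            System.out.println("received=" + v[0] + " minimalCompatible=" + v[1]));
    }
}
```

For the line in this report, the sketch yields `2.0.0` as the received version and `6.8.0` as the minimal compatible version.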

Plugins
Security plugin

cliu123 added the bug and untriaged labels on Apr 20, 2022
peternied added the v2.0.0 and untriaged labels and removed the untriaged label on Apr 20, 2022
@cliu123
Member Author

cliu123 commented Apr 20, 2022

Thanks @VachaShah for looking into this.

@VachaShah
Collaborator

From the logs, it looks like the fullRestartCluster succeeds, and then this error shows up in a mixed cluster scenario:

> Task :securityBwcCluster#oldVersionClusterTask1
> Task :securityBwcCluster#fullRestartClusterTask
> Task :securityBwcCluster#oldVersionClusterTask0
> Task :securityBwcCluster#mixedClusterTask

=== Standard output of node `node{::securityBwcCluster0-0}` ===

»    ↓ errors and warnings from /home/runner/work/security/security/bwc-test/build/testclusters/securityBwcCluster0-0/logs/opensearch.stdout.log ↓
» WARN ][o.o.s.OpenSearchSecurityPlugin] [securityBwcCluster0-0] OpenSearch Security plugin installed but disabled. This can expose your configuration (including passwords) to the public.
»   ↑ repeated 2 times ↑
» WARN ][o.o.g.DanglingIndicesState] [securityBwcCluster0-0] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
»   ↑ repeated 2 times ↑
» WARN ][o.o.d.FileBasedSeedHostsProvider] [securityBwcCluster0-0] expected, but did not find, a dynamic hosts list at [/home/runner/work/security/security/bwc-test/build/testclusters/securityBwcCluster0-0/config/unicast_hosts.txt]
»   ↑ repeated 16 times ↑
» WARN ][o.o.c.c.ClusterFormationFailureHelper] [securityBwcCluster0-0] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [securityBwcCluster0-0, securityBwcCluster0-1, securityBwcCluster0-2] to bootstrap a cluster: have discovered [{securityBwcCluster0-0}{9eB6z49MSTCp0whTMicsDw}{lzHHafsyTrq3Qs96JIfc7A}{127.0.0.1}{127.0.0.1:46033}{dimr}{testattr=test}]; discovery will continue using [] from hosts providers and [{securityBwcCluster0-0}{9eB6z49MSTCp0whTMicsDw}{lzHHafsyTrq3Qs96JIfc7A}{127.0.0.1}{127.0.0.1:46033}{dimr}{testattr=test}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
» WARN ][stderr                   ] [securityBwcCluster0-0] SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
» WARN ][stderr                   ] [securityBwcCluster0-0] SLF4J: Defaulting to no-operation (NOP) logger implementation
» WARN ][stderr                   ] [securityBwcCluster0-0] SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
» WARN ][o.o.c.c.JoinHelper       ] [securityBwcCluster0-0] last failed join attempt was 943ms ago, failed to join {securityBwcCluster0-1}{lueuR1abTimqG95efw9WCQ}{WfX1DwzkTny81RI5cM98EA}{127.0.0.1}{127.0.0.1:33039}{dimr}{testattr=test} with JoinRequest{sourceNode={securityBwcCluster0-0}{9eB6z49MSTCp0whTMicsDw}{Fmr_h_ooRBO4dMW25KGBvw}{127.0.0.1}{127.0.0.1:43249}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}, minimumTerm=1, optionalJoin=Optional.empty}
»  org.opensearch.transport.RemoteTransportException: [securityBwcCluster0-1][127.0.0.1:33039][internal:cluster/coordination/join]
»  Caused by: org.opensearch.transport.ConnectTransportException: [securityBwcCluster0-0][127.0.0.1:43249] general node connection failure
»  	at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.lambda$onResponse$2(TcpTransport.java:981) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.action.ActionListener$1.onFailure(ActionListener.java:84) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.TransportHandshaker$HandshakeResponseHandler.handleResponse(TransportHandshaker.java:167) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.TransportHandshaker$HandshakeResponseHandler.handleResponse(TransportHandshaker.java:144) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]
»  	at org.opensearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:266) ~[opensearch-2.0.0-rc1-SNAPSHOT.jar:2.0.0-rc1-SNAPSHOT]

@dblock
Member

dblock commented Apr 20, 2022

Was this caught by automation?

@cliu123
Member Author

cliu123 commented Apr 20, 2022

> Was this caught by automation?

Yes, BWC tests.

@VachaShah
Collaborator

@cliu123 Can you add here the upgrade paths that succeeded?

@cliu123
Member Author

cliu123 commented Apr 20, 2022

@VachaShah
Collaborator

VachaShah commented Apr 20, 2022

Thank you very much @saratvemulapalli for pointing this out: there was a bug in 1.0 that was patched in 1.0.1 with commit 7f524c7. So you can test the upgrade from 1.0.1 to 2.0.0 instead of from 1.0.0 to 2.0.0.

@cliu123
Member Author

cliu123 commented Apr 20, 2022

Thanks @VachaShah for pointing that out! The OpenSearch version was 1.0.0, but the security plugin version was 1.0.1.0. I have corrected the issue title.

cliu123 changed the title from "[BUG] Upgrading from OpenSearch 1.0.1 to OpenSearch 2.0.0-rc1 fails" to "[BUG] Upgrading from OpenSearch 1.0.0 to OpenSearch 2.0.0-rc1 fails" on Apr 20, 2022
cliu123 closed this as completed on Apr 21, 2022
4 participants