-
Notifications
You must be signed in to change notification settings - Fork 480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add management acl token and save token info to file #4126
Conversation
Signed-off-by: Valina Li <valina.li@intel.com>
Signed-off-by: Valina Li <valina.li@intel.com>
Signed-off-by: Valina Li <valina.li@intel.com>
Signed-off-by: Valina Li <valina.li@intel.com>
Signed-off-by: Valina Li <valina.li@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the feature.
I tested the snap. A fresh installation works but an upgrade from an older version fails. I think it boils down to consul bootstrapper failure on an upgrade, preventing the deployment of the management token:
failed to re-setup registry ACL: on 2nd time or later, failed to re-set up agent token: failed to check the agent token persistence: failed to check agent ACL token persistence property with status code= 403: ACL not found
I released a snap from your code to try:
$ snap install edgexfoundry --channel=latest/beta
edgexfoundry (beta) 2.3.0-dev.44 from Canonical✓ installed
# upgrading from 2.3.0-dev.44 to a revision built for this PR
$ snap refresh edgexfoundry --channel=latest/edge/pr-4126
error: cannot perform the following tasks:
- Start snap "edgexfoundry" (3919) services (systemctl command [start snap.edgexfoundry.security-proxy-setup.service] failed with exit status 1: Job for snap.edgexfoundry.security-proxy-setup.service failed because the control process exited with error code.
See "systemctl status snap.edgexfoundry.security-proxy-setup.service" and "journalctl -xeu snap.edgexfoundry.security-proxy-setup.service" for details.
)
Logs (security-proxy-setup):
edgexfoundry.security-proxy-setup[77946]: level=INFO ts=2022-08-17T09:50:08.939690132Z app=security-proxy-setup source=service.go:503 msg="successful to set up route authentication for `core-data`"
edgexfoundry.security-proxy-setup[77946]: level=INFO ts=2022-08-17T09:50:08.94246337Z app=security-proxy-setup source=service.go:382 msg="proxy service for consul has been set up"
edgexfoundry.security-proxy-setup[77946]: level=INFO ts=2022-08-17T09:50:08.942541798Z app=security-proxy-setup source=service.go:161 msg="try to enable service plugin for core-consul"
edgexfoundry.security-proxy-setup[77946]: level=INFO ts=2022-08-17T09:50:09.085600475Z app=security-proxy-setup source=serviceplugin.go:115 msg="successful to get service pluginID of plugin request-transformer for service consul"
edgexfoundry.security-proxy-setup[77946]: level=INFO ts=2022-08-17T09:50:09.102504921Z app=security-proxy-setup source=serviceplugin.go:149 msg="successfully deleted service plugin for service consul"
edgexfoundry.security-proxy-setup[77946]: level=ERROR ts=2022-08-17T09:50:09.10256291Z app=security-proxy-setup source=service.go:163 msg="failed to enable service plugin for core-consul: cannot retrieve Consul access token: access token file /var/snap/edgexfoundry/3919/secrets/consul-acl-token/mgmt_token.json not found"
edgexfoundry.security-proxy-setup[77946]: level=ERROR ts=2022-08-17T09:50:09.102574051Z app=security-proxy-setup source=init.go:53 msg="cannot retrieve Consul access token: access token file /var/snap/edgexfoundry/3919/secrets/consul-acl-token/mgmt_token.json not found"
systemd[1]: snap.edgexfoundry.security-proxy-setup.service: Main process exited, code=exited, status=1/FAILURE
Logs (security-consul-bootstrapper):
systemd[1]: Starting Service for snap application edgexfoundry.security-consul-bootstrapper...
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.759673927Z app=security-bootstrapper source=config.go:391 msg="Loaded service configuration from /var/snap/edgexfoundry/3919/config/security-bootstrapper/res/configuration.toml"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.760007164Z app=security-bootstrapper source=variables.go:352 msg="Variables override of 'StageGate.Registry.ACL.SentinelFilePath' by environment variable: STAGEGATE_REGISTRY_ACL_SENTINELFILEPATH=/var/snap/edgexfoundry/3919/consul/config/consul_acl_done"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.760022312Z app=security-bootstrapper source=variables.go:352 msg="Variables override of 'StageGate.Registry.Host' by environment variable: STAGEGATE_REGISTRY_HOST=localhost"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.760027842Z app=security-bootstrapper source=variables.go:352 msg="Variables override of 'StageGate.Registry.ACL.ManagementTokenPath' by environment variable: STAGEGATE_REGISTRY_ACL_MANAGEMENTTOKENPATH=/var/snap/edgexfoundry/3919/secrets/consul-acl-token/mgmt_token.json"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.760032611Z app=security-bootstrapper source=variables.go:352 msg="Variables override of 'StageGate.Registry.ACL.SecretsAdminTokenPath' by environment variable: STAGEGATE_REGISTRY_ACL_SECRETSADMINTOKENPATH=/var/snap/edgexfoundry/3919/secrets/edgex-consul/admin/token.json"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.760084289Z app=security-bootstrapper source=config.go:551 msg="Using local configuration from file (4 envVars overrides applied)"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.760108324Z app=security-bootstrapper source=command.go:108 msg="Security bootstrapper running setupRegistryACL"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.761013848Z app=security-bootstrapper source=command.go:569 msg="leader [\"127.0.0.1:8300\"] is elected"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.761024778Z app=security-bootstrapper source=command.go:525 msg="found Consul leader to set up ACL"
edgexfoundry.security-consul-bootstrapper[77748]: level=INFO ts=2022-08-17T09:49:36.761111081Z app=security-bootstrapper source=command.go:498 msg="successfully reconstructed bootstrap ACL token from /tmp/edgex/secrets/consul-acl-token/bootstrap_token.json"
edgexfoundry.security-consul-bootstrapper[77748]: level=ERROR ts=2022-08-17T09:49:36.761664071Z app=security-bootstrapper source=init.go:90 msg="failed to re-setup registry ACL: on 2nd time or later, failed to re-set up agent token: failed to check the agent token persistence: failed to check agent ACL token persistence property with status code= 403: ACL not found"
edgexfoundry.security-consul-bootstrapper[77723]: Wed Aug 17 11:49:36 CEST 2022 failed to set up Consul ACL
systemd[1]: snap.edgexfoundry.security-consul-bootstrapper.service: Deactivated successfully.
systemd[1]: Finished Service for snap application edgexfoundry.security-consul-bootstrapper.
@@ -21,7 +21,7 @@ LogLevel = "DEBUG" | |||
SNIS = [""] | |||
# RequestTimeout for proxy-setup http client caller | |||
RequestTimeout = 10 | |||
AccessTokenFile = "/tmp/edgex/secrets/consul-acl-token/bootstrap_token.json" | |||
AccessTokenFile = "/tmp/edgex/secrets/consul-acl-token/mgmt_token.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the backwards breaking change that @farshidtz talking about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
close but not this one i think., it is the one in security-bootstrapper, we need bootstrap acl token reconstructed in the upgrading path. so we should keep that path in the configuration and environment as well...
Signed-off-by: Valina Li <valina.li@intel.com>
@farshidtz I've updated the snapcraft to add back the bootstraptoken env var, can you give it a try again; can you also try the same step from main branch because the 403 statusCode from consul, I wonder if this issue is already there before this PR. |
Codecov Report
@@ Coverage Diff @@
## main #4126 +/- ##
==========================================
- Coverage 43.61% 43.58% -0.03%
==========================================
Files 120 120
Lines 10630 10657 +27
==========================================
+ Hits 4636 4645 +9
- Misses 5607 5617 +10
- Partials 387 395 +8
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change. I no longer get errors in consul-bootstrapper but proxy-setup still fails for the same reason:
edgexfoundry.security-proxy-setup[22474]: level=ERROR ts=2022-08-18T12:46:55.172278192Z app=security-proxy-setup source=service.go:163 msg="failed to enable service plugin for core-consul: cannot retrieve Consul access token: access token file /var/snap/edgexfoundry/3922/secrets/consul-acl-token/mgmt_token.json not found"
edgexfoundry.security-proxy-setup[22474]: level=ERROR ts=2022-08-18T12:46:55.17228788Z app=security-proxy-setup source=init.go:53 msg="cannot retrieve Consul access token: access token file /var/snap/edgexfoundry/3922/secrets/consul-acl-token/mgmt_token.json not found"
Following the logs, it looks like consul-bootstrapper doesn't get to generate the management token on upgrade. I believe this is because sentinelFile already exists and he function returns (command.go:131) before calling saveACLTokens (command.go:161) which saves the management token (command.go:256).
The last thing that consul-bootstrapper does is to log "setupRegistryACL successfully done" (command.go:129) and return:
...
edgexfoundry.security-consul-bootstrapper[19087]: level=INFO ts=2022-08-18T12:43:09.516111643Z app=security-bootstrapper source=aclroles.go:129 msg="successfully created a role [app-rules-engine] for secretstore"
edgexfoundry.security-consul-bootstrapper[19087]: level=INFO ts=2022-08-18T12:43:09.517119586Z app=security-bootstrapper source=aclroles.go:129 msg="successfully created a role [device-rfid-llrp] for secretstore"
edgexfoundry.security-consul-bootstrapper[19087]: level=INFO ts=2022-08-18T12:43:09.517140195Z app=security-bootstrapper source=command.go:129 msg="setupRegistryACL successfully done"
systemd[1]: snap.edgexfoundry.security-consul-bootstrapper.service: Deactivated successfully.
systemd[1]: Finished Service for snap application edgexfoundry.security-consul-bootstrapper.
The reSetup
function called before return doesn't call saveACLTokens
either.
Thanks @farshidtz for the detailed information. So it looks like the snap's upgrading route has some special case. The reason So it seems to me that the right logic for |
Is this different from how you'd upgrade a native or dockerized deployment? In case of an upgrade, the 1st run is from an older version that doesn't generate the newly introduced management token. Currently, the second and follow up runs don't generate it either. As you pointed out, the second and follow up runs need to save the management token if it doesn't exist:
|
Signed-off-by: Valina Li <valina.li@intel.com>
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
@farshidtz thank you for testing and feedback! I have added create and save management acl token logic during reSetup now, can you verify it. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@farshidtz thank you for testing and feedback! I have added create and save management acl token logic during reSetup now, can you verify it. Thanks!
Thanks for the change @vli11. I tested again and both a fresh install and upgrade work.
For the upgrade, I needed to merge the changes from #4125 into your branch to prevent another upgrade error. That PR is waiting for committer approval to be merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a regression in the /consul Kong route. Discussing offline.
LTGM. Requires simultaneous change to edgex-compose to avoid breaking /consul Kong route. Valina to link to associated PR. |
here is the edgex-compose PR related to this fix: edgexfoundry/edgex-compose#270 |
I have verified the consul kong route in docker environment with edgex-compose PR(edgexfoundry/edgex-compose#270), @farshidtz can you help to verify consul kong route on SNAP environment? thank you! |
I tried and could access Consul via Kong at I added a docs PR to use the management token in Snap's getting started: edgexfoundry/edgex-docs#841 |
great! thank you! |
@farshidtz Thanks for all the great validation work. |
@farshidtz For reference, the token used by Kong can update any part of the KV store, but can't disclose any of the other tokens. |
If your build fails due to your commit message not passing the build checks, please review the guidelines here: https://github.com/edgexfoundry/edgex-go/blob/main/.github/Contributing.md
PR Checklist
Please check if your PR fulfills the following requirements:
BREAKING CHANGE:
describing the break)Testing Instructions
New Dependency Instructions (If applicable)