Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bbl v9.0.0 {jumpbox-address,outputs} can fail #560

Closed
abg opened this issue Apr 29, 2023 · 5 comments · Fixed by #563
Closed

bbl v9.0.0 {jumpbox-address,outputs} can fail #560

abg opened this issue Apr 29, 2023 · 5 comments · Fixed by #563

Comments

@abg
Copy link
Member

abg commented Apr 29, 2023

This is a weird case we started seeing with bbl v9.0.0. Typically we store our bbl-state in an s3 bucket without the .terraform plugins, using cf-deployment-concourse-tasks with the default DELETE_TERRAFORM_PLUGINS=true param.

Most of our bbl state commands like bbl print-env just work without issue. But we found that bbl jumpbox-address and bbl outputs fails with this error:

$ bbl jumpbox-address

Run terraform output --json in vars dir: command execution failed got: exit status 1 stderr:

│ Error: Required plugins are not installed
│
│ The installed provider plugins are not consistent with the packages
│ selected in the dependency lock file:
│   - registry.terraform.io/hashicorp/aws: there is no package for registry.terraform.io/hashicorp/aws 4.65.0 cached in .terraform/providers
│   - registry.terraform.io/hashicorp/tls: there is no package for registry.terraform.io/hashicorp/tls 4.0.4 cached in .terraform/providers
│
│ Terraform uses external plugins to integrate with a variety of different
│ infrastructure services. To download the plugins required for this
│ configuration, run:
│   terraform init

Previously this worked in bbl v8.4.111.

abg added a commit to abg/bosh-bootloader that referenced this issue Apr 29, 2023
Some bbl commands like "jumpbox-address" or "outputs" run "terraform
output" under the hood.

When the working directory points to terraform config files, modern
terraform can fail and complain that plugins are not initialized.

This happens when either plugins were omitted altogether or when the
plugins are for a different architecture (e.g. when consume linux/amd64
bbl-state on a local darwin dev environment).

This change set the working directory to the bbl-state so "terraform
output" should reliably work independent of the plugin state.

Fixes cloudfoundry#560
abg added a commit to abg/bosh-bootloader that referenced this issue Apr 29, 2023
Some bbl commands like "jumpbox-address" or "outputs" run "terraform
output" under the hood. Previously bbl set the working directory of the
terraform command to "$BBL_STATE_DIRECTORY/terraform".

When the working directory of the terraform command contains terraform
config files, modern terraform can fail and complain that plugins are
not initialized.

This errors occurs when either plugins were omitted altogether or when
the plugins are for a different architecture (e.g. when consuming
linux/amd64 bbl-state on a local darwin dev environment).

The change here sets the working directory to the top-level bbl state
directory so "terraform output" will reliably work independent of the
plugin state.

Fixes cloudfoundry#560
abg added a commit to abg/bosh-bootloader that referenced this issue May 1, 2023
Run terraform output commands with the working directory set to the
top-level bbl-state directory rather than the ./terraform subdirectory
to allow this command to succeed even with plugins have not been
initialized.

Fixes cloudfoundry#560
@ramonskie
Copy link
Contributor

ramonskie commented May 1, 2023

yes i noticed this to.
the problem lies in getting the ouptuts from the terraform outputs command.
with the newer terraform versions the plugins are required. and will provide an exit 1 when calling this.
with the older terraform versions it would not complaint and just put error to stderr and continue the ouptus vars.

i thought about maby a check if plugins are not there it would do a terraform int again to get the plugins.
but this seemed a bit counter intuitive in my mind.
and as the plugins are not that big maby it would be wiser to delete them when backing up state

i also have several tasks where we put DELETE_TERRAFORM_PLUGINS=false for the cf-deployment-concourse-tasks

abg added a commit to abg/bosh-bootloader that referenced this issue May 1, 2023
Run terraform output commands with the working directory set to the
top-level bbl-state directory rather than the ./terraform subdirectory
to allow this command to succeed even with plugins have not been
initialized.

Fixes cloudfoundry#560
@abg
Copy link
Member Author

abg commented May 3, 2023

A simple solution seems to be to just run terraform output with a working directory != ./terraform.

I had sent an initial PR (#561) along these lines, but closed it after I remembered "remote terraform state", which is possible to do with, e.g., this plan patch. However, the idea in that PR may not be so terrible. After all in bbl 8.4, deleting plugins in that use case was already broken.

For "local" state, we are already conditionally running terraform output -vars=$pathTo/vars/terraform.tfvars, so maybe changing the working directory in that case isn't too bad.

Alternatively, maybe just unconditionally running terraform init is fine too. If I understand correctly, that command is idempotent, albeit it adds some overhead. Perhaps it could just be scoped to the commands that need to read the terraform state. That handles more cases, but with additional overhead.

Conditionally reading state seems complex, and potentially brittle. Maybe one of these other ideas are reasonable. Thoughts?

ctlong added a commit to cloudfoundry/cf-deployment that referenced this issue May 3, 2023
Keeps terraform plugins when bbl-ing up the baba-yaga-bbr env. These
terraform plugins are now required (as of bbl v9) to run commands like
`bbl jumpbox-address` now, which we need to run when generating the
drats config.

See cloudfoundry/bosh-bootloader#560 for more
info.
@ramonskie
Copy link
Contributor

ramonskie commented May 3, 2023

i think running a terraform inti if terraform state exists and .terraform dir does not exist is probably the best solution for now.

i think we can put some logic here https://github.com/cloudfoundry/bosh-bootloader/blob/main/terraform/executor.go#L302

ctlong added a commit to cloudfoundry/cf-deployment that referenced this issue May 3, 2023
Keeps terraform plugins when bbl-ing up the baba-yaga-bbr env. These
terraform plugins are now required (as of bbl v9) to run commands like
`bbl jumpbox-address` now, which we need to run when generating the
drats config.

See cloudfoundry/bosh-bootloader#560 for more
info.
@ctlong
Copy link
Member

ctlong commented May 4, 2023

❓ Is it expected that bbl jumpbox-address would fail even when I've not removed the plugins in the .terraform directory?


I'd been experiencing the same issue as described above, and figured that adding the plugins would solve the issue. I'm still getting the same error unless I manually run terraform init again though...

The terraform directory within my bbl-state directory appears to have the plugins I need loaded:

$ tree -a -C
.
├── .terraform
│   └── providers
│       └── registry.terraform.io
│           └── hashicorp
│               └── google
│                   └── 4.63.1
│                       └── darwin_amd64
│                           └── terraform-provider-google_v4.63.1_x5
├── .terraform.lock.hcl
└── bbl-template.tf

8 directories, 3 files

But when I upload my environment repo to a concourse worker with BBL_STATE_DIR set to the bbl-state directory in that repo, bbl jumpbox-address still fails:

$ cd <environment path>/bbl-state
$ bbl --version
bbl 9.0.0 (linux/amd64)
$ bbl jumpbox-address

Run terraform output --json in vars dir: command execution failed got: exit status 1 stderr:
 ╷
│ Error: Required plugins are not installed
│
│ The installed provider plugins are not consistent with the packages
│ selected in the dependency lock file:
│   - registry.terraform.io/hashicorp/google: there is no package for registry.terraform.io/hashicorp/google 4.63.1 cached in .terraform/providers
│
│ Terraform uses external plugins to integrate with a variety of different
│ infrastructure services. To download the plugins required for this
│ configuration, run:
│   terraform init
╵

🤔 Weirdly, terraform itself seems to work:

$ cd <environment path>/bbl-state
$ terraform --version
Terraform v1.4.6
on linux_amd64
$ terraform output -json -state=vars/terraform.tfstate
{
  "director__internal_ip": {
    "sensitive": false,
    "type": "string",
    "value": "10.0.0.6"
  },...
  }
}

However, even though I can see that the plugin files are still present in the .terraform directory, they don't seem to be registered properly:

$ cd <environment path>/bbl-state/terraform
$ terraform providers
╷
│ Error: Required plugins are not installed
│
│ The installed provider plugins are not consistent with the packages selected in the dependency lock file:
│   - registry.terraform.io/hashicorp/google: there is no package for registry.terraform.io/hashicorp/google 4.63.1 cached in .terraform/providers
│
│ Terraform uses external plugins to integrate with a variety of different infrastructure services. To download the plugins required for this configuration, run:
│   terraform init
╵

After I run terraform init in <environment path>/bbl-state/terraform, everything just works.

ctlong added a commit to cloudfoundry/cf-deployment-concourse-tasks that referenced this issue May 4, 2023
Adds `terraform init` call to bbl setup in order to ensure that certain
bbl commands succeed.

See cloudfoundry/bosh-bootloader#560 for more
info.
@abg
Copy link
Member Author

abg commented May 4, 2023

Offhand, it looks like you have plugins for darwin_amd64 but are running bbl jumpbox-address on linux_amd64. Is that right?

We do something similar on our team where we might locally bbl up on our workstations (darwin_{amd64,arm64}) and then consume that bbl state in CI (linux_amd64).

Weirdly, terraform itself seems to work:
$ terraform output -json -state=vars/terraform.tfstate
...

Running terraform directly will fail if your working directory is $BBL_STATE_DIRECTORY/terraform. Modern terraform seems to want to load plugins before reading the local state, I suppose in case configuration is pointing to state stored in some blobstore backend.

$ cd $BBL_STATE_DIRECTORY
$ terraform output -json -state=vars/terraform.tfstate
{
  "director__internal_ip": {
    "sensitive": false,
...
$ cd $BBL_STATE_DIRECTORY/terraform
$ terraform output -json -state=vars/terraform.tfstate
...
│ Error: Required plugins are not installed
...

bbl currently sets the working directory to the ./terraform subdirectory, presumably to handle cases where the terraform state is stored remotely.

#561 attempted to fix this problem by just changing the terraform working directory to not be $BBL_STATE_DIRECTORY/terraform. That felt brittle and there are existing bbl plan patches that store terraform state remotely. 🤷‍♂️ Those plan patches don't work for all the bbl workflows - I think bbl destroy still requires local vars/terraform.tfstate, for instance.

#563 attempted to fix this problem by terraform init when plugins are entirely missing. That doesn't fix this case apparently, when plugins are present but for the wrong architecture. So it would potentially works for the cf-deployment-concourse-tasks DELETE_TERRAFORM_PLUGINS=true case, but not when DELETE_TERRAFORM_PLUGINS=false.

Thinking about this more broadly, I'm not sure why bbl reads terraform state for bbl jumpbox-address in the first place. bbl-state.json already appears to have jumpbox.url. Is there a reason we cannot read jumpbox-address the same way we read director-address? I.e. from bbl-state.json. That seems like a much simpler change to make than either of these other PRs. 😑

terraform init may still be useful for bbl outputs, I suppose. Maybe that should be run "unconditionally" before reading terraform state to pave over these kinds of cases.

ctlong added a commit to cloudfoundry/cf-deployment-concourse-tasks that referenced this issue May 4, 2023
Runs `terraform init` if we're not deleting the terraform plugins, in
order to make sure that all of the relevant plugins have been downloaded
(at least for the architecture of the concourse container, likely
linux).

See cloudfoundry/bosh-bootloader#560 for more
info.
ctlong added a commit to cloudfoundry/cf-deployment that referenced this issue May 5, 2023
Removes the network-lb-gcp plan patch from baba-yaga-bbr. This plan
patch has dubious value, and is broken with the new bbl. We should rip
it out and see if everything still works.

Changes the way that bbr group acquires and releases the lock for the
env. Previously, `add_claimed` and `remove` were used, which was
creating a whole new lockfile and then deleting it.. not sure why we
wanted to do that. So this change switches to using `acquire` and
`release`, which should be a valid replacement because we only expect
there to be one lock file in this pool.

Keeps terraform plugins when bbl-ing up the baba-yaga-bbr env. These
terraform plugins are now required (as of bbl v9) to run commands like
`bbl jumpbox-address` now, which we need to run when generating the
drats config.

See cloudfoundry/bosh-bootloader#560 for more
info.
ctlong added a commit to cloudfoundry/cf-deployment-concourse-tasks that referenced this issue May 5, 2023
Runs `terraform init` if we're not deleting the terraform plugins, in
order to make sure that all of the relevant plugins have been downloaded
(at least for the architecture of the concourse container, likely
linux).

See cloudfoundry/bosh-bootloader#560 for more
info.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
4 participants