
[ISSUE] Cannot move cluster to auto-az #1700

Open

tlecomte opened this issue Oct 27, 2022 · 6 comments

Labels
suppress diff: issue related to configuration drift, most likely from Cluster Manager

Comments

@tlecomte

Configuration

resource "databricks_cluster" "test_cluster" {
  cluster_name            = "test_cluster"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = "r5.large"
  driver_node_type_id     = "r5.large"
  autotermination_minutes = 20
  num_workers             = 5
  aws_attributes {
    availability           = "SPOT"
    zone_id                = "auto"
    first_on_demand        = 0
    spot_bid_price_percent = 100
    ebs_volume_type        = "GENERAL_PURPOSE_SSD"
    ebs_volume_count       = 1
    ebs_volume_size        = 100
  }
  enable_elastic_disk = true
}
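
The data.databricks_spark_version.latest_lts reference above is not shown in this snippet; a minimal sketch of that data source, assuming the provider's standard long_term_support filter, would be:

data "databricks_spark_version" "latest_lts" {
  long_term_support = true
}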

Expected Behavior

Changing "zone_id" from a specific value (ex: "us_east_1a") to "auto" actually changes the zone configuration for this cluster.

Actual Behavior

Changing "zone_id" from a specific value (ex: "us_east_1a") to "auto" does nothing. The zone stays on the previous specific value. In our case, this blocks us because we cannot start the cluster in this zone as we get AWS insufficient capacity errors. So we really want to move to "auto", even if that means recreating the cluster.

It seems this behaviour was introduced with #937 to prevent Terraform from restarting a cluster whose AwsAttributes.zone_id is "auto", since that restart is unneeded and unwanted. However, we would argue that an explicit change of the zone should still be applied.
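
If recreating the cluster is acceptable, one possible workaround (an assumption on our side, not something we have confirmed) is to force replacement of the resource, so the suppressed in-place diff never comes into play:

terraform apply -replace="databricks_cluster.test_cluster"

With -replace (available since Terraform 0.15.2), the cluster is destroyed and recreated from the configuration, picking up zone_id = "auto".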

Steps to Reproduce

  1. Define a cluster with a specific zone, e.g. aws_attributes { zone_id = "us-east-1a" }
  2. Apply
  3. Change the cluster to auto-az, i.e. aws_attributes { zone_id = "auto" } (see the snippets after this list)
  4. Apply
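
Only the aws_attributes block changes between the two applies; a minimal sketch of the relevant fragment (other attributes as in the configuration above):

# step 1: pinned availability zone
aws_attributes {
  zone_id = "us-east-1a"
}

# step 3: switch to automatic AZ selection
aws_attributes {
  zone_id = "auto"
}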

Terraform and provider versions

databricks/databricks 1.6.1

@GantZA commented Oct 28, 2022

We are experiencing this issue as well, across 80+ AWS clusters. We have to edit the attribute manually in Databricks.

@nfx added the suppress diff label Nov 15, 2022
@nfx (Contributor) commented Nov 15, 2022

Thank you for the feature request! The team currently operates at limited capacity and has to prioritize carefully, so we cannot provide a timeline for implementing this feature. Please make a Pull Request if you'd like to see this feature sooner, and we'll guide you through the journey.

@Israphel

I have tried using auto in my databricks_job (which also creates clusters) and it works. The docs don't mention it, but it worked. I was on provider 1.6.5 when I tried it.
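
A minimal sketch of such a job cluster (the resource name and sizing here are illustrative assumptions, not the actual configuration):

resource "databricks_job" "example" {
  name = "example-job"
  new_cluster {
    num_workers   = 1
    spark_version = data.databricks_spark_version.latest_lts.id
    node_type_id  = "r5.large"
    aws_attributes {
      availability = "SPOT"
      zone_id      = "auto" # accepted here even though the docs do not mention it
    }
  }
}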

@tejas-angelone commented Jun 11, 2024

I am currently facing this issue when deploying a Databricks Asset Bundle through GitHub CI/CD.

I am using databricks/setup-cli@main at version 0.221.1 to deploy the bundle.

I have to update zone_id to auto manually.

@nchammas

Confirming this is still an issue on the latest provider version:

$ terraform version
Terraform v1.8.5
on darwin_amd64
+ provider registry.terraform.io/databricks/databricks v1.47.0

If the cluster exists and is set to a non-auto availability zone, but your template specifies auto, Terraform does not update the cluster.

@tejas-angelone commented Jun 12, 2024

Using databricks/setup-cli@main at v0.219.0 works fine: I can update from a non-auto AZ to auto.

That CLI version uses the following versions:

Terraform: 1.5.5
Databricks provider: 1.40.0
