Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added databricks_mlflow_model and databricks_mlflow_experiment resources #931

Merged
merged 34 commits into from
Nov 24, 2021

Conversation

jensendw
Copy link
Contributor

@jensendw jensendw commented Nov 19, 2021

Provides basic support for managing MLflow models and experiments, should hopefully address #861 as well.

databricks_mlflow_experiment

resource "databricks_mlflow_experiment" "test" {
  name = "/Users/myuserid/my-experiment"
  artifact_location = "dbfs:/tmp/my-experiment"
  description = "My MLflow experiment description"
}

Argument Reference

The following arguments are supported:

  • name - (Required) Name of MLflow experiment. It must be an absolute path within the Databricks workspace, e.g. /Users/<some-username>/my-experiment. For more information about changes to experiment naming conventions, see mlflow docs.
  • artifact_location - Path to dbfs:/ or s3:// artifact location of the MLflow experiment.
  • description - The description of the MLflow experiment.

databricks_mlflow_model

resource "databricks_mlflow_model" "test" {
  name = "My MLflow Model"
  description = "My MLflow model description"
  tags {
    key   = "key1"
    value = "value1"
  }
  tags {
    key   = "key2"
    value = "value2"
  }
}

Argument Reference

The following arguments are supported:

  • name - (Required) Name of MLflow model.
  • description - The description of the MLflow model.
  • tags - Tags for the MLflow model.

Daniel Jensen and others added 28 commits November 15, 2021 12:59
Correct case mismatch.
Correct case mismatch.
…orm-provider-databricks into mlflow-experiment-update
Included artifact_location in experiment and updated docs
* including artifact_location

* removing force_new from api struct

* Refactor API objects and struct names.

* Refactor API objects and struct names.

* Improve docs.
Fix acceptance test.

Co-authored-by: nsoni2 <navin_soni@intuit.com>
Co-authored-by: Jeffrey Alvarez <Jeffrey_Alvarez@intuit.com>
Co-authored-by: soninavin <soni.navin@gmail.com>
* Remove toAPIObject.
Refactor tests.

* Remove toAPIObject.
Refactor tests.

Co-authored-by: Jeffrey Alvarez <Jeffrey_Alvarez@intuit.com>
@nfx nfx self-requested a review November 19, 2021 21:38
Copy link
Contributor

@nfx nfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. don't create api package, unless it's reused or big. move api wrapper to resource file instead to keep consistency.
  2. comments for experiment resource are the same as for model resource.

docs/index.md Outdated
@@ -48,6 +48,10 @@ Databricks SQL
* Manage [dashboards](resources/sql_dashboard.md) and their [widgets](resources/sql_widget.md).
* Provide [global configuration for all SQL Endpoints](docs/resources/sql_global_config.md)

MLFlow
* Create [MLFlow models](resources/mlflow_models.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. rename mlflow_models.md to mlflow_model.md
  2. also add an entry to README.md

docs/index.md Outdated
@@ -48,6 +48,10 @@ Databricks SQL
* Manage [dashboards](resources/sql_dashboard.md) and their [widgets](resources/sql_widget.md).
* Provide [global configuration for all SQL Endpoints](docs/resources/sql_global_config.md)

MLFlow
* Create [MLFlow models](resources/mlflow_models.md).
* Create [MLFlow experiments](resources/mlflow_experiments.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mlflow_experiments.md to mlflow_experiment.md


The following arguments are supported:

* `name` - (Required) Name of MLflow experiment. It must be an absolute path within the Databricks workspace, e.g. `/Users/<some-username>/my-experiment`. For more information about changes to experiment naming conventions, see https://docs.databricks.com/applications/mlflow/experiments.html#experiment-migration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hide url in [..](url)

The following arguments are supported:

* `name` - (Required) Name of MLflow experiment. It must be an absolute path within the Databricks workspace, e.g. `/Users/<some-username>/my-experiment`. For more information about changes to experiment naming conventions, see https://docs.databricks.com/applications/mlflow/experiments.html#experiment-migration.
* `artifact_location` - Path to dbfs:/ or s3:// artificate location of the MLflow experiment.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: artificate

})

return common.Resource{
Create: func(ctx context.Context, data *schema.ResourceData, c *common.DatabricksClient) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make sure that ResourceData is always named d - d *schema.ResourceData. it's a convention.

// MLFlowModelAPI defines the response object from the API
type Model struct {
Name string `json:"name"`
CreationTimestamp int64 `json:"creation_timestamp,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add tf:"computed"

type Model struct {
Name string `json:"name"`
CreationTimestamp int64 `json:"creation_timestamp,omitempty"`
LastUpdatedTimestamp int64 `json:"last_updated_timestamp,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add tf:"computed"

Name string `json:"name"`
CreationTimestamp int64 `json:"creation_timestamp,omitempty"`
LastUpdatedTimestamp int64 `json:"last_updated_timestamp,omitempty"`
UserID string `json:"user_id,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add tf:"computed"

CreationTimestamp int64 `json:"creation_timestamp,omitempty"`
LastUpdatedTimestamp int64 `json:"last_updated_timestamp,omitempty"`
UserID string `json:"user_id,omitempty"`
LatestVersions []string `json:"latest_versions,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add tf:"computed"


// Read ...
func (a ModelAPI) Read(modelName string) (*Model, error) {
//var d MLFLowModelAPI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove commented irrelevant code

@codecov
Copy link

codecov bot commented Nov 19, 2021

Codecov Report

Merging #931 (e269506) into master (5ad456b) will increase coverage by 0.02%.
The diff coverage is 87.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #931      +/-   ##
==========================================
+ Coverage   85.30%   85.32%   +0.02%     
==========================================
  Files          93       95       +2     
  Lines        8582     8663      +81     
==========================================
+ Hits         7321     7392      +71     
- Misses        773      778       +5     
- Partials      488      493       +5     
Impacted Files Coverage Δ
mlflow/resource_model.go 84.61% <84.61%> (ø)
mlflow/resource_experiment.go 90.00% <90.00%> (ø)
provider/provider.go 94.06% <100.00%> (+0.10%) ⬆️

Daniel Jensen and others added 3 commits November 19, 2021 14:28
Co-authored-by: Jeffrey Alvarez <Jeffrey_Alvarez@intuit.com>
@kuritsu
Copy link
Contributor

kuritsu commented Nov 22, 2021

@nfx Please check our new changes, trying to address all your suggestions.
Regarding the tf:"computed" suggestion, is this really needed as this is a DTO (Data Transfer Object) for only talking to the MLFlow API, but not the Terraform internal state?

Copy link
Contributor

@nfx nfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better! :) please see more comments

}

// ExperimentAPI ...
type ExperimentAPI struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to ExperimentsAPI (e.g. almost all APIs are plurals of their entities)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return a.client.Post(a.context, "/mlflow/experiments/delete", d, &d)
}

///func ResourceMLFlowExperiment() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove commented code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return err
}

data.Set("name", ad.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need to do d.Set(...), because it's already handled by StructToData

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

data.SetId(ad.ExperimentId)
return nil
},
Read: func(ctx context.Context, data *schema.ResourceData, c *common.DatabricksClient) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please follow conventions of this software project: replace data *schema.ResourceData with d *schema.ResourceData. anyone working on other resources would immediately be confused with this argument naming.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

data.SetId(ad.ExperimentId)
return nil
},
Update: func(ctx context.Context, data *schema.ResourceData, c *common.DatabricksClient) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data -> d

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

// Update ...
func (a ModelAPI) Update(d *ModelDto) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

variable naming

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


return common.Resource{
Create: func(ctx context.Context, data *schema.ResourceData, c *common.DatabricksClient) error {
var ad ModelDto
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ad has no semantic meaning, while model does

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return nil
},
Read: func(ctx context.Context, data *schema.ResourceData, c *common.DatabricksClient) error {
var d Model
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

},
Read: func(ctx context.Context, data *schema.ResourceData, c *common.DatabricksClient) error {
var d Model
ad, err := NewModelAPI(ctx, c).Read(data.Id())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

model, err := ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return err
}

if err := common.StructToData(d, s, data); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrite it with return common.StructToData(model, s, d)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@nfx
Copy link
Contributor

nfx commented Nov 23, 2021

@kuritsu @jensendw added more comments.

Co-authored-by: Jeffrey Alvarez <Jeffrey_Alvarez@intuit.com>
@kuritsu
Copy link
Contributor

kuritsu commented Nov 23, 2021

@nfx Round 2 of suggestions addressed. Please check.

Copy link
Contributor

@nfx nfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kuritsu We're almost there! Make Delete() method accept a string argument and remove copy-pasted comment.

And merge in/rebase onto master branch.

And then I'm glad to squash and merge this PR

},
Delete: func(ctx context.Context, d *schema.ResourceData, c *common.DatabricksClient) error {
e := Experiment{ExperimentId: d.Id()}
return NewExperimentsAPI(ctx, c).Delete(&e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you refactor Delete method to accept just a string id? E.g. we don't need any other fields than ID to delete an entity. And within that method you'll probably use that struct, but outer interface would be simple and consistent with the rest of resources.

Consistency is key, as eventually we'll try to migrate to go1.17 generics and Delete is one of the most similar methods everywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Description string `json:"description,omitempty"`
}

// ResourceDashboard ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe remove this copy-pasted comment?...

Copy link
Contributor

@nfx nfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@nfx nfx changed the title MLflow support Added databricks_mlflow_model and databricks_mlflow_experiment Nov 24, 2021
@nfx nfx changed the title Added databricks_mlflow_model and databricks_mlflow_experiment Added databricks_mlflow_model and databricks_mlflow_experiment resources Nov 24, 2021
@nfx nfx merged commit a796ffa into databricks:master Nov 24, 2021
@nfx nfx added this to the v0.4.0 milestone Nov 24, 2021
michael-berk pushed a commit to michael-berk/terraform-provider-databricks that referenced this pull request Feb 15, 2023
…sources (databricks#931)

Provides basic support for managing MLflow models and experiments, should hopefully address databricks#861 as well.

# databricks_mlflow_experiment

```hcl
resource "databricks_mlflow_experiment" "test" {
  name = "/Users/myuserid/my-experiment"
  artifact_location = "dbfs:/tmp/my-experiment"
  description = "My MLflow experiment description"
}
```

## Argument Reference

The following arguments are supported:

* `name` - (Required) Name of MLflow experiment. It must be an absolute path within the Databricks workspace, e.g. `/Users/<some-username>/my-experiment`. For more information about changes to experiment naming conventions, see [mlflow docs](https://docs.databricks.com/applications/mlflow/experiments.html#experiment-migration).
* `artifact_location` - Path to dbfs:/ or s3:// artifact location of the MLflow experiment.
* `description` - The description of the MLflow experiment.

# databricks_mlflow_model

```hcl
resource "databricks_mlflow_model" "test" {
  name = "My MLflow Model"
  description = "My MLflow model description"
  tags {
    key   = "key1"
    value = "value1"
  }
  tags {
    key   = "key2"
    value = "value2"
  }
}
```

## Argument Reference

The following arguments are supported:

* `name` - (Required) Name of MLflow model.
* `description` - The description of the MLflow model.
* `tags` - Tags for the MLflow model.

---

Co-authored-by: nsoni2 <navin_soni@intuit.com>
Co-authored-by: Jeffrey Alvarez <Jeffrey_Alvarez@intuit.com>
Co-authored-by: Daniel Jensen <daniel_jensen@intuit.com>
Co-authored-by: Serge Smertin <259697+nfx@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants