Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-3467] add environment variable / flag to exclude certain resource types #9237

Closed
3 tasks done
Tracked by #8283
graciegoheen opened this issue Dec 6, 2023 · 17 comments · Fixed by #9756 or #9779
Closed
3 tasks done
Tracked by #8283

[CT-3467] add environment variable / flag to exclude certain resource types #9237

graciegoheen opened this issue Dec 6, 2023 · 17 comments · Fixed by #9756 or #9779
Assignees
Labels
enhancement New feature or request Impact: CC Impact: Orch user docs [docs.getdbt.com] Needs better documentation
Milestone

Comments

@graciegoheen
Copy link
Contributor

graciegoheen commented Dec 6, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

We expect folks to want to set environment defaults for whether or not unit tests should be included in dbt build/dbt test:

environment include unit tests by default
prod no
non-prod maybe?
dev yes
CI yes

Folks have a similar need to exclude other resource types from certain environments ("don't run my snapshots in dev", etc.).

In order to support this exclusion for all resource types...

Acceptance Criteria (updated from discussion below)

  1. Create a new flag -> DBT_EXCLUDE_RESOURCE_TYPE / --exclude-resource-type / etc.

    • takes last cut after we've done existing selection (--select and --exclude)
    • --exclude-resource-type is mutually exclusive with --resource-type
    • default -> don't exclude ANY resource types
    • include/exclude influences what's selected in your commands, what shows up in your logs, etc; however, all resources should still be included in the manifest artifact
    • For consistency, we should have this functionality able to be set in all 3 places: env var, project flags, and CLI flag.
      Related issue -> [CT-2169] [Bug] All global configs should also be settable in ProjectFlags #7036
  2. Ideally, we'd also expand --resource-type to also be an "include resource types" environment variable, flag, etc (but that can be out of scope of this issue if needed)

Describe alternatives you've considered

Alternative 1 -> Always remember to specify --select test_type:...

  • annoying :(

Alternative 2 -> Setting enabled equal to an environment variable for all unit tests:

...
unit_tests:
	my_project:
		+enabled: {{env_var('DBT_INCLUDE_UNIT_TESTS', true)}}
  • all unit tests would not be included in our metadata artifacts for the environments where DBT_INCLUDE_UNIT_TESTS = false :(
@graciegoheen graciegoheen added enhancement New feature or request Impact: CC labels Dec 6, 2023
@github-actions github-actions bot changed the title add environment variable to include/exclude unit tests [CT-3467] add environment variable to include/exclude unit tests Dec 6, 2023
@dbeatty10
Copy link
Contributor

This makes sense to add a method to configure this environment-wide 'cause always needing to remember to add --select test_type:... could get tedious.

Options

It seems like there are two main options for environment variable(s):

  1. Single multi-choice environment variable:
    • INCLUDE_TEST_TYPES - one or more test types (i.e., data unit)
  2. Multiple boolean environment variables:
    • DBT_INCLUDE_UNIT_TEST - select (or de-select) unit tests (possibly with associated CLI flag of --include-unit-test / --no-include-unit-test)
    • DBT_INCLUDE_DATA_TEST - select (or de-select) data tests
    • etc.

Comparison

Both options would allow for many different test types. The former would only need a single environment variable and the latter would need to add a new environment variable along with each new test type.

Similarly, both options could also be included in UserConfig (if we choose).

The latter option would be a bit easier to flip a single type on/off with either an environment variable or a CLI flag.

@graciegoheen
Copy link
Contributor Author

graciegoheen commented Dec 14, 2023

Option 1: INCLUDE_TEST_TYPES - default is data, unit

environment value
prod ["data"]
ci ["data", "unit"]
dev ["data", "unit"]

Option 2: TEST_TYPES_SELECTION - default is {include: all}

environment value
prod {"include": ["data"]} OR {"exclude": ["unit"]}
ci {"include": "all"} OR {"include": ["data", "unit"]}
dev {"include": "all"} OR {"include": ["data", "unit"]}

Option 3: INCLUDE_UNIT_TESTS - default is true

environment value
prod false
ci true
dev true

Note: If you explicitly passed --select test_type:..., we would use the passed-in selector RATHER than the environment variable. We can think of the environment variable as being the "default" for the environment.

^^ is this actually possible?

Our thoughts:

  • you set this once, then forget about it
  • option 3 is the easiest to set "just set it to be false in prod"
  • option 2 is most similar to WARN_ERROR_OPTIONS
  • option 2 is a big gross (curly brackets! lists! oh my!)
  • if we go for option 3, would we need an equivalent environment variable for every test type INCLUDE_DATA_TEST (would you ever want to exclude your data tests??)

HYPOTHESIS -> We think it's likely that there will never be more than 5 test types, maybe there's only 2 test types forever. Thus, we think option 3 is best, and options 1 & 2 are probably over-engineered given we're not going to have tons of test types.

@martynydbt
Copy link

Setting it in an environment variable is inconsistent from how we do this for other aspects. Historically we have set it in all 3.

@graciegoheen
Copy link
Contributor Author

As discussed with @gshank and @dbeatty10.... For consistency, we should have this functionality able to be set in all 3 places: env var, project flags, and CLI flag.
Related issue -> #7036

@martynydbt martynydbt added this to the v1.8 milestone Feb 12, 2024
@graciegoheen
Copy link
Contributor Author

Similar to the include_saved_query flag

include_saved_query = click.option(

Alternatively we could add a new environment variable for DBT_RESOURCE_TYPE (since we already have the --resource-type CLI option)

@graciegoheen
Copy link
Contributor Author

graciegoheen commented Feb 22, 2024

Proposed new acceptance criteria to expand this solution to all resource types you may want to exclude from certain environments

  1. Create a new flag -> DBT_EXCLUDE_RESOURCE_TYPE / --exclude-resource-type / etc.

    • takes last cut after we've done existing selection (--select and --exclude)
    • --exclude-resource-type is mutually exclusive with --resource-type
    • default -> don't exclude ANY resource types
    • include/exclude influences what's selected in your commands, what shows up in your logs, etc; however, all resources should still be included in the manifest artifact
  2. Ideally, we'd also expand --resource-type to also be an "include resource types" environment variable, flag, etc (but that can be out of scope of this issue if needed)

  3. Rename include_saved_query to export_saved_query

    • keep old name for backwards compatibility
    • always include saved queries, but they don't do anything unless you set export_saved_query to true - this is not about excluding/including saved queries, but about whether or not the exports are run
    • when export_saved_query is false, you will see the saved queries in your log but explicitly called out as a no-op (this feels less hidden)

This will allow folks to configure:

  • exclude my unit tests in prod DBT_EXCLUDE_RESOURCE_TYPE = ['unit_test']
  • don't export your saved queries in any env except prod DBT_EXPORT_SAVED_QUERY = false
  • exclude snapshots in dev DBT_EXCLUDE_RESOURCE_TYPE= ['snapshot']
  • etc.

cc: @jtcohen6 @dbeatty10 @Jstein77

@graciegoheen graciegoheen changed the title [CT-3467] add environment variable to include/exclude unit tests [CT-3467] add environment variable / flag to exclude certain resource types Feb 22, 2024
@graciegoheen graciegoheen added the user docs [docs.getdbt.com] Needs better documentation label Feb 22, 2024
@graciegoheen
Copy link
Contributor Author

Would be great to add docs here on best practices for what to set this environment variable to (for ci, prod, etc.)

@gshank
Copy link
Contributor

gshank commented Mar 12, 2024

After various conversations, we decided that it didn't make a lot of sense to have this be settable in project flags, since it's really command specific, not project specific. So we're going to implement env vars and cli params. This applies to three commands: build, list, and clone.

@graciegoheen
Copy link
Contributor Author

Does this not apply for run or test?

@gshank
Copy link
Contributor

gshank commented Mar 13, 2024

No. The only commands that use this are those three.

@gshank
Copy link
Contributor

gshank commented Mar 13, 2024

I guess you could change that... but that would be a different ticket, I think.

@graciegoheen graciegoheen added user docs [docs.getdbt.com] Needs better documentation and removed user docs [docs.getdbt.com] Needs better documentation labels Mar 13, 2024
@polmonso
Copy link

Hello! How do you actually achieve this:

exclude snapshots in dev DBT_EXCLUDE_RESOURCE_TYPE= ['snapshot']

we also want to exclude them but couldn't figure out how other than maybe adding enabled: "{{ var('is_prod', false) | as_bool }}" or something in the config of snapshots.

@graciegoheen
Copy link
Contributor Author

You can use the --exclude-resource-type flag (docs here) or the associated environment variable DBT_EXCLUDE_RESOURCE_TYPES (see here).

@polmonso
Copy link

How do you use it so it so it applies it only when target == prod? It's the exclusion depending on the target that I couldn't figure out.

@graciegoheen
Copy link
Contributor Author

You can set the environment variable equal to something different in each of your environments. Here's some docs on how this works in dbt Cloud.

@polmonso
Copy link

In our case the environment doesn't change, only the target. So it'll go in the line of setting the exclude resource type variable depending on the target using jinja in the dbt_project.yml.
Thanks for the pointers!

@dbeatty10
Copy link
Contributor

graciegoheen commented on Mar 12:

Does this not apply for run or test?

gshank commented on Mar 13:

No. The only commands that use this are those three.
I guess you could change that... but that would be a different ticket, I think.

#10656 is the ticket that adds this for dbt test (since it supports two different resource types: unit tests and data tests) but not dbt run (since it only supports a single resource type: models).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Impact: CC Impact: Orch user docs [docs.getdbt.com] Needs better documentation
Projects
None yet
5 participants