Direct child model selector #2052

Closed
bobzoller opened this issue Jan 16, 2020 · 4 comments · Fixed by #2485
Labels
cli; enhancement (New feature or request); good_first_issue (Straightforward + self-contained changes, good for new contributors!)

Comments

@bobzoller

Describe the feature

Akin to the CSS Child combinator, I'd like to be able to run a given model and its direct children only.

Describe alternatives you've considered

Writing a script to consume the dbt graph and output a list of models equivalent to this feature.

Who will this benefit?

I believe this would primarily be useful in a CI environment and could significantly decrease cycle time by trading off a bit of risk. I surmise that if a given model changes, the most likely breakage will be in direct children, with odds decreasing significantly every generation after that. Personally, I'm willing to accept the risk of an Nth-level descendant breaking in return for much faster cycle times.

The caveat is you'd probably want this in addition to the ancestor side behavior of @ --- eg I want to build direct children and all ancestors of my model and its direct children.

@bobzoller added the enhancement and triage labels Jan 16, 2020
@drewbanin added the cli and good_first_issue labels and removed the triage label Jan 21, 2020
@drewbanin
Contributor

Hey @bobzoller - this is a cool idea - thanks for opening this issue. We had a similar open issue over here, but it's been stale for a pretty long time: #1548. I just closed that one in favor of this one.

I think a good design could be to support an integer before or after the + selector, eg:

dbt run --models 2+model_name+1

This integer would define the number of edges away from the specified model that dbt should traverse when selecting models to run. In this design, the number 1 would represent that the immediate parents or children of the model should be selected. This means that the following two selectors would become equivalent:

dbt run --models 0+model_name+0
dbt run --models model_name

You also noted:

The caveat is you'd probably want this in addition to the ancestor side behavior of @ --- eg I want to build direct children and all ancestors of my model and its direct children.

I don't think my proposed syntax would address this use case. It could be a good idea to implement a workflow like this using both --models and --exclude. That would require us to support a different type of selector that picks all of the models N or more edges away from the target node. I don't necessarily think this is a good idea, but for the sake of discussion, imagine that we use a minus sign instead of a plus sign:

dbt run --models @my_model --exclude my_model-2

The --models flag would select my_model, as well as all of its children and their parents. The --exclude flag would select any model that is two or more hops downstream of my_model. Together, I think we'd get the behavior you're looking for.
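To make the combination concrete, here is a minimal sketch of the set arithmetic described above. It is not dbt code: the toy graph, the function names, and the reading of "N or more hops" as the minimum edge count from my_model are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy DAG, written as parent -> list of children.
CHILDREN = {
    "src": ["my_model"],
    "my_model": ["child"],
    "other": ["child"],
    "child": ["grandchild"],
    "grandchild": [],
}


def invert(children):
    """Build the child -> list of parents view of the same graph."""
    parents = {node: [] for node in children}
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].append(parent)
    return parents


def distances(adjacency, start):
    """Minimum number of edges from start to each reachable node (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def at_selector(children, model):
    """The model, all of its descendants, and every ancestor of those nodes."""
    parents = invert(children)
    selected = set(distances(children, model))
    for node in list(selected):
        selected |= set(distances(parents, node))
    return selected


def n_or_more_hops_downstream(children, model, n):
    """Assumed meaning of the hypothetical my_model-N selector."""
    return {node for node, d in distances(children, model).items() if d >= n}


selected = at_selector(CHILDREN, "my_model") - n_or_more_hops_downstream(
    CHILDREN, "my_model", 2
)
print(sorted(selected))  # ['child', 'my_model', 'other', 'src']
```

In this toy graph the grandchild is dropped while the direct child and all of its ancestors are kept.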

I feel pretty good about the +{N} syntax, but I don't love the -{N} syntax described here. Do you have any strong feelings on this topic? And do you think the proposal here generally addresses your use case?

@gnilrets

gnilrets commented Mar 5, 2020

Interesting… I’m on the same page as you, Drew. I like model+N, but model-N seems a little counter-intuitive. model-N makes me think of N levels before model, but that is the same as N+model.

Is it just the - symbol that seems strange? What about a ^?

dbt run --models @my_model --exclude my_model^2

So then my_model+1 selects the model and all the direct children of my_model, while my_model^1 selects only the children and all subsequent grandchildren of my_model.

@aroder

aroder commented Mar 6, 2020

I think you could use the +N syntax even for exclude. The selection operation is the same, even though exclude removes what is selected, so I think the syntax should be the same.

@drewbanin drewbanin added this to the Octavius Catto milestone Mar 18, 2020
@drewbanin
Contributor

Cool, I just prioritized this for the 0.17.0 (Octavius Catto) release. Let's target the syntax:

[N]+[resource]+[N]

N can be an integer >= 0.

  • if N is zero, then no parents/children will be selected
  • if N is > zero, then parents/children up to N edges away from [resource] will be selected

Example usage:

# run all tests with a direct edge to the source
dbt test --models source:source.table+1

# run all of the immediate parents and children of a model
dbt run --models 1+model_name+1

# run parents of the model, and parents of those parents
dbt run --models 2+model_name
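As a rough illustration of the syntax above, the following sketch splits a selector string into a parent depth, a resource name, and a child depth. It is not dbt's parser: the regular expression and the parse_selector name are hypothetical, a bare + is treated as "no limit" (returned as None), and other selector syntax such as @ is not handled.

```python
import re

# Hypothetical pattern for [N]+[resource]+[N]; both + parts are optional.
SELECTOR = re.compile(
    r"^(?:(?P<up>\d*)\+)?(?P<resource>[^+]+?)(?:\+(?P<down>\d*))?$"
)


def parse_selector(spec):
    """Return (parent_depth, resource, child_depth).

    A depth of 0 selects no parents/children; None means unlimited.
    """
    match = SELECTOR.match(spec)
    if match is None:
        raise ValueError(f"not a valid selector: {spec!r}")

    def depth(group):
        if group is None:  # no + on this side at all
            return 0
        if group == "":  # bare +, no integer given
            return None
        return int(group)

    return depth(match["up"]), match["resource"], depth(match["down"])


print(parse_selector("1+model_name+1"))         # (1, 'model_name', 1)
print(parse_selector("2+model_name"))           # (2, 'model_name', 0)
print(parse_selector("source:source.table+1"))  # (0, 'source:source.table', 1)
```

Under these assumptions 0+model_name+0 and model_name parse to the same triple, which matches the equivalence proposed earlier in the thread.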

One interesting thing to consider: there might be multiple paths from a resource to its parents/children. Consider:

a -------> b -----> c
|                  ^
\-----------------/

Here, the selector 1+c should select both nodes a and b, as both are one parent-edge away from c. Cool issue!
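The multiple-path case can be checked with a small depth-limited breadth-first search. This is only a sketch under assumptions: the graph is the three-node example above written as a child -> parents mapping, and select_parents is a hypothetical helper, not a dbt function.

```python
from collections import deque

# The example graph above: a -> b, b -> c, a -> c (child -> list of parents).
PARENTS = {"a": [], "b": ["a"], "c": ["a", "b"]}


def select_parents(parents, node, max_depth):
    """Select node plus every ancestor at most max_depth edges away.

    max_depth=None means no limit. Breadth-first order means each ancestor
    is first reached along its shortest path, so a node that is both a
    parent and a grandparent counts as one edge away.
    """
    selected = {node}
    queue = deque([(node, 0)])
    while queue:
        current, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand past the requested depth
        for parent in parents[current]:
            if parent not in selected:
                selected.add(parent)
                queue.append((parent, depth + 1))
    return selected


print(sorted(select_parents(PARENTS, "c", 1)))  # ['a', 'b', 'c']
print(sorted(select_parents(PARENTS, "c", 0)))  # ['c']
```

A depth-first traversal that marks nodes as visited could reach a through b first and record it at depth 2, which is why the sketch uses breadth-first order.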

4 participants