Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support dist style AUTO for redshift #1882

Closed
tjengel opened this issue Nov 1, 2019 · 7 comments
Closed

Support dist style AUTO for redshift #1882

tjengel opened this issue Nov 1, 2019 · 7 comments
Labels
enhancement New feature or request good_first_issue Straightforward + self-contained changes, good for new contributors! redshift

Comments

@tjengel
Copy link
Contributor

tjengel commented Nov 1, 2019

Describe the feature

Currently, it seems you can set the dist style to key, even, or all, but not auto for redshift. I'm proposing modifying https://github.com/fishtown-analytics/dbt/blob/fe48478993b299b66a337f84989d8d0f9ca3dcdc/plugins/redshift/dbt/include/redshift/macros/adapters.sql#L4 to include one more item in the list for 'auto'.

Additional context

  • Redshift

Who will this benefit?

Any dbt users on redshift who want to set tables to auto. Currently you can use all, but if a table gets to a certain size, it won't automatically change the distribution style to even. Auto will enable that.

@tjengel tjengel added enhancement New feature or request triage labels Nov 1, 2019
@tjengel tjengel changed the title Support dist style ALL for redshift Support dist style AUTO for redshift Nov 1, 2019
@drewbanin drewbanin added redshift good_first_issue Straightforward + self-contained changes, good for new contributors! and removed triage labels Nov 1, 2019
@drewbanin
Copy link
Contributor

drewbanin commented Nov 1, 2019

Thanks @tjengel! Let me know if there's anything I can do to help out if you're able to contribute a fix here

@drewbanin drewbanin added this to the Louisa May Alcott milestone Nov 4, 2019
@drewbanin
Copy link
Contributor

fixed in #1885

@rodrigodelmonte
Copy link
Contributor

rodrigodelmonte commented Mar 3, 2020

Hi @tjengel and @drewbanin

I’m trying to build a model in Redshift using dist='auto'.

Example:

{{ config(tags=['stg_creators_videos'],
          materialized='table',
          unique_key='creator_video_id',
          dist='auto',
          sort='occurred_at')
}}

I’ve followed the doc https://docs.getdbt.com/docs/redshift-configs where it says:

  • dist can have a setting of all, even, auto, or the name of a key.

But I'm getting this error:

Completed with 1 error and 0 warnings:
Database Error in model stg_creators_videos (models/marts/creators/intermediate/stg_creators_videos.sql)
  syntax error at or near "auto"
  LINE 6:     diststyle auto

Checking the AWS docs CREATE TABLE AS doesn’t support DISTSTYLE auto
https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_AS.html#r_CREATE_TABLE_AS-parameters

Am I missing something? Or dbt can't handle dist='auto' ?

@tjengel
Copy link
Contributor Author

tjengel commented Mar 3, 2020

Redshift does support auto diststyle https://docs.aws.amazon.com/redshift/latest/dg/c_choosing_dist_sort.html, but you may have to specify it more specifically https://docs.aws.amazon.com/redshift/latest/dg/viewing-distribution-styles.html. Basically auto (all) meaning it's set to auto and is currently all, and then auto (even) where it's set to auto and is currently an even distribution.

@drewbanin
Copy link
Contributor

@tjengel hmm, I just ran this query against Redshift:

create table dbt_dbanin.debug_auto
diststyle auto
as (
	select 1 as id
);

And I definitely see:

ERROR:  syntax error at or near "auto"
LINE 2: diststyle auto
                  ^

The Redshift docs are super inconsistent and confusing here. Having just looked at this for 5 mins, it appears to me that the code added in #1885 is not correct, and dbt should just not specify a diststyle if dist='auto' is supplied as a model config.

Do you have some different understanding of how this works?

Thanks for raising this one @rodrigodelmonte!

@tjengel
Copy link
Contributor Author

tjengel commented Mar 24, 2020

I wanted to add it so it's clear in the code you're setting a table to be dist = all instead of leaving it implicit; however, as you stated, the docs are less than clear on how you specify that distribution style.

@drewbanin
Copy link
Contributor

I made an issue to track the removal of this DDL here: #2246

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good_first_issue Straightforward + self-contained changes, good for new contributors! redshift
Projects
None yet
Development

No branches or pull requests

3 participants