dbt Connector

This connector extracts technical metadata from a dbt project by parsing the manifest.json and catalog.json files generated by the dbt docs generate or dbt run command.

Setup

There is no special setup needed to run the connector as a CLI. However, we recommend running it as part of your dbt project's CI/CD workflow so that the metadata is refreshed automatically with each commit. Please refer to Metaphor dbt GitHub Action for more details.

The remaining sections are for those who intend to run the connector manually as a CLI or to integrate it into a different CI/CD environment.

Config File

Create a YAML config file based on the following template.

Required Configurations

manifest: <path_to_manifest_json>
run_results: <path_to_run_results_json>
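
Before pointing the connector at these files, it can help to confirm that the manifest is readable and contains the models you expect. The following is a minimal sketch, not part of the connector: the helper name `count_models` is hypothetical, and it assumes the usual manifest.json layout with a top-level `nodes` mapping whose entries carry a `resource_type` field.

```python
import json
from pathlib import Path


def count_models(manifest_path: str) -> int:
    """Count the nodes in a dbt manifest.json whose resource_type is "model".

    Hypothetical helper for a quick sanity check; the connector itself
    does not expose this function.
    """
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    # Assumption: nodes live under the top-level "nodes" key, keyed by unique ID.
    nodes = manifest.get("nodes", {})
    return sum(1 for node in nodes.values() if node.get("resource_type") == "model")
```

For example, `count_models("target/manifest.json")` would return the number of models, assuming dbt wrote its artifacts to the `target` directory.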

Optional Configurations

Output Destination

See Output Config for more information.

Snowflake Account

If the dbt project uses Snowflake, provide the Snowflake account as follows:

account: <snowflake_account_name>

URLs

You can optionally provide the docs_base_url (the base URL serving the dbt-generated docs) and project_source_url (the source code URL pointing to the project root directory). These are used to generate links to each model's docs and source code.

docs_base_url: <docs_base_url>
project_source_url: <project_dir_source_code_url>

Ownership

You can optionally specify the ownership for the materialized table or view using the meta config of a dbt model. For example:

models:
  - name: users
    config:
      meta:
        owner: joe@test.com

To map owner to the corresponding ownership type defined in Metaphor, add the following to the config file:

meta_ownerships:
  - meta_key: owner
    ownership_type: Data Steward

If the owner field contains only the user name (e.g. joe instead of joe@test.com), you can specify the common email domain using the email_domain config:

meta_ownerships:
  - meta_key: owner
    ownership_type: Data Steward
    email_domain: test.com
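
The effect of email_domain can be pictured with a short sketch. This is an illustration of the behavior described above, not the connector's actual code: the function name `resolve_owner_email` is hypothetical, and treating any value that already contains "@" as a full address is an assumption.

```python
from typing import Optional


def resolve_owner_email(owner: str, email_domain: Optional[str] = None) -> str:
    """Return a full email address for an owner value taken from dbt meta.

    Hypothetical helper: values that already contain "@" are assumed to be
    complete addresses; bare user names get the configured domain appended.
    """
    owner = owner.strip()
    if "@" in owner or not email_domain:
        return owner
    return f"{owner}@{email_domain}"
```

With the config above, an owner value of `joe` would resolve to `joe@test.com`, while `joe@test.com` would be left unchanged.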

You can also choose between assigning the owner to the materialized table, the dbt model, or both:

meta_ownerships:
  - meta_key: owner
    ownership_type: Data Steward
    assignment_target: dbt_model # Valid choices: "dbt_model", "materialized_table", "both". Default is "both"

Governed Tags

Similar to Ownership, you can optionally use attributes in a dbt model's meta config to assign governed tags. For example:

models:
  - name: dbt_model_name
    config:
      meta:
        pii: true

To map pii to the HAS_PII governed tag defined in Metaphor, add the following to the config file:

meta_tags:
  - meta_key: pii
    tag_type: HAS_PII

By default, only attributes with a value of true will be mapped. You can optionally specify a regex in meta_value_matcher to match other types of values. For example:

models:
  - name: dbt_model_name
    config:
      meta:
        team: sales

Use the following config to map it to the SALES tag on Metaphor:

meta_tags:
  - meta_key: team
    meta_value_matcher: sales
    tag_type: SALES
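
The matching rule can be sketched as follows. This is only an illustration under stated assumptions, not the connector's implementation: the function name `meta_value_matches` is hypothetical, and whether the connector matches the whole value (as `re.fullmatch` does here) or only a prefix is an assumption.

```python
import re
from typing import Any, Optional


def meta_value_matches(value: Any, matcher: Optional[str] = None) -> bool:
    """Decide whether a meta value should be mapped to a governed tag.

    Hypothetical helper. Without a matcher, only a boolean true qualifies,
    mirroring the default described above. With a matcher, the value is
    converted to a string and must match the regex in full (an assumption).
    """
    if matcher is None:
        return value is True
    return re.fullmatch(matcher, str(value)) is not None
```

Under these assumptions, `meta_value_matches("sales", "sales")` is true, `meta_value_matches("marketing", "sales")` is false, and a pattern such as `sales|presales` would accept either value.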

You can also tag a column by using its meta config:

models:
  - name: dbt_model_name
    columns:
      - name: column_name
        meta:
          pii: true

Testing

Follow the Installation instructions to install metaphor-connectors in your environment (or virtualenv). Make sure to include either the all or the dbt extra.

Run the following command to test the connector locally:

metaphor dbt <config_file>

Manually verify the output after the run finishes.