Normalize
Upgrading to 0.3.0
- Version 1.4.0 of `dbt-core` now required
- You must add the following to the top level of your project yaml (`dbt_project.yml`):

      dispatch:
        - macro_namespace: dbt
          search_order: ['snowplow_utils', 'dbt']

- Other changes required by snowplow-utils version 0.14.0
Upgrading to 0.2.0
- Version 1.3.0 of `dbt-core` now required
- Upgrading your config file
  - Change the `event_name` field to `event_names` and make the value a list
  - Change the `self_describing_event_schema` field to `self_describing_event_schemas` and make the value a list
  - If you wish to make use of the new features, see the example config or the docs for more information
- Upgrading your models (the preferred method is to re-run the Python script, but it can be done manually by following these steps)
  - For each normalized model:
    - Convert the `event_name` and `sde_cols` fields to lists, and pluralize the names in both the set and the macro call
    - Add a new field, `sde_aliases`, which is an empty list; add this between `sde_types` and `context_cols` in the macro call
  - For your filtered events table:
    - Change the `unique_key` in the config section to `unique_id`
    - Add a line between the `event_table_name` and `from` lines for each select statement: `, event_id||'-'||'<that_event_table_name>' as unique_id`, with the event table name for that select block
  - For your users table:
    - Add 3 new values to the start of the macro call, `'user_id','','',` before the `user_cols` argument
- Upgrade your filtered events table
  - If you use the master filtered events table, you will need to add a new column for the latest version to work. If you have not processed much data yet, it may be easier to simply re-run the package from scratch using `dbt run --full-refresh --vars 'snowplow__allow_refresh: true'`. Alternatively, run the following in your warehouse, replacing the schema/dataset/warehouse and table name with those for your table:

        ALTER TABLE {schema}.{table} ADD COLUMN unique_id STRING;
        UPDATE {schema}.{table} SET unique_id = event_id||'-'||event_table_name WHERE 1 = 1;
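
The config file changes for 0.2.0 described above (renaming `event_name` and `self_describing_event_schema` to their plural forms and wrapping each value in a list) can be sketched as a small Python helper. This is only an illustration and not part of the package: it assumes the config is a JSON-style dictionary with a top-level `events` list, and the function names and sample values are hypothetical. Re-running the package's own Python script remains the preferred route for the models themselves.

```python
import json

# Old field name -> new field name, as listed in the 0.2.0 upgrade notes.
RENAMES = {
    "event_name": "event_names",
    "self_describing_event_schema": "self_describing_event_schemas",
}


def upgrade_event(event: dict) -> dict:
    """Return a copy of one event entry with renamed, list-valued fields."""
    upgraded = {}
    for key, value in event.items():
        if key in RENAMES:
            # Wrap a single value in a list; leave an existing list as it is.
            upgraded[RENAMES[key]] = value if isinstance(value, list) else [value]
        else:
            upgraded[key] = value
    return upgraded


def upgrade_config(config: dict) -> dict:
    """Upgrade every entry under the (assumed) top-level "events" key."""
    upgraded = dict(config)
    upgraded["events"] = [upgrade_event(e) for e in config.get("events", [])]
    return upgraded


if __name__ == "__main__":
    # Hypothetical pre-0.2.0 entry; the schema URI is a made-up placeholder.
    old_config = {
        "events": [
            {
                "event_name": "my_event",
                "self_describing_event_schema": "iglu:com.example/my_event/jsonschema/1-0-0",
            }
        ]
    }
    print(json.dumps(upgrade_config(old_config), indent=2))
```

Any keys the helper does not recognise are copied through unchanged, so it is safe to run on entries that have already been upgraded.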