Set up and run the dbt package

This step assumes you have data in the ATOMIC.SAMPLE_EVENTS_MEDIA_PLAYER table. It uses that table to demonstrate how to set up and run the snowplow_media_player dbt package to model Snowplow media player data.


Step 1: Override the dispatch order in your project

To take advantage of the optimized upsert that the Snowplow packages offer, you need to ensure that certain macros are called from snowplow_utils before dbt-core. To do this, add the following to the top level of your dbt_project.yml file:

# dbt_project.yml
...
dispatch:
  - macro_namespace: dbt
    search_order: ['snowplow_utils', 'dbt']

If you do not do this, the package will still work, but the incremental upserts will become more costly over time.

Step 2: Add the selectors.yml file

The snowplow_media_player package provides a suite of suggested selectors to run and test the models.

These are defined in the selectors.yml file within the package. To use these model selections, you will need to copy this file into your own dbt project directory.

This is a top-level file and therefore should sit alongside your dbt_project.yml file.
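
As a rough sketch of the copy step, the shell helper below copies the package's selectors.yml next to your dbt_project.yml. It assumes the package was installed with dbt deps into the dbt_packages directory of your project and that selectors.yml sits at the root of the package. The function name copy_selectors is purely illustrative. It refuses to overwrite an existing selectors.yml so that selectors you already define are not lost.

```shell
# Hypothetical helper: copy the package's selectors.yml into the current directory.
# The default path is an assumption based on where dbt installs packages.
copy_selectors() {
  package_dir="${1:-dbt_packages/snowplow_media_player}"

  # The package must already be installed (for example via `dbt deps`).
  if [ ! -f "$package_dir/selectors.yml" ]; then
    echo "selectors.yml not found in $package_dir (has 'dbt deps' been run?)" >&2
    return 1
  fi

  # Do not clobber selectors you have already defined.
  if [ -f ./selectors.yml ]; then
    echo "selectors.yml already exists here; merge the package selectors manually" >&2
    return 1
  fi

  cp "$package_dir/selectors.yml" ./selectors.yml
}
```

Run copy_selectors from the directory that contains your dbt_project.yml. If the package is installed somewhere else, pass that directory as the first argument.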

Step 3: Set up variables

The snowplow_media_player dbt package comes with a list of variables, each with a default value that you may need to override in your own dbt project’s dbt_project.yml file. For details, see our docs, which contain descriptions and default values of each variable, or look in the installed package’s project file at [dbt_project_name]/dbt_packages/snowplow_media_player/dbt_project.yml.
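
One quick way to skim the defaults, assuming you run it from the root of your dbt project and the package is installed at the path mentioned above, is to search the package's project file for the snowplow__ prefix. The variable name PACKAGE_PROJECT_FILE is only illustrative.

```shell
# Assumed path, relative to the root of your dbt project.
PACKAGE_PROJECT_FILE="dbt_packages/snowplow_media_player/dbt_project.yml"

if [ -f "$PACKAGE_PROJECT_FILE" ]; then
  # Print every line that mentions a package variable, with its line number.
  grep -n "snowplow__" "$PACKAGE_PROJECT_FILE"
else
  echo "$PACKAGE_PROJECT_FILE not found; run 'dbt deps' to install the package" >&2
fi
```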

If you are using the provided sample data in ATOMIC.SAMPLE_EVENTS_MEDIA_PLAYER, add the following snippet to the dbt_project.yml:

vars:
  snowplow_media_player:
    snowplow__start_date: '2023-08-04'
    snowplow__events_table: SAMPLE_EVENTS_MEDIA_PLAYER
    snowplow__enable_media_ad: true
    snowplow__enable_media_ad_break: true
    snowplow__enable_ad_quartile_event: true

If you are using your own events, depending on which media plugin or tracking implementation you use, you will need to enable the relevant contexts in your dbt_project.yml:

Media tracking with the media session context:

vars:
  snowplow_media_player:
    # use the media session context schema (unless disabled on the tracker)
    snowplow__enable_media_session: true
    # depending on whether you track ads, ad breaks and progress within ads:
    snowplow__enable_media_ad: true
    snowplow__enable_media_ad_break: true
    snowplow__enable_ad_quartile_event: true

YouTube tracking (version 1 media player schema):

vars:
  snowplow_media_player:
    snowplow__enable_media_player_v1: true
    snowplow__enable_media_player_v2: false
    snowplow__enable_media_session: false
    snowplow__enable_youtube: true

HTML5 (WHATWG) media and video tracking (version 1 media player schema):

vars:
  snowplow_media_player:
    snowplow__enable_media_player_v1: true
    snowplow__enable_media_player_v2: false
    snowplow__enable_media_session: false
    snowplow__enable_whatwg_media: true
    snowplow__enable_whatwg_video: true

Mobile events:

vars:
  snowplow_media_player:
    snowplow__enable_web_events: false
    snowplow__enable_mobile_events: true
    # use the media session context schema (unless disabled on the tracker)
    snowplow__enable_media_session: true
    # depending on whether you track ads, ad breaks and progress within ads:
    snowplow__enable_media_ad: true
    snowplow__enable_media_ad_break: true
    snowplow__enable_ad_quartile_event: true

Check source data:

By default, this package assumes your Snowplow events data is in the events table, within the atomic schema of your target.database. To change this, add the following to your dbt_project.yml file:

vars:
  snowplow_media_player:
    snowplow__atomic_schema: schema_with_snowplow_events
    snowplow__database: database_with_snowplow_events
    snowplow__events_table: table_of_snowplow_events

Filter your data set:

You can specify the start_date from which to start processing events, the app_ids to filter for, and the event_name values to filter on. By default, the start_date is set to 2020-01-01, all app_ids are selected, and all events with the com.snowplowanalytics.snowplow.media or the media_player_event event name are surfaced. To change this, add or modify the following in your dbt_project.yml file:

vars:
  snowplow_media_player:
    snowplow__start_date: 'yyyy-mm-dd'
    snowplow__app_id: ['my_app_1','my_app_2']
    snowplow__media_event_names: ['media_player_event', 'my_custom_media_event']

Step 4: Run the model

Execute the following either through your CLI or from within dbt Cloud:

dbt run --selector snowplow_media_player

This should take a couple of minutes to run each time, depending on how many events you have per day.