
dbt: Mobile data model

The snowplow-mobile dbt package provides a means to run the standard mobile model via dbt. It processes events incrementally and is designed in a modular manner, allowing you to easily integrate your own custom SQL into the incremental framework provided by the package.

The package can be found in the snowplow/dbt-snowplow-mobile repo, with the full doc site hosted here.

Quickstart

Requirements

  • Snowplow Android or iOS mobile tracker version 1.1.0 or later implemented.
  • Mobile session context enabled.
  • Screen view events enabled.

Prerequisites

  • dbt must be installed.
  • A dataset of mobile events from the mobile trackers must be available in the database.

Supported Warehouses

  • Redshift, Snowflake, BigQuery, and PostgreSQL

Installing the package

Add the snowplow-mobile package to your packages.yml file. For more information, refer to dbt's package hub.
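
As a rough sketch, the entry might look like the following. The hub name snowplow/snowplow_mobile and the version value are assumptions here; the version in particular is a placeholder, so check the package hub for the current release before copying it.

```yaml
# packages.yml
packages:
  # Assumed hub name for the snowplow-mobile package.
  - package: snowplow/snowplow_mobile
    # Placeholder only: replace X.Y.Z with the release listed on the dbt package hub.
    version: X.Y.Z
```

After editing packages.yml, running dbt deps downloads the package into your project.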

Essential Configuration

1 - Check source data

This package will by default assume your Snowplow events data is contained in the atomic schema of your target.database. In order to change this, please add the following to your dbt_project.yml file:

# dbt_project.yml
...
vars:
  snowplow_mobile:
    snowplow__atomic_schema: schema_with_snowplow_events
    snowplow__database: database_with_snowplow_events

2 - Enable desired contexts

The mobile package can optionally join in data from the following four Snowplow contexts:

  • Mobile context -- Device type, OS, etc.
  • Geolocation context -- Device latitude, longitude, bearing, etc.
  • Application context -- App version and build
  • Screen context -- Screen details associated with mobile event

By default, these are all disabled in the mobile package. Assuming you have the contexts turned on in your Snowplow pipeline, to enable the contexts within the package please add the following to your dbt_project.yml file:

# dbt_project.yml
...
vars:
  snowplow__enable_mobile_context: true
  snowplow__enable_geolocation_context: true
  snowplow__enable_application_context: true
  snowplow__enable_screen_context: true

3 - Enable optional modules

The mobile package can also join in data from the following Snowplow module:

  • App Errors module -- Details relating to app errors that occur during sessions

By default this module is disabled in the mobile package. Assuming you have the module turned on in your Snowplow pipeline, to enable the module within the package please add the following to your dbt_project.yml file:

# dbt_project.yml
...
vars:
  snowplow_mobile:
    snowplow__enable_app_errors_module: true

4 - Filter your data set

You can specify both the start_date at which to start processing events and the app_ids to filter for. By default, the start_date is set to 2020-01-01 and all app_ids are selected. To change this, please add the following to your dbt_project.yml file:

# dbt_project.yml
...
vars:
  snowplow_mobile:
    snowplow__start_date: 'yyyy-mm-dd'
    snowplow__app_id: ['mobile_app_1', 'mobile_app_2']
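
Since a YAML file can only contain one top-level vars key, the snippets from steps 1 to 4 need to be merged rather than pasted one after another. A minimal sketch of the combined block is shown below. The schema, database and app_id values are the placeholders used above, the start date is an illustrative value, and scoping the context flags under snowplow_mobile (rather than leaving them as project-wide vars) is an assumption made here for tidiness.

```yaml
# dbt_project.yml
vars:
  snowplow_mobile:
    # Step 1: where the Snowplow events live (placeholder names).
    snowplow__atomic_schema: schema_with_snowplow_events
    snowplow__database: database_with_snowplow_events
    # Step 2: contexts to join in (assumed to be scoped to the package here).
    snowplow__enable_mobile_context: true
    snowplow__enable_geolocation_context: true
    snowplow__enable_application_context: true
    snowplow__enable_screen_context: true
    # Step 3: optional App Errors module.
    snowplow__enable_app_errors_module: true
    # Step 4: data set filters (illustrative date, placeholder app ids).
    snowplow__start_date: '2021-01-01'
    snowplow__app_id: ['mobile_app_1', 'mobile_app_2']
```

Only include the flags for contexts and modules that are actually enabled in your Snowplow pipeline.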

Further configuration

There are many additional configurations you can make to the model such as changing destination schemas, disabling modules, etc.

For more details, please refer to the snowplow-mobile package documentation.

Operation

The Snowplow mobile model is designed to be run as a whole, which ensures all incremental tables are kept in sync. As such, we suggest running the model using:

dbt run --models snowplow_mobile tag:snowplow_mobile_incremental
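
If you prefer not to repeat the selection on the command line, the same node selection could be captured as a dbt YAML selector. This is only a sketch: the selector name snowplow_mobile_run is hypothetical, and using the package method for the first criterion is an assumption about how the bare snowplow_mobile argument is meant to resolve.

```yaml
# selectors.yml
selectors:
  # Hypothetical selector name, chosen for illustration.
  - name: snowplow_mobile_run
    description: Models in the snowplow_mobile package plus incremental-tagged models
    definition:
      union:
        # Everything that belongs to the snowplow_mobile package.
        - method: package
          value: snowplow_mobile
        # Custom models tagged to run inside the incremental framework.
        - method: tag
          value: snowplow_mobile_incremental
```

With this file in the project root, dbt run --selector snowplow_mobile_run would stand in for the command above.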

We strongly advise reading the operation section on the full doc site for details on full refreshes of the mobile model and backfilling data, since some of these operations, full refreshes in particular, deviate slightly from the native dbt implementation.

Testing

A full suite of tests is included within the package to ensure data quality. For more information on these and suggested implementations when running in orchestration, please refer to the doc site.

Customising the mobile model

The package is designed in a modular manner, allowing you to easily add in your own custom SQL as well as incorporate data from events outside of just screen views. For more information on customising the model, please refer to the full doc site. A dummy example of a dbt project with customisations applied can also be found in the package repo.