
Welcome to the Composable CDP with Predictive ML Modeling accelerator. This accelerator guides you through building an advanced Composable Customer Data Platform (CDP), using a dedicated tool for each step of the process:

  • Snowplow for creating user behavioral data from your product.
  • Databricks Delta Lake or Snowflake to store the data.
  • Databricks and MLflow, or Snowpark, for training and running ML models that predict each user's likelihood of conversion.
  • Census activation platform to synchronize the audience segment with marketing tools (like Braze, Salesforce and Facebook Ads) and accelerate their conversion into qualified leads.

Once finished, you will be able to use predictive models to achieve a competitive advantage from customer behavior data on your website, driving higher return on ad spend.


Data loaders like Fivetran can play an important role in a composable CDP. This accelerator does not go into detail on setting them up; see Fivetran’s documentation if you want to learn more.

Who is this guide for?

  • Data scientists who would like to learn how Snowplow behavioral data can be used to build predictive ML models
  • Data practitioners who want to learn how to activate Snowplow behavioral data in third-party tools

What you will learn

In approximately 1 working day (~6 working hours) you can achieve the following:

  • Build a predictive model - Build a machine learning model that can accurately predict conversion events using features collected from Snowplow’s out-of-the-box modelled data
  • Data activation - With Census connected to your rich user data, your marketing teams can build new audiences and sync them to the destinations they need
  • Next steps - Productionalize your ML model and visualize ad campaign performance synced from your Census audiences

Timeline: 1. Predict (2h), then 2. Activate (1h), then 3. Next Steps (3h)


Complete our Advanced Analytics for Web accelerator if you don’t have any Snowplow modelled web data in your warehouse yet. You don’t need a working Snowplow pipeline, because a sample events dataset is provided. Alternatively, you can skip that accelerator altogether and upload the three derived tables (snowplow_web_page_views, snowplow_web_sessions, snowplow_web_users) directly from the downloadable CSV files. For instructions on how to do this, check out the official Databricks documentation or Snowflake documentation.

Predictive ML Modelling

  • Snowplow modelled web data (page views, sessions and users) stored in your data warehouse
  • Conversion events, which can be derived from a Snowplow tracked event or from other sources such as Salesforce data
  • Databricks or Snowflake account and a user with access to create schemas and tables
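
As a rough illustration of the conversion-events prerequisite, the sketch below turns a list of conversion records into a 0/1 label per user, which is the target a predictive model would be trained on. Everything here is an assumption made for illustration: the `ConversionEvent` class, the `label_users` helper, the `request_demo` event name, the use of `domain_userid` as the join key, and the toy data. In practice this step would more likely be a join in your warehouse.

```python
from dataclasses import dataclass


@dataclass
class ConversionEvent:
    """Hypothetical conversion record, e.g. a tracked event or a CRM row."""
    domain_userid: str  # assumed join key between users and conversions
    event_name: str


def label_users(user_ids, conversion_events, conversion_name="request_demo"):
    """Return {user_id: 1 or 0}; 1 means the user has a matching conversion."""
    converted = {
        e.domain_userid
        for e in conversion_events
        if e.event_name == conversion_name
    }
    return {uid: int(uid in converted) for uid in user_ids}


# Toy data for illustration only; not taken from the sample dataset.
users = ["u1", "u2", "u3"]
events = [
    ConversionEvent("u2", "request_demo"),
    ConversionEvent("u3", "newsletter_signup"),  # not the chosen conversion
]

labels = label_users(users, events)
print(labels)
```

Only events matching the chosen conversion name count as a positive label, so widening or narrowing the definition of "conversion" is a one-argument change.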

Data Activation

  • Snowplow modelled web data (page views, sessions and users) stored in your data warehouse
  • Set user_id in the tracker to your business user identifier (see docs) so that the user can be identified and connected to your Census destinations
  • Census account and a user with admin role

What you will build

A propensity-to-convert ML model that lets you target marketing campaigns at the website visitors who are most likely to convert.
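
To make the idea of a propensity score concrete, here is a minimal sketch: a logistic regression trained with plain gradient descent, then used to rank visitors by predicted likelihood of conversion. It is not the model built in this accelerator, which uses Databricks with MLflow or Snowpark. The two features (page views and engaged minutes), the training rows, and the function names are all made up for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(rows, labels, lr=0.05, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def propensity(x, weights, bias):
    """Predicted probability of conversion for one feature row."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Made-up training rows: [page_views, engaged_minutes] per visitor.
train_x = [[1, 0.5], [2, 1.0], [3, 1.5], [8, 6.0], [10, 7.5], [12, 9.0]]
train_y = [0, 0, 0, 1, 1, 1]  # 1 = converted

weights, bias = train_logistic(train_x, train_y)

# Score two hypothetical visitors and rank them, highest propensity first.
visitors = {"casual": [2, 0.8], "engaged": [11, 8.0]}
scores = {name: propensity(x, weights, bias) for name, x in visitors.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The ranked list is what an activation tool would consume: an audience can be defined as the visitors whose score exceeds a chosen threshold.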