` tag. If you wish, you can also override the title with a custom value: **JavaScript (tag):** ```javascript snowplow('trackPageView', { title: 'my custom page title' }); ``` **Browser (npm):** ```javascript import { trackPageView } from '@snowplow/browser-tracker'; trackPageView({ title: 'my custom page title' }); ``` *** ## Page view ID and `web_page` entity When the tracker loads on a page, or when `trackPageView()` is called, it generates a new page view UUID. This page view ID is attached to **all events** tracked on that page, as a [`web_page` entity](/docs/events/ootb-data/page-and-screen-view-events/#web-page-entity), until the next page view event is tracked. At that point, a new page view ID is generated and used for all subsequent events. From version 3, the `web_page` entity is [enabled by default](/docs/sources/web-trackers/tracker-setup/initialization-options/). We advise you leave this enabled so you can use the [Snowplow Unified Digital](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/) dbt package. By default, the tracker lazily generates the page view ID when it's needed for the first time. For example: 1. The tracker is initialized when the page loads 2. `trackPageView()` is called for the first time 3. Tracker calls `getPageViewId()` to get the current page view ID 4. Since there is no page view ID yet, it generates a new one 5. The page view event is tracked with the generated page view ID 6. Subsequent events use the same page view ID The next time `trackPageView()` is called, the tracker generates a new page view ID. If you track an event before the first page view, the tracker will generate the page view ID at that point: 1. The tracker is initialized when the page loads 2. A custom event is tracked 3. Tracker calls `getPageViewId()` to get the current page view ID 4. Since there is no page view ID yet, it generates a new one 5. The custom event is tracked with the generated page view ID 6. Subsequent events, including the first page view, use the same page view ID The second page view will generate a new page view ID as usual. You can get the current page view ID using [the `getPageViewId` function](/docs/sources/web-trackers/tracking-events/#page-view-id). ### Change ID behavior for SPAs In a single-page app (SPA), the tracker stays in memory as the user navigates between URLs. If you track events after navigating to a new URL but before calling `trackPageView()`, those events will use the existing page view ID from the previous page. The ID only resets when `trackPageView()` is called. Use the `preservePageViewIdForUrl` configuration option to bind the page view ID generation to URL changes instead of page view events. The options are: - `false` (default): URL changes don't trigger a new ID when reading `getPageViewId()`. Only `trackPageView()` triggers a new ID (on second+ call). - `true` or `'full'`: generate a new ID when reading `getPageViewId()` if the full URL changed. `trackPageView()` still generates a new ID on second+ call even for the same URL. - `'pathname'`: generate a new ID when reading `getPageViewId()` if pathname changed. Search params or fragment changes don't trigger a new ID. - `'pathnameAndSearch'`: generate a new ID when reading `getPageViewId()` if pathname or search params changed. Fragment changes don't trigger a new ID. - `preservePageViewId`: never regenerate the ID at all. Ignores `preservePageViewIdForUrl`. You can set these options at initialization or during runtime: **JavaScript (tag):** ```javascript // At initialization snowplow('newTracker', 'sp', 'collector.example.com', { appId: 'my-app', preservePageViewIdForUrl: 'pathname' }); // At runtime snowplow('preservePageViewIdForUrl', 'pathname'); ``` **Browser (npm):** ```javascript // At initialization import { newTracker, preservePageViewIdForUrl } from '@snowplow/browser-tracker'; const tracker = newTracker('sp', 'collector.example.com', { appId: 'my-app', preservePageViewIdForUrl: 'pathname' }); // At runtime tracker.preservePageViewIdForUrl('pathname'); ``` *** ## Reset page activity on page view By default, tracking a page view using `trackPageView()`resets [activity tracking](/docs/sources/web-trackers/tracking-events/activity-page-pings/). Read more about this in the [activity tracking documentation](/docs/sources/web-trackers/tracking-events/activity-page-pings/#reset-page-pings-on-page-view). ## Add entities dynamically As with all `trackX` methods, `trackPageView` can be passed an array of [custom entities](/docs/sources/web-trackers/custom-tracking-using-schemas/) as an additional parameter. Additionally, you can add entities to page view and [page ping](/docs/sources/web-trackers/tracking-events/activity-page-pings/) events dynamically using the `contextCallback` option. Pass it a function that returns an array of zero or more entities. The function will fire for the page view and for all subsequent [page pings](/docs/sources/web-trackers/tracking-events/activity-page-pings/) on the page. The returned entities will be added to the events. For example: **JavaScript (tag):** ```javascript // Turn on page pings every 10 seconds snowplow('enableActivityTracking', { minimumVisitLength: 10, heartbeatDelay: 10 }); snowplow('trackPageView', { // The usual array of static entities context: [{ schema: 'iglu:com.acme/static_context/jsonschema/1-0-0', data: { staticValue: new Date().toString() } }], // Function which returns an array of custom entities // Gets called once per page view / page ping contextCallback: function() { return [{ schema: 'iglu:com.acme/dynamic_context/jsonschema/1-0-0', data: { dynamicValue: new Date().toString() } }]; } }); ``` **Browser (npm):** ```javascript import { enableActivityTracking, trackPageView } from '@snowplow/browser-tracker'; // Turn on page pings every 10 seconds enableActivityTracking({ minimumVisitLength: 10, heartbeatDelay: 10 }); trackPageView({ // The usual array of static entities context: [{ schema: 'iglu:com.acme/static_context/jsonschema/1-0-0', data: { staticValue: new Date().toString() } }], // Function which returns an array of custom entities // Gets called once per page view / page ping contextCallback: function() { return [{ schema: 'iglu:com.acme/dynamic_context/jsonschema/1-0-0', data: { dynamicValue: new Date().toString() } }]; } }); ``` *** In this example, the tracked page view and every subsequent page ping will have both a `static_context` and a `dynamic_context` attached. The `static_context` will all have the same `staticValue`, but the `dynamic_context` will have different `dynamicValue` values. --- # Track Privacy Sandbox browser data with the web trackers > Capture Privacy Sandbox data including Topics API information for privacy-preserving interest-based advertising and content personalization. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/privacy-sandbox/ > **Warning:** Privacy Sandbox has been deprecated by Google. This plugin remains available but the underlying browser API may no longer be supported. The plugin allows for adding Privacy Sandbox [Topics API](https://developer.chrome.com/docs/privacy-sandbox/topics/overview/) data to your Snowplow tracking. To learn more about Privacy Sandbox visit the official [website](https://www.privacysandbox.com/). > **Note:** Some of the APIs and data will not be available by default in all users. This is commonly due to these APIs being dependent on browser support, user privacy preferences, browser feature-flags or ad-blocking software. The plugin will not modify or request access explicitly to any of these features if not available by default. > **Note:** The plugin is available since version 3.14 of the tracker. The Privacy Sandbox entity is **automatically tracked** once configured. ## Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ❌ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-privacy-sandbox@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-privacy-sandbox@latest/dist/index.umd.min.js) (latest) | **Note:** The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-privacy-sandbox@latest/dist/index.umd.min.js", ["snowplowPrivacySandbox", "PrivacySandboxPlugin"] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-privacy-sandbox` - `yarn add @snowplow/browser-plugin-privacy-sandbox` - `pnpm add @snowplow/browser-plugin-privacy-sandbox` ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { PrivacySandboxPlugin } from '@snowplow/browser-plugin-privacy-sandbox'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ PrivacySandboxPlugin() ], }); ``` *** ## Context entity Adding this plugin will automatically capture [this](https://github.com/snowplow/iglu-central/blob/master/schemas/com.google.privacysandbox/topics/jsonschema/1-0-0) entity. --- # Track screen views on web > Track screen views with mobile-style tracking on web including screen engagement metrics for time spent and scroll depth analysis. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/screen-views/ Screen view tracking is the recommended way to track users opening a screen in mobile apps. They are the default option for tracking views in our mobile trackers. Check out the [screen tracking overview page](/docs/events/ootb-data/page-and-screen-view-events/) for more details and schemas. On web, [we recommend using page views](/docs/sources/web-trackers/tracking-events/page-views/) to track users visiting a page. However, using the screen view tracking plugin is also an option on web. > **Note:** The plugin is available from Version 4.2 of the tracker. Screen tracking events must be **manually tracked**. The plugin will add the relevant entities automatically. ## Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ❌ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-screen-tracking@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-screen-tracking@latest/dist/index.umd.min.js) (latest) | **Browser (npm):** - `npm install @snowplow/browser-plugin-screen-tracking` - `yarn add @snowplow/browser-plugin-screen-tracking` - `pnpm add @snowplow/browser-plugin-screen-tracking` *** In order to make use of the plugin, you will need to register it with the tracker: **JavaScript (tag):** ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-screen-tracking@latest/dist/index.umd.min.js', ['snowplowScreenTracking', 'ScreenTrackingPlugin'] ); ``` **Browser (npm):** ```javascript import { ScreenTrackingPlugin } from '@snowplow/browser-plugin-screen-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ ScreenTrackingPlugin() ], }); ``` *** ## Track a screen view event To track a [screen view event](/docs/events/ootb-data/page-and-screen-view-events/#screen-views), use the `trackScreenView` function. **JavaScript (tag):** This example shows the required properties only. ```javascript window.snowplow( 'trackScreenView', { name: 'my-screen-name', id: '5d79770b-015b-4af8-8c91-b2ed6faf4b1e', // generated automatically if not provided } ); ``` **Browser (npm):** This example shows the required properties only. ```javascript import { trackScreenView } from '@snowplow/browser-plugin-screen-tracking'; trackScreenView({ name: 'my-screen-name', id: '5d79770b-015b-4af8-8c91-b2ed6faf4b1e', // generated automatically if not provided }); ``` *** ## Screen entity By default, the tracker will automatically attach [a `screen` entity](/docs/events/ootb-data/page-and-screen-view-events/#screen-entity) to **all** events tracked. However, if you haven't tracked any screen views, no `screen` entity will be attached. The `screen` entity contains information about the last screen viewed by the user, based on the last `trackScreenView` call. **JavaScript (tag):** ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-screen-tracking@latest/dist/index.umd.min.js', ['snowplowScreenTracking', 'ScreenTrackingPlugin'], [ { screenContext: true, // enabled by default } ] ); ``` **Browser (npm):** ```javascript import { ScreenTrackingPlugin } from '@snowplow/browser-plugin-screen-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ ScreenTrackingPlugin({ screenContext: true, // enabled by default }) ], }); ``` *** ## Screen engagement tracking By default, the screen tracking plugin enables [screen engagement tracking](/docs/events/ootb-data/page-activity-tracking/#screen-engagement). The tracker will automatically track a screen end event, with `screen_summary` entity, just before tracking a new screen view event. Because this feature was designed for mobile platforms, not all the functionality is applicable to web. The page doesn't distinguish between foreground and background, so the `foreground_sec` time in the `screen_summary` entity will track the total time spent on the screen. The `background_sec` field will always be `null`. **JavaScript (tag):** ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-screen-tracking@latest/dist/index.umd.min.js', ['snowplowScreenTracking', 'ScreenTrackingPlugin'], [ { screenEngagementAutotracking: true, // enabled by default } ] ); ``` **Browser (npm):** ```javascript import { ScreenTrackingPlugin } from '@snowplow/browser-plugin-screen-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ ScreenTrackingPlugin({ screenEngagementAutotracking: true, // enabled by default }) ], }); ``` *** ### Updating list item view and scroll depth information To update the [list item viewed and scroll depth information](/docs/events/ootb-data/page-activity-tracking/#screen-engagement) tracked in the `screen_summary` entity, track the `ListItemView` and `ScrollChanged` events with this information. > **Note:** We designed the screen summary entity to work with mobile platforms. On web, you can track scroll offsets automatically using [page pings](/docs/sources/web-trackers/tracking-events/activity-page-pings/). > > For fine-grained tracking of page element visibility, consider using the [element visibility tracking plugin](/docs/sources/web-trackers/tracking-events/element-tracking/) instead. When tracked and `screenEngagementAutotracking` is enabled, the tracker won't send these events to the Collector, but will process the information into the next screen summary entity. This means that tracking these events by themselves has no effect if you don't also track screen views. If you've set `screenEngagementAutotracking: false`, the list item view and scroll depth events are treated as regular events and sent to the Collector. You may want to track the events every time a new list item is viewed on the screen, or whenever the scroll position changes. To update the list items viewed information: **JavaScript (tag):** ```javascript window.snowplow( 'trackListItemView', { index: 1, itemsCount: 10, } ); ``` **Browser (npm):** ```javascript import { trackListItemView } from '@snowplow/browser-plugin-screen-tracking'; trackListItemView({ index: 1, itemsCount: 10, }); ``` *** To update the scroll depth information: **JavaScript (tag):** ```javascript window.snowplow( 'trackScrollChanged', { yOffset: 10, xOffset: 20, viewHeight: 100, viewWidth: 200, contentHeight: 300, contentWidth: 400, } ); ``` **Browser (npm):** ```javascript import { trackScrollChanged } from '@snowplow/browser-plugin-screen-tracking'; trackScrollChanged({ yOffset: 10, xOffset: 20, viewHeight: 100, viewWidth: 200, contentHeight: 300, contentWidth: 400, }); ``` *** --- # Track sessions on web > Automatically track session information including session IDs, event indexes, and references to previous sessions with configurable session timeouts. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/session/ The web tracker can add a context entity to events with information about the current session. The context entity repeats some of the session information stored in canonical event properties (e.g. `domain_userid`, `domain_sessionid`), but also adds new information. It adds a reference to the previous session (`previousSessionId`) and first event in the current session (`firstEventId`, `firstEventTimestamp`). It also adds an index of the event in the session, useful for ordering events as they were tracked (`eventIndex`). > **Note:** Because session tracking requires the session cookie, no session context entities will be added to events if anonymous tracking is enabled. The default session expiry time is 30 minutes. ## Tracking the session entity The session entity is **automatically tracked** once [configured](/docs/sources/web-trackers/tracker-setup/initialization-options/). **Client session entity properties** The [client\_session](https://github.com/snowplow/iglu-central/tree/master/schemas/com.snowplowanalytics.snowplow/client_session/jsonschema/1-0-2/) context entity consists of the following properties: | Attribute | Description | Required? | | --------------------- | ------------------------------------------------------------------------------------------------------------- | --------- | | `userId` | An identifier for the user of the session (same as `domain_userid`). | Yes | | `sessionId` | An identifier (UUID) for the session (same as `domain_sessionid`). | Yes | | `sessionIndex` | The index of the current session for this user (same as `domain_sessionidx`). | Yes | | `eventIndex` | Optional index of the current event in the session. Signifies the order of events in which they were tracked. | No | | `previousSessionId` | The previous session identifier (UUID) for this user. | No | | `storageMechanism` | The mechanism that the session information has been stored on the device. | Yes | | `firstEventId` | The optional identifier (UUID) of the first event id for this session. | No | | `firstEventTimestamp` | Optional date-time timestamp of when the first event in the session was tracked. | No | > **Note:** Please note that the session context entity is only available since version 3.5 of the tracker. ## Starting a new session The tracker automatically tracks session information and implements session timeouts after which the session is reset. However, in case you want to trigger a new session manually (e.g., after logging out a user), you can use the `newSession` call to do that. This will expire the current session and start a new session. **JavaScript (tag):** ```javascript snowplow('newSession'); ``` **Browser (npm):** ```javascript import { newSession } from '@snowplow/browser-tracker'; newSession(); ``` *** ## On session update callback The `onSessionUpdateCallback` option, allows you to supply a callback function to be executed whenever a new session is generated on the tracker. The callback's signature is: `(clientSession: ClientSession) => void` where clientSession includes the same values as you would expect on the [`client_session`](http://iglucentral.com/schemas/com.snowplowanalytics.snowplow/client_session/jsonschema/1-0-2) context. _Note:_ The callback is **not** called whenever a session is expired, but only when a new one is generated. > **Note:** Please note that the session update callback is only available since version 3.11 of the tracker. ## Session cookie duration Whenever an event fires, the tracker creates a session cookie. If the cookie didn’t previously exist, the tracker interprets this as the start of a new session. By default the session cookie expires after 30 minutes. This means that a user leaving the site and returning in under 30 minutes does not change the session. You can override this default by setting `sessionCookieTimeout` to a duration (in seconds) in the initial tracker configuration object. For example, ```json { ... sessionCookieTimeout: 3600 ... } ``` would set the session cookie lifespan to an hour. --- # Track site search on web > Track internal site search queries with search terms, filters, and result counts to analyze user search behavior and content discovery. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/site-search/ Site search tracking are provided as part of the Site tracking plugin. This plugin also provides events for [social media interactions](/docs/sources/web-trackers/tracking-events/social-media/) and [timings](/docs/sources/web-trackers/tracking-events/timings/generic/). Site search events must be **manually tracked**. ## Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js) (latest) | **Note:** The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js", ["snowplowSiteTracking", "SiteTrackingPlugin"] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-site-tracking` - `yarn add @snowplow/browser-plugin-site-tracking` - `pnpm add @snowplow/browser-plugin-site-tracking` ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { SiteTrackingPlugin, trackSiteSearch } from '@snowplow/browser-plugin-site-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ SiteTrackingPlugin() ], }); ``` *** ## Event Use the `trackSiteSearch` method to track users searching your website. Here are its arguments: | **Name** | **Required?** | **Description** | **Type** | | -------------- | ------------- | ------------------------------- | -------- | | `terms` | Yes | Search terms | array | | `filters` | No | Search filters | JSON | | `totalResults` | No | Results found | number | | `pageResults` | No | Results displayed on first page | number | An example: **JavaScript (tag):** ```javascript snowplow('trackSiteSearch', { terms: ['unified', 'log'], filters: {'category': 'books', 'sub-category': 'non-fiction'}, totalResults: 14, pageResults: 8 }); ``` **Browser (npm):** ```javascript import { trackSiteSearch } from '@snowplow/browser-plugin-site-tracking'; trackSiteSearch({ terms: ['unified', 'log'], filters: {'category': 'books', 'sub-category': 'non-fiction'}, totalResults: 14, pageResults: 8 }); ``` *** Site search events are implemented as Snowplow self-describing events. [Here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/site_search/jsonschema/1-0-0) is the schema for a `site_search` event. --- # Track social media interactions with the web trackers > Track user interactions with social media widgets including likes, shares, and retweets across Facebook, Twitter, and other platforms. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/social-media/ Social interaction tracking is provided as part of the Site tracking plugin. This plugin also provides events for [site search](/docs/sources/web-trackers/tracking-events/site-search/) and [timings](/docs/sources/web-trackers/tracking-events/timings/generic/). Social media interaction events must be **manually tracked**. ## Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js) (latest) | **Note:** The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js", ["snowplowSiteTracking", "SiteTrackingPlugin"] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-site-tracking` - `yarn add @snowplow/browser-plugin-site-tracking` - `pnpm add @snowplow/browser-plugin-site-tracking` ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { SiteTrackingPlugin, trackSiteSearch } from '@snowplow/browser-plugin-site-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ SiteTrackingPlugin() ], }); ``` *** ## Event Social tracking is used to track the way users interact with Facebook, Twitter and Google + widgets, e.g. to capture "like this" or "tweet this" events. The `trackSocialInteraction` method takes three parameters: | **Parameter** | **Description** | **Required?** | **Example value** | | ------------- | ------------------------------------------------------------- | ------------- | --------------------- | | `action` | Social action performed | Yes | 'like', 'retweet' | | `network` | Social network | Yes | 'facebook', 'twitter' | | `target` | Object social action is performed on e.g. page ID, product ID | No | '19.99' | The method is executed in as: **JavaScript (tag):** ```javascript snowplow('trackSocialInteraction', { action: string, network: string, target: string }); ``` For example: ```javascript snowplow('trackSocialInteraction', { action: 'like', network: 'facebook', target: 'pbz00123' }); ``` **Browser (npm):** ```javascript import { trackSocialInteraction } from '@snowplow/browser-plugin-site-tracking'; trackSocialInteraction({ action: string, network: string, target: string }); ``` For example: ```javascript import { trackSocialInteraction } from '@snowplow/browser-plugin-site-tracking'; trackSocialInteraction({ action: 'like', network: 'facebook', target: 'pbz00123' }); ``` *** --- # Track timezone and geolocation on web > Capture user timezone information and geolocation coordinates with accuracy data through browser APIs for location-based analysis. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/timezone-geolocation/ Track users' timezone and geolocation with these configuration options. ## Timezone Since version 4 of the JavaScript tracker, the tracker automatically captures the user's timezone. It populates the `os_timezone` [atomic field](/docs/fundamentals/canonical-event/). This feature uses the `Intl.DateTimeFormat` function in modern browsers. If you're using an older version of the tracker, or your users use older browsers, use the timezone plugin to capture timezone information. It uses the `jstimezonedetect` library to determine the user's timezone and populate the `os_timezone` field. The timezone property is **automatically tracked**. ### Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ❌ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-timezone@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-timezone@latest/dist/index.umd.min.js) (latest) | **Note:** The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-timezone@latest/dist/index.umd.min.js", ["snowplowTimezone", "TimezonePlugin"] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-timezone` - `yarn add @snowplow/browser-plugin-timezone` - `pnpm add @snowplow/browser-plugin-timezone` ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { TimezonePlugin } from '@snowplow/browser-plugin-timezone'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ TimezonePlugin() ], }); ``` *** Once configured, all subsequent events will contain this property. ## Geolocation If this plugin is enabled, the tracker will attempt to create a entity from the visitor's geolocation information. If the visitor hasn't already given or denied the website permission to use their geolocation information, a prompt will appear. If they give permission, then all events from that moment on will include their geolocation information, as a context entity. If the geolocation entity isn't enabled at tracker initialization, you can enable it at a later time by calling `enableGeolocationContext`. This is useful if you have other areas of your site where you require requesting geolocation access, as you can defer enabling this on your Snowplow events until you have permission to read the users geolocation for your other use case. For more information on the geolocation API, see [the specification](http://dev.w3.org/geo/api/spec-source.html). Check out the [geolocation tracking overview](/docs/events/ootb-data/geolocation/) for the entity schema. The geolocation entity is **automatically tracked** once configured. ### Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ❌ | | `sp.lite.js` | ❌ | This plugin was included in `sp.js` in version 3, but removed from the default bundle in version 4. **Download:** | | | | ------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDeliv](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-geolocation@latest/dist/index.umd.min.js)[r](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-consent@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-geolocation@latest/dist/index.umd.min.js) (latest) | **Note:** The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. To enable after initialization: ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-geolocation@latest/dist/index.umd.min.js", ["snowplowGeolocation", "GeolocationPlugin"], ); // Enable when appropriate e.g. after user consents // This prompts the browser for location permission window.snowplow('enableGeolocationContext'); ``` To enable at initialization: ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-geolocation@latest/dist/index.umd.min.js", ["snowplowGeolocation", "GeolocationPlugin"], [true] // Prompts for permission immediately ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-geolocation` - `yarn add @snowplow/browser-plugin-geolocation` - `pnpm add @snowplow/browser-plugin-geolocation` To enable after initialization: ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { GeolocationPlugin, enableGeolocationContext } from '@snowplow/browser-plugin-geolocation'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ GeolocationPlugin() ], // Inactive by default }); // Enable when appropriate e.g. after user consents // This prompts the browser for location permission enableGeolocationContext(); ``` To enable at initialization: ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { GeolocationPlugin } from '@snowplow/browser-plugin-geolocation'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ GeolocationPlugin(true) ], // Prompts for permission immediately }); ``` *** > **Note:** There's no API to turn off geolocation tracking once enabled. If you have multiple tracker instances on your site, you can choose to enable geolocation tracking on a per-tracker basis. Provide a list of tracker namespaces to the `enableGeolocationContext` function. **JavaScript (tag):** ```javascript window.snowplow('enableGeolocationContext', ['sp1']); ``` **Browser (npm):** ```javascript enableGeolocationContext(['sp1']); ``` *** --- # Track generic site timings on web > Track custom timing events to measure how long resources take to load or user actions take to complete with category and label organization. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/timings/generic/ Use the `trackTiming` method to track user timing events such as how long resources take to load. This method is provided as part of the `site-tracking` plugin. Timing events must be **manually tracked**. ## Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js) (latest) | **Note:** The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-site-tracking@latest/dist/index.umd.min.js", ["snowplowSiteTracking", "SiteTrackingPlugin"] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-site-tracking` - `yarn add @snowplow/browser-plugin-site-tracking` - `pnpm add @snowplow/browser-plugin-site-tracking` ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { SiteTrackingPlugin, trackSiteSearch } from '@snowplow/browser-plugin-site-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ SiteTrackingPlugin() ], }); ``` *** ## Timing event Use the `trackTiming` method to track how long something took. Here are its arguments: | **Name** | **Required?** | **Description** | **Type** | | ---------- | ------------- | ------------------------------ | -------- | | `category` | Yes | Timing category | string | | `variable` | Yes | Timed variable | string | | `timing` | Yes | Number of milliseconds elapsed | number | | `label` | No | Label for the event | string | An example: **JavaScript (tag):** ```javascript snowplow('trackTiming', { category: 'load', variable: 'map_loaded', timing: 50, label: 'Map loading time' }); ``` **Browser (npm):** ```javascript import { trackTiming } from '@snowplow/browser-plugin-site-tracking'; trackTiming({ category: 'load', variable: 'map_loaded', timing: 50, label: 'Map loading time' }); ``` *** Timing events are implemented as Snowplow self describing events. [Here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/timing/jsonschema/1-0-0) is the schema for a `timing` event. --- # Track performance navigation timings on web > Track page load performance metrics using the PerformanceNavigationTiming API including DNS lookup, connection times, and resource loading duration. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/timings/ To add performance timing context entities to your Snowplow tracking, use this Performance navigation timing plugin. By default all its metrics are relative to the page load rather than absolute time stamps, making it easy to analyze and aggregate. To learn more about the properties tracked, you can visit the [specification](https://www.w3.org/TR/navigation-timing-2/) or MDN [documentation site](https://developer.mozilla.org/en-US/docs/Web/API/PerformanceNavigationTiming). The following diagram shows the ResourceTiming and PerformanceNavigationTiming properties and how they connect to the navigation of the page main document. ![performance navigation timeline](/assets/images/performance_navigation_timeline-2abc2f691ceff27976a60f6e68678be2.png) _Performance navigation timeline from the [W3C specification](https://www.w3.org/TR/navigation-timing-2/)._ > **Note:** The plugin is available since version 3.10 of the tracker. Adding this plugin will automatically capture [this](https://github.com/snowplow/iglu-central/blob/master/schemas/org.w3/PerformanceNavigationTiming/jsonschema/1-0-0) context entity. Performance navigation timing context entities are **automatically tracked** once configured. You can also create and track general timing events using the [site tracking plugin](/docs/sources/web-trackers/tracking-events/timings/generic/). ## Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-performance-navigation-timing@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-performance-navigation-timing@latest/dist/index.umd.min.js) (latest) | **Note:** The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-performance-navigation-timing@latest/dist/index.umd.min.js", ["snowplowPerformanceNavigationTiming", "PerformanceNavigationTimingPlugin"] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-performance-navigation-timing` - `yarn add @snowplow/browser-plugin-performance-navigation-timing` - `pnpm add @snowplow/browser-plugin-performance-navigation-timing` ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { PerformanceNavigationTimingPlugin } from '@snowplow/browser-plugin-performance-navigation-timing'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ PerformanceNavigationTimingPlugin() ], }); ``` *** ## More detailed analysis for Single Page Applications (SPA) As these metrics are primarily related to the initial page serve and load, after the window\.onload handler ends the metrics will likely stay static for the life of the SPA page. If there's a pattern to the API requests the SPA makes for new content, the [PerformanceObserver](https://developer.mozilla.org/en-US/docs/Web/API/PerformanceObserver) could be used to capture the network metrics for that request from the performance API and re-use the schema. For actual rendering performance, the application will have to use the PerformanceMark/PerformanceMeasure [User timing](https://developer.mozilla.org/en-US/docs/Web/API/Performance_API/User_timing) APIs. These allow custom timing milestones, but since it's completely custom and there are no common conventions for using these APIs, currently there is no automatic support in the tracker for tracking these. Therefore we encourage you to build your custom schema in case you believe you would benefit from these additional metrics. ## Performance timing plugin (original) > **Warning:** This plugin has been deprecated and superseded by the Performance Navigation Timing plugin described above. This older plugin has been superseded by the Performance Navigation Timing plugin, which has a newer API and additional metrics such as the compressed/decompressed page size, and information about the navigation that can contextualise cache usage that can impact the measured metrics, as well as server-side metrics, etc. If this context entity is enabled, the JavaScript Tracker will use the create a context JSON from the `window.performance.timing` object, along with the Chrome `firstPaintTime` field (renamed to `chromeFirstPaint`) if it exists. This data can be used to calculate page performance metrics. Note that if you fire a page view event as soon as the page loads, the `domComplete`, `loadEventStart`, `loadEventEnd`, and `chromeFirstPaint` metrics in the Navigation Timing API may be set to zero. This is because those properties are only known once all scripts on the page have finished executing. **Details** Advanced PerformanceTiming usage The `domComplete`, `loadEventStart`, and `loadEventEnd` metrics in the NavigationTiming API are set to 0 until after every script on the page has finished executing, including sp.js. This means that the corresponding fields in the PerformanceTiming reported by the tracker will be 0. To get around this limitation, you can wrap all Snowplow code in a `setTimeout` call: ```javascript setTimeout(function () { // Load Snowplow and call tracking methods here }, 0); ``` This delays its execution until after those NavigationTiming fields are set. Additionally the `redirectStart`, `redirectEnd`, and `secureConnectionStart` are set to 0 if there is no redirect or a secure connection is not requested. For more information on the Navigation Timing API, see [the specification](http://www.w3.org/TR/2012/REC-navigation-timing-20121217/#sec-window.performance-attribute). Performance timing context entities are **automatically tracked** once configured. The schema is [here](https://github.com/snowplow/iglu-central/blob/master/schemas/org.w3/PerformanceTiming/jsonschema/1-0-0). ### Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ❌ | | `sp.lite.js` | ❌ | This plugin was included in `sp.js` in version 3, but removed in version 4 in favor of the performance navigation timing plugin. **Download:** | | | | ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-performance-timing@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-performance-timing@latest/dist/index.umd.min.js) (latest) | **Note:** The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-performance-timing@latest/dist/index.umd.min.js", ["snowplowPerformanceTiming", "PerformanceTimingPlugin"] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-performance-timing` - `yarn add @snowplow/browser-plugin-performance-timing` - `pnpm add @snowplow/browser-plugin-performance-timing` ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { PerformanceTimingPlugin } from '@snowplow/browser-plugin-performance-timing'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ PerformanceTimingPlugin() ], }); ``` *** --- # Track core web vitals performance metrics with the web trackers > Automatically track Core Web Vitals metrics including LCP, FID, and CLS using the web-vitals library for user experience analysis. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/web-vitals/ The plugin adds the capability to track web performance metrics categorized as [Web Vitals](https://web.dev/vitals/). These metrics are tracked with an event based on the [web\_vitals schema](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/web_vitals/jsonschema/). To make sure it collects the most complete values for the web vital metrics, the plugin uses a set of browser APIs to detect and send the event as the visitor is leaving the current page in the browser. To collect the web vitals data, the plugin loads the [web-vitals](https://github.com/GoogleChrome/web-vitals) open source library dynamically on your page. > **Note:** The plugin is available since version 3.13 of the tracker. Web vitals events are **automatically tracked** once configured. ## Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-web-vitals@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-web-vitals@latest/dist/index.umd.min.js) (latest) | ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-web-vitals@latest/dist/index.umd.min.js', ['snowplowWebVitals', 'WebVitalsPlugin'] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-web-vitals` - `yarn add @snowplow/browser-plugin-web-vitals` - `pnpm add @snowplow/browser-plugin-web-vitals` ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { WebVitalsPlugin } from '@snowplow/browser-plugin-web-vitals'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ WebVitalsPlugin() ], }); ``` *** ## Configuration The Web Vitals plugin can be initialized with a couple of options allowing for customizing its behavior: | Option | Type | Description | Default value | | --------------------- | ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------- | | `loadWebVitalsScript` | `boolean` | Should the plugin immediately load the Core Web Vitals measurement script from UNPKG CDN. | `true` | | `webVitalsSource` | `string` | The URL endpoint the Web Vitals script should be loaded from. Defaults to the UNPKG CDN. | `https://unpkg.com/web-vitals@4/dist/web-vitals.iife.js` | | `context` | `DynamicContext` | Context entities to add to the tracked event. Can be provided either as an array of self-describing JSONs or function that returns a context entity. (available from version 3.19) | | **JavaScript (tag):** Pass configuration options as an array in the fourth parameter of `addPlugin`: ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-web-vitals@latest/dist/index.umd.min.js', ['snowplowWebVitals', 'WebVitalsPlugin'], [{ loadWebVitalsScript: true, webVitalsSource: 'https://cdn.jsdelivr.net/npm/web-vitals@3/dist/web-vitals.iife.js', context: [ function(vitals) { return { schema: 'iglu:com.example/page_performance/jsonschema/1-0-0', data: { lcpRating: vitals.lcp < 2500 ? 'good' : 'needs-improvement' } }; } ] }] ); ``` **Browser (npm):** Pass configuration options directly to the plugin function: ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { WebVitalsPlugin } from '@snowplow/browser-plugin-web-vitals'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ WebVitalsPlugin({ loadWebVitalsScript: true, webVitalsSource: 'https://cdn.jsdelivr.net/npm/web-vitals@3/dist/web-vitals.iife.js', context: [ function(vitals) { return { schema: 'iglu:com.example/page_performance/jsonschema/1-0-0', data: { lcpRating: vitals.lcp < 2500 ? 'good' : 'needs-improvement' } }; } ] }) ], }); ``` *** ### Using an already existing Web Vitals library source There could be cases where your page or one of the loaded JavaScript bundles already include the [web-vitals](https://github.com/GoogleChrome/web-vitals) library. In those cases, there is no need to load it an additional time from the plugin. If this is the case, you have to make sure that the library APIs are exposed in the `window` object properly as well. ### Choosing a Web Vitals measurement source The default Web Vitals measurement script is loaded from the [UNPKG](https://www.unpkg.com/) CDN. This choice is chosen as a default but you should consider your own setup when choosing the script source. Selecting a script source from a CDN which might already be used in your website might save you from yet another connection startup time (_Queueing_,_DNS lookup_,_TCP_, _SSL_). Another reasonable choice could be [jsDelivr](https://cdn.jsdelivr.net/npm/web-vitals@3/dist/web-vitals.iife.js). --- # Track WebViews and hybrid applications using the web trackers > Integrate web tracking in hybrid mobile apps by forwarding events from web views to native iOS, Android, or React Native trackers. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/webview/ This plugin provides integration for hybrid apps using the Snowplow native mobile ([Android and iOS v6.1+](/docs/sources/mobile-trackers/hybrid-apps/)) or [React Native v4.2+](/docs/sources/react-native-tracker/hybrid-apps/) trackers. Hybrid apps are mobile apps that in addition to a native interface, provide part of the UI through an embedded web view. If your web app will run both separately and as part of a hybrid app, with this plugin you only need one tracking implementation for your web app. You still have to implement the mobile trackers too. When the plugin is active, for every event the web tracker checks if it's running in a web view with at least one of the mobile interfaces available. If it is, the web tracker forwards the event to the mobile tracker, and doesn't track it itself. If not, the web tracker tracks the event as normal. > **Note:** The plugin is available since version 4.3 of the tracker. The WebView integration is **automatic** once configured. The diagram below shows the interaction of the WebView plugin and mobile trackers in hybrid apps. ```mermaid flowchart TB subgraph hybridApp[Hybrid Mobile App] subgraph webView[Web View] webViewCode[App logic] webViewTracker[JS tracker with WebView plugin] webViewCode -- "Tracks events" --> webViewTracker end subgraph nativeCode[Native Code] nativeAppCode[App logic] nativeTracker[Snowplow iOS/Android/React Native tracker] nativeAppCode -- "Tracks events" --> nativeTracker end webViewTracker -- "Forwards events" --> nativeTracker end subgraph cloud[Cloud] collector[Snowplow Collector] end nativeTracker -- "Sends tracked events" --> collector ``` To use the WebView plugin, you must have a mobile tracker initialized and configured. The supported trackers are: - Android v6.1+ - iOS v6.1+ - React Native v4.2+ How to set up hybrid app tracking: 1. Implement the Snowplow [iOS, Android](/docs/sources/mobile-trackers/), or [React Native](/docs/sources/react-native-tracker/) tracker in your mobile codebase. 2. Create a web view based on your web app, with Snowplow web tracking and WebView plugin instrumented. 3. Subscribe to the web view. Read how to do this for [native mobile](/docs/sources/mobile-trackers/hybrid-apps/) or [React Native](/docs/sources/react-native-tracker/hybrid-apps/). 4. Track events from web and mobile. This plugin uses the [Snowplow WebView tracker](/docs/sources/webview-tracker/) as a dependency. ## What do the forwarded events look like? The forwarded hybrid events will have all the information tracked by the web tracker. This includes all entities, whether configured by the tracker [automatically](/docs/sources/web-trackers/tracking-events/#add-contextual-data-with-entities) or by you as a global context. Baked-in (non-entity) properties such as user agent or URL are also included. Additionally, any configured mobile entities will also be added. Again, this includes [auto-tracked entities](/docs/sources/mobile-trackers/tracking-events/#auto-tracked-events-and-entities) such as the screen, session, or platform entities, as well as any global context entities. The forwarded events will have the web tracker version, e.g. "js-4.3.0", but the `namespace` and `appId` from the mobile tracker. Hybrid events are still compatible with the Snowplow [Unified Digital dbt model](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/). ## Install plugin **JavaScript (tag):** | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ❌ | | `sp.lite.js` | ❌ | **Download:** | | | | ------------------------------------------- | --------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [GitHub Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-webview@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-webview@latest/dist/index.umd.min.js) (latest) | ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-webview@latest/dist/index.umd.min.js', ['snowplowWebViewTracking', 'WebViewPlugin'] ); ``` **Browser (npm):** - `npm install @snowplow/browser-plugin-webview` - `yarn add @snowplow/browser-plugin-webview` - `pnpm add @snowplow/browser-plugin-webview` ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { WebViewPlugin } from '@snowplow/browser-plugin-webview'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ WebViewPlugin() ], }); ``` *** ## Configuration By default, the plugin will forward events to the default initialized Snowplow tracker on each platform. To specify a different tracker instance, or multiple trackers, pass in a list of tracker namespaces at setup. **JavaScript (tag):** ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-webview@latest/dist/index.umd.min.js', ["snowplowWebViewTracking", "WebViewPlugin"], [ { trackerNamespaces: ["sp1", "sp2"], } ] ); ``` **Browser (npm):** ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { WebViewPlugin } from '@snowplow/browser-plugin-webview'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ WebViewPlugin({ trackerNamespaces: ['sp1', 'sp2'] }) ], }); ``` *** > **Warning:** When a mobile interface is available, the web tracker will forward the event rather than tracking it. If you specify namespaces in the configuration, and no mobile trackers actually exist with those namespaces, the event will be lost. --- # Adjust webhook > Track mobile attribution data including app installation and reattribution events from Adjust callbacks using the Iglu webhook adapter. > Source: https://docs.snowplow.io/docs/sources/webhooks/adjust-webhook/ [Adjust](https://www.adjust.com/) provides a mobile attribution platform. It enables users to track what marketing channels drive mobile app installations. Many Snowplow users who have mobile apps use Adjust to capture the sources (marketing channels) that drive app downloads. By integrating Adjust with Snowplow, it is straightforward to: 1. Have your mobile attribution data (at the app installation event-level), with the rest of your event-level data, in Snowplow 2. Join app installation data collected from Adjust, with other user event data collected through Snowplow from mobile, web or other platforms, in order to analyze the performance of those marketing channels and return on ad spend. This is typically used to drive future ad spend. Integrating Adjust with Snowplow is straightforward. In this guide we will walk through the process. ## How the integration works: an overview The integration uses the [Adjust Callback](https://docs.adjust.com/en/callbacks/). We will configure this to send an app install event to the Snowplow collector when an app is installed, with the attribution data attached. In addition, we will set this up to send a reattribution event every time this occurs in Adjust to Snowplow. Again, this will contain all the data we need to figure out what drove an earlier app installation. On the Snowplow side, we will use the [Iglu webhook adapter](/docs/sources/webhooks/iglu-webhook/) to ensure that the event is correctly received and processed. We will create a schema for the event that matches the structure of the data sent from Adjust. The guide is illustrative. One of the nice things about Adjust is that it gives you a huge amount of flexibility about what data you want to send into Snowplow and how. One of the great things about Snowplow is that it is flexible enough to work with very many structures of data, just so long as it knows the schema that the data adheres to. So there are lots of ways you can adapt the following setup. It should be a good start for most users however. ## Step-by-step guide ### 1. Setup the Adjust webhook Log into Adjust and navigate to Settings -> Events screen in the dashboard. Hover over the install event, click the “edit” icon that appears and enter the following, substituting your own collector URL for `mycollector.mydomain.com`: ```markup http://mycollector.mydomain.com/com.snowplowanalytics.iglu/v1?schema=iglu%3Acom.adjust%2Finstall%2Fjsonschema%2F1-0-0&app_id={app_id}&app_name={app_name}&app_name_dashboard={app_name_dashboard}&store={store}&tracker={tracker}&tracker_name={tracker_name}&network_name={network_name}&campaign_name={campaign_name}&adgroup_name={adgroup_name}&creative_name={creative_name}&impression_based={impression_based}&is_organic={is_organic}&gclid={gclid}&rejection_reason={rejection_reason}&click_referer={click_referer}&click_attribution_window={click_attribution_window}&impression_attribution_window={impression_attribution_window}&reattribution_attribution_window={reattribution_attribution_window}&inactive_user_definition={inactive_user_definition}&adid={adid}&idfa={idfa}&android_id={android_id}&android_id_md5={android_id_md5}&mac_sha1={mac_sha1}&mac_md5={mac_md5}&idfa-android-id={idfa||android_id}&idfa-or-gps-adid={idfa||gps_adid}&idfa_md5={idfa_md5}&idfa_md5_hex={idfa_md5_hex}&idfv={idfv}&gps_adid={gps_adid}&gps_adid_md5={gps_adid_md5}&win_udid={win_udid}&win_hwid={win_hwid}&win_naid={win_naid}&win_adid={win_adid}&match_type={match_type}&reftag={reftag}&referrer={referrer}&user_agent={user_agent}&ip_address={ip_address}&click_time={click_time}&engagement_time={engagement_time}&installed_at={installed_at}&installed_at_hour={installed_at_hour}&created_at={created_at}&reattributed_at={reattributed_at}&connection_type={connection_type}&isp={isp}&city={city}&country={country}&language={language}&device_name={device_name}&device_type={device_type}&os_name={os_name}&api_level={api_level}&sdk_version={sdk_version}&os_version={os_version}&environment={environment}&tracking_enabled={tracking_enabled}&timezone={timezone}&fb_campaign_group_name={fb_campaign_group_name}&fb_campaign_group_id={fb_campaign_group_id}&fb_campaign_name={fb_campaign_name}&fb_campaign_id={fb_campaign_id}&fb_adgroup_name={fb_adgroup_name}&fb_adgroup_id={fb_adgroup_id}&tweet_id={tweet_id}&twitter_line_item_id={twitter_line_item_id}&label={label} ``` We are collecting the data by adding key-value pairs to request in the format `data_point={data_point}`. The first part of each pair is the name of the field in the collected event and it must match a property name in your schema. The second part, in curly braces, is a placeholder that Adjust replaces with an actual value before making the callback. For example, `app_id={app_id}` will be transformed to something like `app_id=C013FJP3WF`. In our example webhook setup above we’re using the schema that is defined further down in this post, but you are a free to use any custom compatible schema. There is no requirement that the key in each key-value pair must match the spelling of the respective Adjust placeholder, so your schema can have properties with names that better match your own business logic, e.g. `adjust_tracker={tracker}`. We have elected to grab _all_ the data that Adjust makes available with an install event (at the time of writing). For a complete list of data Adjust can send, see the [Adjust placeholder documentation](https://partners.adjust.com/placeholders/). At the end of this post we’ll describe how you can tailor your setup to grab just a subset. ### 2. Create the corresponding jsonschema In order for Snowplow to process the data sent to the Iglu webhook, we need to schema it. _Since first publishing this guide, we’ve added an Adjust `install` event schema to our [Iglu Central](https://github.com/snowplow/iglu-central) schema repository. You can find it [here](https://github.com/snowplow/iglu-central/blob/master/schemas/com.adjust/install/jsonschema/1-0-0). We recommend using that schema for out-of-the-box tracking of all the data points associated with an `install` event (at the time the schema was written), rather than the example schema used further down. If you’re using Redshift you might also find the [table definition](https://github.com/snowplow/iglu-central/blob/master/sql/com.adjust/install_1.sql) and [JSONpaths file](https://github.com/snowplow/iglu-central/blob/master/jsonpaths/com.adjust/install_1.json) from Iglu Central useful. For Snowflake and BigQuery users, the loader app in the pipeline will figure those out itself._ _Refer to the rest of this guide to see how you can write your own custom schemas._ Upload the following schema to your Iglu repo as `com.adjust.snowplow/install/jsonschema/1-0-0`: ```json { "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#", "description": "Schema for Adjust install event", "self": { "vendor": "com.adjust.snowplow", "name": "install", "format": "jsonschema", "version": "1-0-0" }, "type": "object", "properties": { "app_id": { "type": "string" }, "app_name": { "type": "string", "maxLength": 256 }, "app_name_dashboard": { "type": "string", "maxLength": 256 }, "store": { "type": "string", "maxLength": 256 }, "tracker": { "type": "string", "maxLength": 256 }, "tracker_name": { "type": "string", "maxLength": 256 }, "network_name": { "type": "string", "maxLength": 256 }, "campaign_name": { "type": "string", "maxLength": 256 }, "adgroup_name": { "type": "string", "maxLength": 256 }, "creative_name": { "type": "string", "maxLength": 256 }, "impression_based": { "type": "string", "maxLength": 256 }, "is_organic": { "type": "string", "maxLength": 256 }, "gclid": { "type": "string", "maxLength": 256 }, "rejection_reason": { "type": "string", "maxLength": 256 }, "click_referer": { "type": "string", "maxLength": 256 }, "click_attribution_window": { "type": "string", "maxLength": 256 }, "impression_attribution_window": { "type": "string", "maxLength": 256 }, "reattribution_attribution_window": { "type": "string", "maxLength": 256 }, "inactive_user_definition": { "type": "string", "maxLength": 256 }, "adid": { "type": "string", "maxLength": 256 }, "idfa": { "type": "string", "maxLength": 256 }, "android_id": { "type": "string", "maxLength": 256 }, "android_id_md5": { "type": "string", "maxLength": 256 }, "mac_sha1": { "type": "string", "maxLength": 256 }, "mac_md5": { "type": "string", "maxLength": 256 }, "idfa-android-id": { "type": "string", "maxLength": 256 }, "idfa-or-gps-adid": { "type": "string", "maxLength": 256 }, "idfa_md5": { "type": "string", "maxLength": 256 }, "idfa_md5_hex": { "type": "string", "maxLength": 256 }, "idfv": { "type": "string", "maxLength": 256 }, "gps_adid": { "type": "string", "maxLength": 256 }, "gps_adid_md5": { "type": "string", "maxLength": 256 }, "win_udid": { "type": "string", "maxLength": 256 }, "win_hwid": { "type": "string", "maxLength": 256 }, "win_naid": { "type": "string", "maxLength": 256 }, "win_adid": { "type": "string", "maxLength": 256 }, "match_type": { "type": "string", "maxLength": 256 }, "reftag": { "type": "string", "maxLength": 256 }, "referrer": { "type": "string", "maxLength": 256 }, "user_agent": { "type": "string", "maxLength": 256 }, "ip_address": { "type": "string", "maxLength": 256 }, "click_time": { "type": "string", "maxLength": 256 }, "engagement_time": { "type": "string", "maxLength": 256 }, "installed_at": { "type": "string", "maxLength": 256 }, "installed_at_hour": { "type": "string", "maxLength": 256 }, "created_at": { "type": "string", "maxLength": 256 }, "reattributed_at": { "type": "string", "maxLength": 256 }, "connection_type": { "type": "string", "maxLength": 256 }, "isp": { "type": "string", "maxLength": 256 }, "city": { "type": "string", "maxLength": 256 }, "country": { "type": "string", "maxLength": 256 }, "language": { "type": "string", "maxLength": 256 }, "device_name": { "type": "string", "maxLength": 256 }, "device_type": { "type": "string", "maxLength": 256 }, "os_name": { "type": "string", "maxLength": 256 }, "api_level": { "type": "string", "maxLength": 256 }, "sdk_version": { "type": "string", "maxLength": 256 }, "os_version": { "type": "string", "maxLength": 256 }, "environment": { "type": "string", "maxLength": 256 }, "tracking_enabled": { "type": "string", "maxLength": 256 }, "timezone": { "type": "string", "maxLength": 256 }, "fb_campaign_group_name": { "type": "string", "maxLength": 256 }, "fb_campaign_group_id": { "type": "string", "maxLength": 256 }, "fb_campaign_name": { "type": "string", "maxLength": 256 }, "fb_campaign_id": { "type": "string", "maxLength": 256 }, "fb_adgroup_name": { "type": "string", "maxLength": 256 }, "fb_adgroup_id": { "type": "string", "maxLength": 256 }, "tweet_id": { "type": "string", "maxLength": 256 }, "twitter_line_item_id": { "type": "string", "maxLength": 256 }, "label": { "type": "string", "maxLength": 256 } }, "additionalProperties": true } ``` We have included a field for each placeholder that we included in our Adjust callback. Although here we’ve used the same spelling for property names as the placeholders, there is no requirement to do so in your own schema. ### 3. Create the table in Redshift Create a corresponding Redshift table for the schema. We recommend autogenerating this [Schema Guru](https://github.com/snowplow/schema-guru), e.g. by executing the following in the root of your schema registry: ```bash /path/to/schema-guru-0.6.2 ddl --with-json-paths schemas/com.adjust.snowplow/install ``` Or with [Igluctl](/docs/api-reference/iglu/igluctl-2/): ```bash /path/to/igluctl static generate --with-json-paths schemas/com.adjust.snowplow/install ``` In our case this auto-generated the following table definition. (_Please note that this example might be outdated. Since first publishing this guide, Redshift has introduced the more efficient ZSTD encoding for column compression, which we have adopted as standard in Igluctl._) ```sql CREATE TABLE IF NOT EXISTS atomic.com_adjust_snowplow_install_1 ( "schema_vendor" VARCHAR(128) ENCODE RUNLENGTH NOT NULL, "schema_name" VARCHAR(128) ENCODE RUNLENGTH NOT NULL, "schema_format" VARCHAR(128) ENCODE RUNLENGTH NOT NULL, "schema_version" VARCHAR(128) ENCODE RUNLENGTH NOT NULL, "root_id" CHAR(36) ENCODE RAW NOT NULL, "root_tstamp" TIMESTAMP ENCODE LZO NOT NULL, "ref_root" VARCHAR(255) ENCODE RUNLENGTH NOT NULL, "ref_tree" VARCHAR(1500) ENCODE RUNLENGTH NOT NULL, "ref_parent" VARCHAR(255) ENCODE RUNLENGTH NOT NULL, "adgroup_name" VARCHAR(256) ENCODE LZO, "adid" VARCHAR(256) ENCODE LZO, "android_id" VARCHAR(256) ENCODE LZO, "android_id_md5" VARCHAR(256) ENCODE LZO, "api_level" VARCHAR(256) ENCODE LZO, "app_id" VARCHAR(4096) ENCODE LZO, "app_name" VARCHAR(256) ENCODE LZO, "app_name_dashboard" VARCHAR(256) ENCODE LZO, "campaign_name" VARCHAR(256) ENCODE LZO, "city" VARCHAR(256) ENCODE LZO, "click_attribution_window" VARCHAR(256) ENCODE LZO, "click_referer" VARCHAR(256) ENCODE LZO, "click_time" VARCHAR(256) ENCODE LZO, "connection_type" VARCHAR(256) ENCODE LZO, "country" VARCHAR(256) ENCODE LZO, "created_at" VARCHAR(256) ENCODE LZO, "creative_name" VARCHAR(256) ENCODE LZO, "device_name" VARCHAR(256) ENCODE LZO, "device_type" VARCHAR(256) ENCODE LZO, "engagement_time" VARCHAR(256) ENCODE LZO, "environment" VARCHAR(256) ENCODE LZO, "fb_adgroup_id" VARCHAR(256) ENCODE LZO, "fb_adgroup_name" VARCHAR(256) ENCODE LZO, "fb_campaign_group_id" VARCHAR(256) ENCODE LZO, "fb_campaign_group_name" VARCHAR(256) ENCODE LZO, "fb_campaign_id" VARCHAR(256) ENCODE LZO, "fb_campaign_name" VARCHAR(256) ENCODE LZO, "gclid" VARCHAR(256) ENCODE LZO, "gps_adid" VARCHAR(256) ENCODE LZO, "gps_adid_md5" VARCHAR(256) ENCODE LZO, "idfa" VARCHAR(256) ENCODE LZO, "idfa_android_id" VARCHAR(256) ENCODE LZO, "idfa_or_gps_adid" VARCHAR(256) ENCODE LZO, "idfa_md5" VARCHAR(256) ENCODE LZO, "idfa_md5_hex" VARCHAR(256) ENCODE LZO, "idfv" VARCHAR(256) ENCODE LZO, "impression_attribution_window" VARCHAR(256) ENCODE LZO, "impression_based" VARCHAR(256) ENCODE LZO, "inactive_user_definition" VARCHAR(256) ENCODE LZO, "installed_at" VARCHAR(256) ENCODE LZO, "installed_at_hour" VARCHAR(256) ENCODE LZO, "ip_address" VARCHAR(256) ENCODE LZO, "is_organic" VARCHAR(256) ENCODE LZO, "isp" VARCHAR(256) ENCODE LZO, "label" VARCHAR(256) ENCODE LZO, "language" VARCHAR(256) ENCODE LZO, "mac_md5" VARCHAR(256) ENCODE LZO, "mac_sha1" VARCHAR(256) ENCODE LZO, "match_type" VARCHAR(256) ENCODE LZO, "network_name" VARCHAR(256) ENCODE LZO, "os_name" VARCHAR(256) ENCODE LZO, "os_version" VARCHAR(256) ENCODE LZO, "reattributed_at" VARCHAR(256) ENCODE LZO, "reattribution_attribution_window" VARCHAR(256) ENCODE LZO, "referrer" VARCHAR(256) ENCODE LZO, "reftag" VARCHAR(256) ENCODE LZO, "rejection_reason" VARCHAR(256) ENCODE LZO, "sdk_version" VARCHAR(256) ENCODE LZO, "store" VARCHAR(256) ENCODE LZO, "timezone" VARCHAR(256) ENCODE LZO, "tracker" VARCHAR(256) ENCODE LZO, "tracker_name" VARCHAR(256) ENCODE LZO, "tracking_enabled" VARCHAR(256) ENCODE LZO, "tweet_id" VARCHAR(256) ENCODE LZO, "twitter_line_item_id" VARCHAR(256) ENCODE LZO, "user_agent" VARCHAR(256) ENCODE LZO, "win_adid" VARCHAR(256) ENCODE LZO, "win_hwid" VARCHAR(256) ENCODE LZO, "win_naid" VARCHAR(256) ENCODE LZO, "win_udid" VARCHAR(256) ENCODE LZO, FOREIGN KEY (root_id) REFERENCES atomic.events(event_id) ) DISTSTYLE KEY DISTKEY (root_id) SORTKEY (root_tstamp); COMMENT ON TABLE com_adjust_snowplow_install_1 IS 'iglu:com.adjust.snowplow/install/jsonschema/1-0-0'; ``` ### 4. Create the correpsonding JSONpaths file Finally add the following JSONpaths file to your jsonpaths folder (as `com.adjust.snowplow/install_1.json`). Your JSONpaths file should have already been auto-generated using schema-guru or Igluctl: ```json { "jsonpaths": [ "$.schema.vendor", "$.schema.name", "$.schema.format", "$.schema.version", "$.hierarchy.rootId", "$.hierarchy.rootTstamp", "$.hierarchy.refRoot", "$.hierarchy.refTree", "$.hierarchy.refParent", "$.data.adgroup_name", "$.data.adid", "$.data.android_id", "$.data.android_id_md5", "$.data.api_level", "$.data.app_id", "$.data.app_name", "$.data.app_name_dashboard", "$.data.campaign_name", "$.data.city", "$.data.click_attribution_window", "$.data.click_referer", "$.data.click_time", "$.data.connection_type", "$.data.country", "$.data.created_at", "$.data.creative_name", "$.data.device_name", "$.data.device_type", "$.data.engagement_time", "$.data.environment", "$.data.fb_adgroup_id", "$.data.fb_adgroup_name", "$.data.fb_campaign_group_id", "$.data.fb_campaign_group_name", "$.data.fb_campaign_id", "$.data.fb_campaign_name", "$.data.gclid", "$.data.gps_adid", "$.data.gps_adid_md5", "$.data.idfa", "$.data.idfa-android-id", "$.data.idfa-or-gps-adid", "$.data.idfa_md5", "$.data.idfa_md5_hex", "$.data.idfv", "$.data.impression_attribution_window", "$.data.impression_based", "$.data.inactive_user_definition", "$.data.installed_at", "$.data.installed_at_hour", "$.data.ip_address", "$.data.is_organic", "$.data.isp", "$.data.label", "$.data.language", "$.data.mac_md5", "$.data.mac_sha1", "$.data.match_type", "$.data.network_name", "$.data.os_name", "$.data.os_version", "$.data.reattributed_at", "$.data.reattribution_attribution_window", "$.data.referrer", "$.data.reftag", "$.data.rejection_reason", "$.data.sdk_version", "$.data.store", "$.data.timezone", "$.data.tracker", "$.data.tracker_name", "$.data.tracking_enabled", "$.data.tweet_id", "$.data.twitter_line_item_id", "$.data.user_agent", "$.data.win_adid", "$.data.win_hwid", "$.data.win_naid", "$.data.win_udid" ] } ``` ### 5. Extending the setup for reattribution events To extend the setup for reattribution events you work through the same process as for installation events: #### 5.1 Setup the callback E.g. by creating the following callback for a reattribution event: ```text http://mycollector.mydomain.com/com.snowplowanalytics.iglu/v1?schema=iglu%3Acom.adjust.snowplow%2Freattribute%2Fjsonschema%2F1-0-0&app_id={app_id}&app_name={app_name}&app_name_dashboard={app_name_dashboard}&store={store}&tracker={tracker}&tracker_name={tracker_name}&last_tracker={last_tracker}&last_tracker_name={last_tracker_name}&network_name={network_name}&campaign_name={campaign_name}&adgroup_name={adgroup_name}&creative_name={creative_name}&impression_based={impression_based}&is_organic={is_organic}&gclid={gclid}&click_attribution_window={click_attribution_window}&impression_attribution_window={impression_attribution_window}&reattribution_attribution_window={reattribution_attribution_window}&inactive_user_definition={inactive_user_definition}&adid={adid}&idfa={idfa}&android_id={android_id}&android_id_md5={android_id_md5}&mac_sha1={mac_sha1}&mac_md5={mac_md5}&idfa-android-id={idfa||android_id}&idfa-or-gps-adid={idfa||gps_adid}&idfa_md5={idfa_md5}&idfa_md5_hex={idfa_md5_hex}&idfv={idfv}&gps_adid={gps_adid}&gps_adid_md5={gps_adid_md5}&win_udid={win_udid}&win_hwid={win_hwid}&win_naid={win_hwid}&win_adid={win_adid}&match_type={match_type}&reftag={reftag}&referrer={referrer}&user_agent={user_agent}&ip_address={ip_address}&click_time={click_time}&engagement_time={engagement_time}&installed_at={installed_at}&installed_at_hour={installed_at_hour}&created_at={created_at}&reattributed_at={reattributed_at}&connection_type={connection_type}&isp={isp}&city={city}&country={country}&language={language}&device_name={device_name}&device_type={device_type}&os_name={os_name}&api_level={api_level}&sdk_version={sdk_version}&os_version={os_version}&environment={environment}&tracking_enabled={tracking_enabled}&timezone={timezone}&time_spent={time_spent}&lifetime_session_count={lifetime_session_count}&deeplink={deeplink}&fb_campaign_group_name={fb_campaign_group_name}&fb_campaign_group_id={fb_campaign_group_id}&fb_campaign_name={fb_campaign_name}&fb_campaign_id={fb_campaign_id}&fb_adgroup_name={fb_adgroup_name}&fb_adgroup_id={fb_adgroup_id}&tweet_id={tweet_id}&twitter_line_item_id={twitter_line_item_id}&label={label} ``` As before, keep in mind that there is no requirement that the key in each key-value pair must match the spelling of the associated Adjust placeholder. For example, you might decide to use a property name for the `{impression_based}` placeholder that makes it clearer that this field contains a Boolean value and makes the naming more consistent with other Boolean fields: `is_impression_based={impression_based}`. The only requirement is that the keys must match the property names in your schema. #### 5.2 Create a corresponding jsonschema for the reattribution event E.g.: ```json { "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#", "description": "Schema for Adjust reattribute event", "self": { "vendor": "com.adjust.snowplow", "name": "reattribute", "format": "jsonschema", "version": "1-0-0" }, "type": "object", "properties": { "app_id": { "type": "string" }, "app_name": { "type": "string", "maxLength": 1024 }, "app_name_dashboard": { "type": "string", "maxLength": 1024 }, "store": { "type": "string", "maxLength": 1024 }, "tracker": { "type": "string", "maxLength": 1024 }, "tracker_name": { "type": "string", "maxLength": 1024 }, "last_tracker": { "type": "string", "maxLength": 1024 }, "last_tracker_name": { "type": "string", "maxLength": 1024 }, "network_name": { "type": "string", "maxLength": 1024 }, "campaign_name": { "type": "string", "maxLength": 1024 }, "adgroup_name": { "type": "string", "maxLength": 1024 }, "creative_name": { "type": "string", "maxLength": 1024 }, "impression_based": { "type": "string", "maxLength": 1024 }, "is_organic": { "type": "string", "maxLength": 1024 }, "gclid": { "type": "string", "maxLength": 1024 }, "click_attribution_window": { "type": "string", "maxLength": 1024 }, "impression_attribution_window": { "type": "string", "maxLength": 1024 }, "reattribution_attribution_window": { "type": "string", "maxLength": 1024 }, "inactive_user_definition": { "type": "string", "maxLength": 1024 }, "adid": { "type": "string", "maxLength": 1024 }, "idfa": { "type": "string", "maxLength": 1024 }, "android_id": { "type": "string", "maxLength": 1024 }, "android_id_md5": { "type": "string", "maxLength": 1024 }, "mac_sha1": { "type": "string", "maxLength": 1024 }, "mac_md5": { "type": "string", "maxLength": 1024 }, "idfa-android-id": { "type": "string", "maxLength": 1024 }, "idfa-or-gps-adid": { "type": "string", "maxLength": 1024 }, "idfa_md5": { "type": "string", "maxLength": 1024 }, "idfa_md5_hex": { "type": "string", "maxLength": 1024 }, "idfv": { "type": "string", "maxLength": 1024 }, "gps_adid": { "type": "string", "maxLength": 1024 }, "gps_adid_md5": { "type": "string", "maxLength": 1024 }, "win_udid": { "type": "string", "maxLength": 1024 }, "win_hwid": { "type": "string", "maxLength": 1024 }, "win_naid": { "type": "string", "maxLength": 1024 }, "win_adid": { "type": "string", "maxLength": 1024 }, "match_type": { "type": "string", "maxLength": 1024 }, "reftag": { "type": "string", "maxLength": 1024 }, "referrer": { "type": "string", "maxLength": 1024 }, "user_agent": { "type": "string", "maxLength": 1024 }, "ip_address": { "type": "string", "maxLength": 1024 }, "click_time": { "type": "string", "maxLength": 1024 }, "engagement_time": { "type": "string", "maxLength": 1024 }, "installed_at": { "type": "string", "maxLength": 1024 }, "installed_at_hour": { "type": "string", "maxLength": 1024 }, "created_at": { "type": "string", "maxLength": 1024 }, "reattributed_at": { "type": "string", "maxLength": 1024 }, "connection_type": { "type": "string", "maxLength": 1024 }, "isp": { "type": "string", "maxLength": 1024 }, "city": { "type": "string", "maxLength": 1024 }, "country": { "type": "string", "maxLength": 1024 }, "language": { "type": "string", "maxLength": 1024 }, "device_name": { "type": "string", "maxLength": 1024 }, "device_type": { "type": "string", "maxLength": 1024 }, "os_name": { "type": "string", "maxLength": 1024 }, "api_level": { "type": "string", "maxLength": 1024 }, "sdk_version": { "type": "string", "maxLength": 1024 }, "os_version": { "type": "string", "maxLength": 1024 }, "environment": { "type": "string", "maxLength": 1024 }, "tracking_enabled": { "type": "string", "maxLength": 1024 }, "timezone": { "type": "string", "maxLength": 1024 }, "time_spent": { "type": "string", "maxLength": 1024 }, "fb_campaign_group_name": { "type": "string", "maxLength": 1024 }, "fb_campaign_group_id": { "type": "string", "maxLength": 1024 }, "fb_campaign_name": { "type": "string", "maxLength": 1024 }, "fb_campaign_id": { "type": "string", "maxLength": 1024 }, "fb_adgroup_name": { "type": "string", "maxLength": 1024 }, "fb_adgroup_id": { "type": "string", "maxLength": 1024 }, "tweet_id": { "type": "string", "maxLength": 1024 }, "twitter_line_item_id": { "type": "string", "maxLength": 1024 }, "label": { "type": "string", "maxLength": 1024 } }, "additionalProperties": true } ``` #### 5.3 Implement the corresponding SQL table definitions and jsonpaths Do this using [Schema Guru](https://github.com/snowplow/schema-guru) e.g. the SQL table definition by executing the following from the root of your schema registry: ```text $ /path/to/schema-guru-0.6.2 ddl --with-json-paths schemas/com.adjust.snowplow/reattribute ``` Or with [Igluctl](/docs/api-reference/iglu/igluctl-2/): ```bash $ /path/to/igluctl static generate --with-json-paths schemas/com.adjust.snowplow/reattribute ``` The generated SQL file (_Please note that this example might be outdated. Since first publishing this guide, Redshift has introduced the more efficient ZSTD encoding for column compression, which we have adopted as standard in Igluctl._): ```sql CREATE TABLE IF NOT EXISTS atomic.com_adjust_snowplow_reattribute_1 ( "schema_vendor" VARCHAR(128) ENCODE RUNLENGTH NOT NULL, "schema_name" VARCHAR(128) ENCODE RUNLENGTH NOT NULL, "schema_format" VARCHAR(128) ENCODE RUNLENGTH NOT NULL, "schema_version" VARCHAR(128) ENCODE RUNLENGTH NOT NULL, "root_id" CHAR(36) ENCODE RAW NOT NULL, "root_tstamp" TIMESTAMP ENCODE LZO NOT NULL, "ref_root" VARCHAR(255) ENCODE RUNLENGTH NOT NULL, "ref_tree" VARCHAR(1500) ENCODE RUNLENGTH NOT NULL, "ref_parent" VARCHAR(255) ENCODE RUNLENGTH NOT NULL, "adgroup_name" VARCHAR(1024) ENCODE LZO, "adid" VARCHAR(1024) ENCODE LZO, "android_id" VARCHAR(1024) ENCODE LZO, "android_id_md5" VARCHAR(1024) ENCODE LZO, "api_level" VARCHAR(1024) ENCODE LZO, "app_id" VARCHAR(4096) ENCODE LZO, "app_name" VARCHAR(1024) ENCODE LZO, "app_name_dashboard" VARCHAR(1024) ENCODE LZO, "campaign_name" VARCHAR(1024) ENCODE LZO, "city" VARCHAR(1024) ENCODE LZO, "click_attribution_window" VARCHAR(1024) ENCODE LZO, "click_time" VARCHAR(1024) ENCODE LZO, "connection_type" VARCHAR(1024) ENCODE LZO, "country" VARCHAR(1024) ENCODE LZO, "created_at" VARCHAR(1024) ENCODE LZO, "creative_name" VARCHAR(1024) ENCODE LZO, "device_name" VARCHAR(1024) ENCODE LZO, "device_type" VARCHAR(1024) ENCODE LZO, "engagement_time" VARCHAR(1024) ENCODE LZO, "environment" VARCHAR(1024) ENCODE LZO, "fb_adgroup_id" VARCHAR(1024) ENCODE LZO, "fb_adgroup_name" VARCHAR(1024) ENCODE LZO, "fb_campaign_group_id" VARCHAR(1024) ENCODE LZO, "fb_campaign_group_name" VARCHAR(1024) ENCODE LZO, "fb_campaign_id" VARCHAR(1024) ENCODE LZO, "fb_campaign_name" VARCHAR(1024) ENCODE LZO, "gclid" VARCHAR(1024) ENCODE LZO, "gps_adid" VARCHAR(1024) ENCODE LZO, "gps_adid_md5" VARCHAR(1024) ENCODE LZO, "idfa" VARCHAR(1024) ENCODE LZO, "idfa_android_id" VARCHAR(1024) ENCODE LZO, "idfa_or_gps_adid" VARCHAR(1024) ENCODE LZO, "idfa_md5" VARCHAR(1024) ENCODE LZO, "idfa_md5_hex" VARCHAR(1024) ENCODE LZO, "idfv" VARCHAR(1024) ENCODE LZO, "impression_attribution_window" VARCHAR(1024) ENCODE LZO, "impression_based" VARCHAR(1024) ENCODE LZO, "inactive_user_definition" VARCHAR(1024) ENCODE LZO, "installed_at" VARCHAR(1024) ENCODE LZO, "installed_at_hour" VARCHAR(1024) ENCODE LZO, "ip_address" VARCHAR(1024) ENCODE LZO, "is_organic" VARCHAR(1024) ENCODE LZO, "isp" VARCHAR(1024) ENCODE LZO, "label" VARCHAR(1024) ENCODE LZO, "language" VARCHAR(1024) ENCODE LZO, "last_tracker" VARCHAR(1024) ENCODE LZO, "last_tracker_name" VARCHAR(1024) ENCODE LZO, "mac_md5" VARCHAR(1024) ENCODE LZO, "mac_sha1" VARCHAR(1024) ENCODE LZO, "match_type" VARCHAR(1024) ENCODE LZO, "network_name" VARCHAR(1024) ENCODE LZO, "os_name" VARCHAR(1024) ENCODE LZO, "os_version" VARCHAR(1024) ENCODE LZO, "reattributed_at" VARCHAR(1024) ENCODE LZO, "reattribution_attribution_window" VARCHAR(1024) ENCODE LZO, "referrer" VARCHAR(1024) ENCODE LZO, "reftag" VARCHAR(1024) ENCODE LZO, "sdk_version" VARCHAR(1024) ENCODE LZO, "store" VARCHAR(1024) ENCODE LZO, "time_spent" VARCHAR(1024) ENCODE LZO, "timezone" VARCHAR(1024) ENCODE LZO, "tracker" VARCHAR(1024) ENCODE LZO, "tracker_name" VARCHAR(1024) ENCODE LZO, "tracking_enabled" VARCHAR(1024) ENCODE LZO, "tweet_id" VARCHAR(1024) ENCODE LZO, "twitter_line_item_id" VARCHAR(1024) ENCODE LZO, "user_agent" VARCHAR(1024) ENCODE LZO, "win_adid" VARCHAR(1024) ENCODE LZO, "win_hwid" VARCHAR(1024) ENCODE LZO, "win_naid" VARCHAR(1024) ENCODE LZO, "win_udid" VARCHAR(1024) ENCODE LZO, FOREIGN KEY (root_id) REFERENCES atomic.events(event_id) ) DISTSTYLE KEY DISTKEY (root_id) SORTKEY (root_tstamp); COMMENT ON TABLE atomic.com_adjust_snowplow_reattribute_1 IS 'iglu:com.adjust.snowplow/reattribute/jsonschema/1-0-0'; ``` and the generated jsonpath file: ```text { "jsonpaths": [ "$.schema.vendor", "$.schema.name", "$.schema.format", "$.schema.version", "$.hierarchy.rootId", "$.hierarchy.rootTstamp", "$.hierarchy.refRoot", "$.hierarchy.refTree", "$.hierarchy.refParent", "$.data.adgroup_name", "$.data.adid", "$.data.android_id", "$.data.android_id_md5", "$.data.api_level", "$.data.app_id", "$.data.app_name", "$.data.app_name_dashboard", "$.data.campaign_name", "$.data.city", "$.data.click_attribution_window", "$.data.click_time", "$.data.connection_type", "$.data.country", "$.data.created_at", "$.data.creative_name", "$.data.device_name", "$.data.device_type", "$.data.engagement_time", "$.data.environment", "$.data.fb_adgroup_id", "$.data.fb_adgroup_name", "$.data.fb_campaign_group_id", "$.data.fb_campaign_group_name", "$.data.fb_campaign_id", "$.data.fb_campaign_name", "$.data.gclid", "$.data.gps_adid", "$.data.gps_adid_md5", "$.data.idfa", "$.data.idfa-android-id", "$.data.idfa-or-gps-adid", "$.data.idfa_md5", "$.data.idfa_md5_hex", "$.data.idfv", "$.data.impression_attribution_window", "$.data.impression_based", "$.data.inactive_user_definition", "$.data.installed_at", "$.data.installed_at_hour", "$.data.ip_address", "$.data.is_organic", "$.data.isp", "$.data.label", "$.data.language", "$.data.last_tracker", "$.data.last_tracker_name", "$.data.mac_md5", "$.data.mac_sha1", "$.data.match_type", "$.data.network_name", "$.data.os_name", "$.data.os_version", "$.data.reattributed_at", "$.data.reattribution_attribution_window", "$.data.referrer", "$.data.reftag", "$.data.sdk_version", "$.data.store", "$.data.time_spent", "$.data.timezone", "$.data.tracker", "$.data.tracker_name", "$.data.tracking_enabled", "$.data.tweet_id", "$.data.twitter_line_item_id", "$.data.user_agent", "$.data.win_adid", "$.data.win_hwid", "$.data.win_naid", "$.data.win_udid" ] } ``` You're now all set for capturing reattribution events as well as installation events. ### 6. Tailoring the setup #### 6.1 Maximalist vs minimalist setup The above setup is a maximalist setup: we've decided to grab as much data from Adjust as Adjust make available with their installation and reattribution events. Our corresponding callback, schema, SQL table definition and jsonpath file are all very long. (Because they contain all the different fields.) You can choose to set this up for a more limited set of fields. In this case you'd simply remove the name value pairs from the callback, remove them from the schema, jsonpath and sql table definitions. You might also want to setup Callbacks for other Adjust events, if you want to add more than the attribution data to your Snowplow data set. #### 6.2 Adding custom parameters Adjust supports sending your own [custom parameters](https://docs.adjust.com/en/callbacks/#reference-custom-sdk-parameters) with each callback. To include these custom parameters in your Adjust-Snowplow integration, all you need to do is extend the schemas for each Adjust event types with the custom parameters you want to record into Snowplow. Note that you **do not need to update the webhook itself**. Whilst you do need to list each Adjust placeholder that you want passed into Snowplow on the webhook, Adjust will automatically add every custom parameter you record in it onto the callback. So as long as your event schema includes those custom parameters, they will automatically be fetched. Once you've updated your schema to include the custom parameters, you'll need to similarly make sure that your corresponding Redshift table definition and jsonpath files have the additional parameters in too. ### 7. Integrating callbacks from other third parties Adjust is not the only company to provide event-level data via a Callback or Webhook. Zendesk is another example. The same approach / methodology should work for Zendesk and any similar provider that lets you specify the Callback URL and individual data points passed on the querystring. --- # Iglu webhook > Track self-describing events via GET or POST requests with custom JSON schemas for flexible third-party webhook integrations. > Source: https://docs.snowplow.io/docs/sources/webhooks/iglu-webhook/ This webhook adapter lets you track events sent via a `GET` or `POST` request containing an [Iglu](https://github.com/snowplow/iglu)-compatible event payload. You can use this adapter with vendors who allow you define your **own** event types for "postback" events or custom Webhook events. ## Setup Integrating Iglu-compatible webhooks into Snowplow is a two-stage process: 1. Configure your third-party system to send Iglu-compatible events to Snowplow 2. Setup the appropriate JSON Schema for each Iglu-compatible event you are sending through ## Your webhook The Iglu webhook adapter supports events send in as `GET` and `POST` requests. ### Path We use a special path to tell Snowplow that these events should be parsed as Iglu self-describing JSON events: ```markup http://{{COLLECTOR_URL}}/com.snowplowanalytics.iglu/v1?schema=<iglu schema uri>&... ``` ### Required fields You can send in whatever name-value pairs on the querystring that make sense for your event, but you **must** also include a `schema` parameter, which is set to a valid Iglu self-describing schema URI. The below examples all use a schema available on Iglu Central. However, you will likely want to create your own schema that describes the event for the platform you are receiving data from. The below examples use the `social_interaction` schema: ```text iglu:com.snowplowanalytics.snowplow/social_interaction/jsonschema/1-0-0 ``` ### Optional fields If you want to specify which app these events belong to, add an `aid` parameter as taken from the [Snowplow Tracker Protocol](/docs/fundamentals/canonical-event/#application-fields): ```text ...&aid=<company code>&... ``` You can also manually override the event's `platform` parameter like so: ```text ...&p=<platform code>&... ``` Supported platform codes can again be found in the [Snowplow Tracker Protocol](/docs/fundamentals/canonical-event/#application-fields); if not set, then the value for `platform` will default to `srv` for a server-side application. ### Example `GET` request Here is an example of an Iglu-compatible event sent as a `GET` request, for a [Social Interacton event](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/social_interaction/jsonschema/1-0-0), broken out onto multiple lines to make it easier to read: ```markup http://{{COLLECTOR_URL}}/com.snowplowanalytics.iglu/v1?schema=iglu%3Acom.snowplowanalytics.snowplow%2Fsocial_interaction%2Fjsonschema%2F1-0-0 &aid=mobile-attribution &p=mob &network=twitter &action=retweet ``` This will be converted by the Iglu webhook adapter into a self-describing JSON looking like this: ```json { "schema":"iglu:com.snowplowanalytics.snowplow/social_interaction/jsonschema/1-0-0", "data": { "network": "twitter", "action": "retweet" } } ``` The Snowplow enriched event containing this JSON will include `app_id` set to "mobile-attribution" and `platform` set to "mob". ### Example `POST` requests POST requests can be compiled in three different ways for the Iglu webhook: - As a full Self Describing JSON in the body - With a `?schema=<iglu schema uri>` in the querystring and JSON in the body - As a `x-www-form-urlencoded` payload **NOTE**: For the event to be accepted the `Content-Type` must be either: - `application/json` - `application/json; charset=utf-8` - `application/x-www-form-urlencoded` The below examples are written as `curl` requests as an example, however you can send the events using any tool or technology that supports sending POST requests. To send as a full Self Describing JSON in the body and a Content-Type of `application/json`: ```bash curl --request POST \ --url http://{{COLLECTOR_URL}}/com.snowplowanalytics.iglu/v1 \ --header 'Content-Type: application/json' \ --data '{ "schema":"iglu:com.snowplowanalytics.snowplow/social_interaction/jsonschema/1-0-0", "data": { "network": "twitter", "action": "retweet" } }' ``` To send with a `?schema=<iglu schema uri>` in the querystring and a data JSON in the body and Content-Type of `application/json`: ```bash curl --request POST \ --url 'http://{{COLLECTOR_URL}}/com.snowplowanalytics.iglu/v1?schema=iglu%3Acom.snowplowanalytics.snowplow%2Fsocial_interaction%2Fjsonschema%2F1-0-0' \ --header 'Content-Type: application/json' \ --data '{ "network": "twitter", "action": "retweet" }' ``` To send as a `x-www-form-urlencoded` payload: ```bash curl --request POST \ --url 'http://{{COLLECTOR_URL}}/com.snowplowanalytics.iglu/v1?schema=iglu%3Acom.snowplowanalytics.snowplow%2Fsocial_interaction%2Fjsonschema%2F1-0-0' \ --header 'Content-Type: application/x-www-form-urlencoded' \ --data 'network=twitter&action=retweet' ``` As with the `GET` request above you can also attach extra information into the querystring to help describe your event. The following parameters can be added: - `aid=` : The application ID - `p=` : The platform - `nuid=` : The network user ID - `eid=` : A custom event ID - `ttm=` : The true timestamp - `url=` : The page URL - `cv=` : The context vendor (deprecated) --- # Collect data from third parties with webhooks > Receive third-party event streams through webhooks for Adjust, Iglu, Iterable, MailGun, Mandrill, SendGrid, and Zendesk integrations. > Source: https://docs.snowplow.io/docs/sources/webhooks/ Snowplow enables you to collect events via the webhooks of supported third-party software. Webhooks allow the third-party software to send their own internal event streams to your Snowplow Collector for further processing. Webhooks are sometimes referred to as "streaming APIs" or "HTTP response APIs". | Webhook | Track | | ------------------------------------------------ | -------------------------------------------------------------------- | | [Adjust](/docs/sources/webhooks/adjust-webhook/) | Which marketing channels are driving mobile app installations | | [Iglu](/docs/sources/webhooks/iglu-webhook/) | Any [Iglu](/docs/api-reference/iglu/)-compatible GET or POST request | | [Iterable](/docs/sources/webhooks/iterable/) | Events provided by Iterable | | [MailGun](/docs/sources/webhooks/mailgun/) | Email activity logged by MailGun | | [Mandrill](/docs/sources/webhooks/mandrill/) | Email activity logged by Mandrill | | [SendGrid](/docs/sources/webhooks/sendgrid/) | Email activity logged by SendGrid | | [Zendesk](/docs/sources/webhooks/zendesk/) | Events logged by Zendesk | --- # Iterable webhook > Track email opens, clicks, SMS, push notifications, and other Iterable system webhook events directly into Snowplow. > Source: https://docs.snowplow.io/docs/sources/webhooks/iterable/ This webhook integration lets you track a variety of events provided by Iterable through their [System Webhooks](https://support.iterable.com/hc/en-us/articles/208013936-System-Webhooks). The event types include email opens and clicks, sent SMS or push notifications, and more. ## Configure Iterable to send events to Snowplow First, login to Iterable. Under **Integrations** in the top toolbar, click on **System webhooks**. Then select **Create webhook** above the list of webhooks on the right and enter the endpoint URL. The endpoint URL field is the URI to your Snowplow Collector. We use a special path to tell Snowplow that these events are generated by Iterable: ```markup https://<collector host>/com.snowplowanalytics.iglu/v1?schema=iglu%3Acom.iterable%2Fsystem_webhook%2Fjsonschema%2F1-0-0 ``` You may want to set the snowplow `aid=` parameter in your URL query string to the company for which this webhook has been configured; this is the `app_id` parameter taken from the [Snowplow Tracker Protocol](/docs/events/), however this is optional. The company is also contained in the received messages. If you want, you can also manually override the event's `platform` parameter by appending a query string to the end of the URL, in combination or separately with aid above. Here is what the final URL would look like for a configured aid and platform: ```markup https://<collector host>/com.snowplowanalytics.iglu/v1?schema=iglu%3Acom.iterable%2Fsystem_webhook%2Fjsonschema%2F1-0-0&aid=<company>&p=<platform code> ``` Supported platform codes can again be found in the [Snowplow Tracker Protocol](/docs/fundamentals/canonical-event/#application-fields); if not set, then the value for `platform` will default to `srv` for a server-side application. Having entered the endpoint URL, click on **Create**. This will open a form where you can set up authentication of the callbacks – leave that to "None". In the bottom part of the page, you will be able to select which events to subscribe to as shown in this screenshot: ![](/assets/images/iterable-events-810x1024-9c02a317d11b6989116049079f224476.png) That’s it – when you save the configuration, Iterable events should automatically flow through into your data warehouse. --- # MailGun webhook > Track MailGun email events including delivered messages, hard bounces, spam complaints, unsubscribes, clicks, and opens. > Source: https://docs.snowplow.io/docs/sources/webhooks/mailgun/ This webhook integration lets you track a variety of events logged by [Mailgun](https://www.mailgun.com/). Available events are: - Delivered Messages - Dropped Messages - Hard Bounces - Spam Complaints - Unsubscribes - Clicks - Opens ## Compatibility - [R97 Knossos](https://github.com/snowplow/snowplow/releases/tag/r97-knossos)+ (`POST`-capable collectors only) - [Mailgun webhook API](https://documentation.mailgun.com/en/latest/user_manual.html#webhooks) ## Setup Integrating Mailgun's webhooks into Snowplow is a two-stage process: 1. Configure Mailgun to send events to Snowplow 2. (Optional) Create the Mailgun events tables into Amazon Redshift ## Mailgun First login to Mailgun. Select **Webhooks** from the top panel. ![](/assets/images/mailgun-1-89f9f315ff609950647cb2558bfeff0d.png) Then select the domain for which you want to configure snowplow. ![](/assets/images/mailgun-2-c0145bb44af8cad166858be686c3b45b.png) Once you have the desired domain selected, you can configure snowplow for the type of events that interest you. Following is an example configuring the "Spam Complaints" event. The process is identical for all events. Click on the cross next to the event type you woudl like to configure. In this case "Spam Complaints": ![](/assets/images/mailgun-3-3ce9ff3a625c2511140b49e8a9bbb9b5.png) Then set the collector URL: ```markup http://<collector host>/com.mailgun/v1 ``` Finally click on **Set Webhook URL**. You can optionally click on **Test Webhook** to verify that your collector instance is reachable from Mailgun's servers. You may want to set the snowplow `aid=` parameter in your URL query string to the company for which this webhook has been configured; this is the `app_id` parameter taken from the [Snowplow Tracker Protocol](/docs/events/), however this is optional. The company is also contained in the received messages. If you want, you can also manually override the event's `platform` parameter by appending a query string to the end of the URL, in combination or separately with aid above. Here is what the final URL would look like for a configured aid and platform: ```markup http://<collector host>/com.mailgun/v1?aid=<company>&p=<platform code> ``` Supported platform codes can again be found in the [Snowplow Tracker Protocol](/docs/events/); if not set, then the value for `platform` will default to `srv` for a server-side application. --- # Mandrill webhook > Track Mandrill email events including sent messages, bounces, opens, clicks, spam reports, rejections, and unsubscribes. > Source: https://docs.snowplow.io/docs/sources/webhooks/mandrill/ This webhook integration lets you track a variety of events logged by [Mandrill](https://mandrill.com/). Available events are: - Message sent - Message hard bounced - Message soft bounced - Message opened - Message marked as spam - Message rejected - Message delayed - Message clicked - Recipient unsubscribed ## Compatibility - [Snowplow 0.9.14](https://github.com/snowplow/snowplow/releases/tag/0.9.14)+ (`POST`-capable collectors only) - [Mandrill webhook API](http://help.mandrill.com/entries/21738186-Introduction-to-Webhooks) ## Setup Integrating Mandrill's webhooks into Snowplow is a two-stage process: 1. Configure Mandrill to send events to Snowplow 2. (Optional) Create the Mandrill events tables into Amazon Redshift ## Mandrill First login to Mandrill and click on the **Settings** button which will be on the left hand side of the screen. Once in the Settings menu click on **Webhooks** and then click the **Add a Webhook** button. Now we can start to setup the webhook by selecting what events we want our Mandrill Webhook to trigger! - Please note: 'Rejection Whitelist Changes' & 'Rejection Blacklist Changes' are not supported by the Snowplow MandrillAdapter. ![](/assets/images/mandrill-1-cd468779fceea6dfa60e1b0368accad1.png) Once we have selected what events we want to record we need to fill in the **Post To URL** field: - For the this field you will need to provide the URI to your Snowplow Collector. We use a special path to tell Snowplow that these events are generated by Mandrill: ```markup http://<collector host>/com.mandrill/v1?aid=<company code> ``` The `aid=` name-value pair in your URI's querystring is optional; this is the `app_id` parameter taken from the [Snowplow Tracker Protocol](/docs/events/). You can use it to specify which company in Mandrill these call complete events belong to. Putting it all together, our setup screen now looks like this: ![](/assets/images/mandrill-2-723fe55962c02e12fae65adb12b1eeab.png) If you want, you can also manually override the event's `platform` parameter like so: ```markup http://<collector host>/com.mandrill/v1?aid=<company code>&p=<platform code> ``` Supported platform codes can again be found in the [Snowplow Tracker Protocol](/docs/events/); if not set, then the value for `platform` will default to `srv` for a server-side application. Once you click the **Create Webhook** button it will attempt to authenticate that the Collector actually exists and is ready to receive events. If everything is setup correctly it will return to the previous page and you will now see your new Webhook listed! That's it - with this table deployed, your Mandrill events should automatically flow through into Redshift. --- # SendGrid webhook > Track SendGrid email notification events including processed, delivered, bounced, opened, clicked, unsubscribed, and spam report events. > Source: https://docs.snowplow.io/docs/sources/webhooks/sendgrid/ This webhook integration lets you track a variety of events logged by [SendGrid](http://sendgrid.com/). Available events are: - Processed - Dropped - Delivered - Deferred - Bounce - Open - Click - Spam Report - Unsubscribe - Group Unsubscribe - Group Resubscribe ## Compatibility - The support of the latest version of Sendgrid webhook has been introduced in [Snowplow R114 Polonnaruwa](https://github.com/snowplow/snowplow/releases/tag/r114-polonnaruwa) - [SendGrid webhook API](https://docs.sendgrid.com/for-developers/tracking-events/event) ## Setup Integrating SendGrid's webhooks into Snowplow is a two-stage process: 1. Configure SendGrid to send events to Snowplow 2. (Optional) Create the SendGrid events tables for Amazon Redshift ## Configure SendGrid First login to SendGrid. Select **Settings** from the menu panel along the left-hand side of the screen. You should then navigate in the expanded list to the **Mail Settings** page. Select **Event Notification** from the list by clicking the row. Ensure it's switched **ON** in order to send events to Snowplow. Click **edit** on the top right-hand side of the **Event Notification** dropdown. For the **HTTP POST URL** field you will need to provide the URI to your Snowplow Collector. We use a special path to tell Snowplow that these events are generated by SendGrid: ```markup http://<collector host>/com.sendgrid/v3 ``` Our Webhooks setup page should look like this after we have added our **HTTP POST URL**: ![](/assets/images/sendgrid-388161440c37b156c61747fc25dda42b.png) If you want, you can also manually override the event's `platform` parameter by appending a query string to the end of the URL so: ```markup http://<collector host>/com.sendgrid/v3?p=<platform code> ``` Supported platform codes can again be found in the [Snowplow Tracker Protocol](/docs/events/); if not set, then the value for `platform` will default to `srv` for a server-side application. The other values you can set up manually in the similar fashion are `nuid`, `aid`, `cv`, `eid`, `ttm`, and `url`. Before we save our SendGrid webhook we can configure what types of events SendGrid will send to our webhook and what channels will trigger these events. Simply select the boxes that are applicable to you and SendGrid will send these events to our webhook. --- # Zendesk webhook > Track Zendesk ticket events and user context data including ticket creation, updates, and associated requester, assignee, and submitter information. > Source: https://docs.snowplow.io/docs/sources/webhooks/zendesk/ You can configure Zendesk to automatically send `POST` requests to a (Clojure or Scala) collector. The first step is to set up a Zendesk "extension" pointing at the collector. Log in to Zendesk. Click the cogwheel-shaped "Admin" icon located at the bottom-left corner of the Dashboard page to take you to the _Admin_ page. In the "SETTINGS" menu, click on "Extensions": ![](/assets/images/extensions-button-371bebc45257a1de8edc107adb4d5bef.png) Click "add target": ![](/assets/images/add-extension-d0f1cadc80c515fba9ba2f8780682330.png) Choose "HTTP target" from the list of target types to add: ![](/assets/images/http-target-ea24a5ba3fc11a7065b41456521f8496.png) Name the new extension something like "Snowplow Collector - Iglu POST". The "Iglu POST" here represents the fact we will be sending Zendesk events and contexts to [Iglu webhook adapter](/docs/sources/webhooks/iglu-webhook/) via `POST` request. In the **URL** field, enter `https://{{collector_domain}}/com.snowplowanalytics.iglu/v1?aid=zendesk`, replacing `{{collector_domain}}` with your collector domain. You can optionally have `?aid={{my_zendesk_namespace}}` added to this URL, where `{{my_zendesk_namespace}}` is a label for the application (here: "zendesk"). This label will be attached to all events fired by the extension, so you can later check where a given event came from (useful if you have more than one Zendesk account). Set the **Method** field to "POST" and the **Content type** to "JSON" from the drop-down lists. Select "Create Target" and click the _**Submit**_ button. ![](/assets/images/extension-form-4fc613af29cbde8a2876f6a6e249e8df.png) We have set up our collector as a Zendesk extension. We can now add a trigger which sends `POST` requests to the collector whenever certain events occur. ## 2. Setting up a trigger for Zendesk event ### Setting up trigger conditions From the _Admin_ page, select "Triggers" from the "BUSINESS RULES" menu and click "add trigger": ![](/assets/images/add-trigger-button-2b2cf08ed2b193b48b6cddd69ec05662.png) Name the trigger something like "Ticket created or updated" to reflect Zendesk data will be send on ticket creation and update events. Under "Meet ANY of the following conditions" header click _**Add condition**_ button to add 2 "Ticket: Is..." conditions and set them to "Created" and "Updated" respectively. ![](/assets/images/trigger-conditions-291f41055b3389ade6c909f47ed2bfba.png) ### Setting up body for ticket event In the "Actions" section, click on _**Add action**_ button and select "Notify target" and "Snowplow Collector - Iglu POST". In the _**JSON body**_ box, paste the following: ```json { "schema": "iglu:com.zendesk.snowplow/ticket_updated/jsonschema/1-0-0", "data": { "via": "{{ticket.via}}", "ticketType": {% if ticket.ticket_type.size > 0 %}"{{ticket.ticket_type}}"{% else %}null{% endif %}, "updatedAt": "{{ticket.updated_at_with_timestamp}}", "ticketId": {{ticket.id}}, "ticketTitle": "{{ticket.title}}", "priority": {% if ticket.priority.size > 0 %}"{{ticket.priority}}"{% else %}null{% endif %}, "inBusinessHours": {{ticket.in_business_hours}}, "createdAt": "{{ticket.created_at_with_timestamp}}", "account": "{{ticket.account}}", "brand": "{{ticket.brand.name}}", "url": "{{ticket.link}}", "externalId": {% if ticket.external_id.size > 0 %}"{{ticket.external_id}}"{% else %}null{% endif %}, "organizationName": "{{ticket.organization.name}}", "organizationId": {% if ticket.requester.organization.id %}{{ticket.requester.organization.id}}{% else %}null{% endif %}, "status": "{{ticket.status}}", "dueDate": {% if ticket.due_date_with_timestamp.size > 0 %}"{{ticket.due_date_with_timestamp}}"{% else %}null{% endif %}, "tags": {% if ticket.tags.size > 0 %}"{{ticket.tags}}"{% else %}null{% endif %}, "ccNames": "{{ticket.cc_names}}", "groupAssigned": "{{ticket.group.name}}", "latestCommentAuthorName": "{{ticket.latest_comment.author.name}}", "latestComment": "{{ticket.latest_comment.value}}", "latestCommentIsPublic": {{ticket.latest_comment.is_public}} } } ``` ![](/assets/images/json-body-e06e6da2d1ecf2abfa96172d7e17cbe5.png) _NOTE:_ Ignore the warning on the left-hand side of the _**JSON body**_ textbox. It is due to usage of [Liquid markup](https://shopify.github.io/liquid/) in JSON. ### Setting up user contexts #### Setting up body for ticket requester In the "Actions" section, select the 2nd "Notify target" and "Snowplow Collector - Iglu POST" extension. In the _**JSON body**_ box, paste the following: ```json { "schema": "iglu:com.zendesk.snowplow/user/jsonschema/1-0-0", "data": { "ticketId": {{ticket.id}}, "updatedAt": "{{ticket.updated_at_with_timestamp}}", "type": "requester", "firstName": {% if ticket.requester.first_name.size > 0 %}"{{ticket.requester.first_name}}"{% else %}null{% endif %}, "lastName": {% if ticket.requester.last_name.size > 0 %}"{{ticket.requester.last_name}}"{% else %}null{% endif %}, "language": {% if ticket.requester.language.size > 0 %}"{{ticket.requester.language}}"{% else %}null{% endif %}, "tags": {% if ticket.requester.tags.size > 0 %}"{{ticket.requester.tags}}"{% else %}null{% endif %}, "locale": {% if ticket.requester.locale.size > 0 %}"{{ticket.requester.locale}}"{% else %}null{% endif %}, "notes": {% if ticket.requester.notes.size > 0 %}"{{ticket.requester.notes}}"{% else %}null{% endif %}, "timeZone": {% if ticket.requester.time_zone.size > 0 %}"{{ticket.requester.time_zone}}"{% else %}null{% endif %}, "userId": {% if ticket.requester.id %}{{ticket.requester.id}}{% else %}null{% endif %}, "phone": {% if ticket.requester.phone.size > 0 %}"{{ticket.requester.phone}}"{% else %}null{% endif %}, "extendedRole": {% if ticket.requester.extended_role.size > 0 %}"{{ticket.requester.extended_role}}"{% else %}null{% endif %}, "role": {% if ticket.requester.role.size > 0 %}"{{ticket.requester.role}}"{% else %}null{% endif %}, "details": {% if ticket.requester.details.size > 0 %}"{{ticket.requester.details}}"{% else %}null{% endif %}, "signature": {% if ticket.requester.signature.size > 0 %}"{{ticket.requester.signature}}"{% else %}null{% endif %}, "organization": {% if ticket.requester.organization.size > 0 %}"{{ticket.requester.organization}}"{% else %}null{% endif %}, "externalId": {% if ticket.requester.external_id.size > 0 %}"{{ticket.requester.external_id}}"{% else %}null{% endif %}, "email": {% if ticket.requester.email.size > 0 %}"{{ticket.requester.email}}"{% else %}null{% endif %} } } ``` #### Setting up body for ticket assignee In the "Actions" section, select the 3nd "Notify target" and "Snowplow Collector - Iglu POST" extension. In the _**JSON body**_ box, paste the following: ```json { "schema": "iglu:com.zendesk.snowplow/user/jsonschema/1-0-0", "data": { "ticketId": {{ticket.id}}, "updatedAt": "{{ticket.updated_at_with_timestamp}}", "type": "assignee", "firstName": {% if ticket.assignee.first_name.size > 0 %}"{{ticket.assignee.first_name}}"{% else %}null{% endif %}, "lastName": {% if ticket.assignee.last_name.size > 0 %}"{{ticket.assignee.last_name}}"{% else %}null{% endif %}, "language": {% if ticket.assignee.language.size > 0 %}"{{ticket.assignee.language}}"{% else %}null{% endif %}, "tags": {% if ticket.assignee.tags.size > 0 %}"{{ticket.assignee.tags}}"{% else %}null{% endif %}, "locale": {% if ticket.assignee.locale.size > 0 %}"{{ticket.assignee.locale}}"{% else %}null{% endif %}, "notes": {% if ticket.assignee.notes.size > 0 %}"{{ticket.assignee.notes}}"{% else %}null{% endif %}, "timeZone": {% if ticket.assignee.time_zone.size > 0 %}"{{ticket.assignee.time_zone}}"{% else %}null{% endif %}, "userId": {% if ticket.assignee.id %}{{ticket.assignee.id}}{% else %}null{% endif %}, "phone": {% if ticket.assignee.phone.size > 0 %}"{{ticket.assignee.phone}}"{% else %}null{% endif %}, "extendedRole": {% if ticket.assignee.extended_role.size > 0 %}"{{ticket.assignee.extended_role}}"{% else %}null{% endif %}, "role": {% if ticket.assignee.role.size > 0 %}"{{ticket.assignee.role}}"{% else %}null{% endif %}, "details": {% if ticket.assignee.details.size > 0 %}"{{ticket.assignee.details}}"{% else %}null{% endif %}, "signature": {% if ticket.assignee.signature.size > 0 %}"{{ticket.assignee.signature}}"{% else %}null{% endif %}, "organization": {% if ticket.assignee.organization.size > 0 %}"{{ticket.assignee.organization}}"{% else %}null{% endif %}, "externalId": {% if ticket.assignee.external_id.size > 0 %}"{{ticket.assignee.external_id}}"{% else %}null{% endif %}, "email": {% if ticket.assignee.email.size > 0 %}"{{ticket.assignee.email}}"{% else %}null{% endif %} } } ``` #### Setting up body for ticket submitter In the "Actions" section, select the 4th "Notify target" and "Snowplow Collector - Iglu POST" extension. In the _**JSON body**_ box, paste the following: ```json { "schema": "iglu:com.zendesk.snowplow/user/jsonschema/1-0-0", "data": { "ticketId": {{ticket.id}}, "updatedAt": "{{ticket.updated_at_with_timestamp}}", "type": "submitter", "firstName": {% if ticket.submitter.first_name.size > 0 %}"{{ticket.submitter.first_name}}"{% else %}null{% endif %}, "lastName": {% if ticket.submitter.last_name.size > 0 %}"{{ticket.submitter.last_name}}"{% else %}null{% endif %}, "language": {% if ticket.submitter.language.size > 0 %}"{{ticket.submitter.language}}"{% else %}null{% endif %}, "tags": {% if ticket.submitter.tags.size > 0 %}"{{ticket.submitter.tags}}"{% else %}null{% endif %}, "locale": {% if ticket.submitter.locale.size > 0 %}"{{ticket.submitter.locale}}"{% else %}null{% endif %}, "notes": {% if ticket.submitter.notes.size > 0 %}"{{ticket.submitter.notes}}"{% else %}null{% endif %}, "timeZone": {% if ticket.submitter.time_zone.size > 0 %}"{{ticket.submitter.time_zone}}"{% else %}null{% endif %}, "userId": {% if ticket.submitter.id %}{{ticket.submitter.id}}{% else %}null{% endif %}, "phone": {% if ticket.submitter.phone.size > 0 %}"{{ticket.submitter.phone}}"{% else %}null{% endif %}, "extendedRole": {% if ticket.submitter.extended_role.size > 0 %}"{{ticket.submitter.extended_role}}"{% else %}null{% endif %}, "role": {% if ticket.submitter.role.size > 0 %}"{{ticket.submitter.role}}"{% else %}null{% endif %}, "details": {% if ticket.submitter.details.size > 0 %}"{{ticket.submitter.details}}"{% else %}null{% endif %}, "signature": {% if ticket.submitter.signature.size > 0 %}"{{ticket.submitter.signature}}"{% else %}null{% endif %}, "organization": {% if ticket.submitter.organization.size > 0 %}"{{ticket.submitter.organization}}"{% else %}null{% endif %}, "externalId": {% if ticket.submitter.external_id.size > 0 %}"{{ticket.submitter.external_id}}"{% else %}null{% endif %}, "email": {% if ticket.submitter.email.size > 0 %}"{{ticket.submitter.email}}"{% else %}null{% endif %} } } ``` #### Setting up body for current user In the "Actions" section, select the 5th (final) "Notify target" and "Snowplow Collector - Iglu POST" extention. In the _**JSON body**_ box, paste the following: ```json { "schema": "iglu:com.zendesk.snowplow/user/jsonschema/1-0-0", "data": { "ticketId": {{ticket.id}}, "updatedAt": "{{ticket.updated_at_with_timestamp}}", "type": "current_user", "firstName": {% if current_user.first_name.size > 0 %}"{{current_user.first_name}}"{% else %}null{% endif %}, "lastName": {% if current_user.last_name.size > 0 %}"{{current_user.last_name}}"{% else %}null{% endif %}, "language": {% if current_user.language.size > 0 %}"{{current_user.language}}"{% else %}null{% endif %}, "tags": {% if current_user.tags.size > 0 %}"{{current_user.tags}}"{% else %}null{% endif %}, "locale": {% if current_user.locale.size > 0 %}"{{current_user.locale}}"{% else %}null{% endif %}, "notes": {% if current_user.notes.size > 0 %}"{{current_user.notes}}"{% else %}null{% endif %}, "timeZone": {% if current_user.time_zone.size > 0 %}"{{current_user.time_zone}}"{% else %}null{% endif %}, "userId": {% if current_user.id %}{{current_user.id}}{% else %}null{% endif %}, "phone": {% if current_user.phone.size > 0 %}"{{current_user.phone}}"{% else %}null{% endif %}, "extendedRole": {% if current_user.extended_role.size > 0 %}"{{current_user.extended_role}}"{% else %}null{% endif %}, "role": {% if current_user.role.size > 0 %}"{{current_user.role}}"{% else %}null{% endif %}, "details": {% if current_user.details.size > 0 %}"{{current_user.details}}"{% else %}null{% endif %}, "signature": {% if current_user.signature.size > 0 %}"{{current_user.signature}}"{% else %}null{% endif %}, "organization": {% if current_user.organization.size > 0 %}"{{current_user.organization}}"{% else %}null{% endif %}, "externalId": {% if current_user.external_id.size > 0 %}"{{current_user.external_id}}"{% else %}null{% endif %}, "email": {% if current_user.email.size > 0 %}"{{current_user.email}}"{% else %}null{% endif %} } } ``` Submit the new trigger by clicking _**Create**_ button. It should look something like this: ![](/assets/images/submit-target-b44ec8e7ca6f1bf57767b3b2c853e35e.png) --- # WebView tracker (legacy) > Track events from web views in mobile hybrid apps by forwarding events to native iOS, Android, or React Native trackers with shared session data. > Source: https://docs.snowplow.io/docs/sources/webview-tracker/ The [Snowplow WebView Tracker](https://github.com/snowplow-incubator/snowplow-webview-tracker) allows you to track Snowplow events from web views in **mobile hybrid apps**. The current tracker version is 0.3.0. Hybrid apps are mobile apps that in addition to a native interface, provide part of the UI through an embedded web view. Snowplow events are tracked from both the native code (e.g. written in Swift or Kotlin) as well as the web view (in JavaScript). The goal is to have events that are tracked from the native code and web view share the same session, and appear as tracked with the same tracker. > **Tip:** We recommend using the [Snowplow web tracker](/docs/sources/web-trackers/) with [WebView plugin](/docs/sources/web-trackers/tracking-events/webview/) (which uses the WebView tracker as a dependency), rather than using this tracker directly. > > The WebView plugin automatically forwards all tracked events to the mobile tracker. Events must be manually tracked when using the WebView tracker by itself. The WebView tracker forwards tracked events to the native app code to be tracked by the Snowplow mobile trackers ([iOS, Android tracker](/docs/sources/mobile-trackers/hybrid-apps/), or [React Native](/docs/sources/react-native-tracker/hybrid-apps/)). The mobile trackers must be subscribed to the WebView messages. The mobile trackers will continue processing the events as if they were tracked from the native app code. For example, adding any configured context entities, and timestamps. The diagram below shows the interaction of the WebView and mobile trackers in hybrid apps. ```mermaid flowchart TB subgraph hybridApp[Hybrid Mobile App] subgraph webView[Web View] webViewCode[App logic] webViewTracker[Snowplow WebView tracker] webViewCode -- "Tracks events" --> webViewTracker end subgraph nativeCode[Native Code] nativeAppCode[App logic] nativeTracker[Snowplow iOS/Android/React Native tracker] nativeAppCode -- "Tracks events" --> nativeTracker end webViewTracker -- "Forwards events" --> nativeTracker end subgraph cloud[Cloud] collector[Snowplow Collector] end nativeTracker -- "Sends tracked events" --> collector ``` ## Installation You may choose to install the tracker as an npm package or by loading it through an HTML script tag. **Using npm:** To install the WebView tracker in your JavaScript or TypeScript app, add the npm package: ```bash npm install --save @snowplow/webview-tracker ``` You will then be able to use the functions provided by the WebView tracker as follows: ```typescript import { trackSelfDescribingEvent } from '@snowplow/webview-tracker'; ``` **Using Snowplow tag:** You may download the `sp.js` file from the [Releases section on GitHub](https://github.com/snowplow-incubator/snowplow-webview-tracker/releases), self-host it, and load to your page using the following tag: ```html <script type="text/javascript" async=1> ;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[]; p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments) };p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1; n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","{{URL to sp.js}}","snowplow")); </script> ``` > **Note:** Replace the `{{URL to sp.js}}` with the URL to the `sp.js` file in the snippet. *** In addition, you will need to implement the [iOS, Android](/docs/sources/mobile-trackers/hybrid-apps/), or [React Native](/docs/sources/react-native-tracker/hybrid-apps/) tracker and subscribe to WebView messages. ## Tracking events To track events, simply call their corresponding functions given the event data. The events will be processed based on the equivalent mobile event types after forwarding. The following functions are available: | Method | Event type tracked | | -------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | `trackSelfDescribingEvent` | Track a custom event based on "self-describing" JSON schema | | `trackStructEvent` | Track a semi-custom structured event | | `trackScreenView` | Track a view of a screen in the app | | `trackPageView` | Track a Web page visit | | `trackWebViewEvent` | Track any Snowplow event (used internally by the [WebView plugin](/docs/sources/web-trackers/tracking-events/webview/)) | All the methods share common features and parameters. Every type of event can have optional entities added. ### Self-describing (custom) Use the `trackSelfDescribingEvent` function to track a [fully custom event](/docs/fundamentals/events/). | Argument | Description | Required? | | --------- | ---------------------------------------------------------- | --------- | | `event` | Self-describing event, with `schema` and `data` properties | Yes | | `context` | List of context entities as self-describing JSON | No | **Installed using npm:** ```javascript trackSelfDescribingEvent({ event: { schema: 'iglu:com.snowplowanalytics.snowplow/link_click/jsonschema/1-0-1', data: { targetUrl: 'http://a-target-url.com' } } }); ``` **Installed using Snowplow tag:** ```javascript window.snowplow('trackSelfDescribingEvent', { event: { schema: 'iglu:com.snowplowanalytics.snowplow/link_click/jsonschema/1-0-1', data: { targetUrl: 'http://a-target-url.com' } } }); ``` *** ### Screen views Use `trackScreenView` to track a user viewing a screen (or similar) within your app. This is the page view equivalent for apps that are not web pages. | Argument | Description | Required? | | ---------------- | ---------------------------------------------------------- | --------- | | `name` | The human-readable name of the screen viewed | Yes | | `id` | The id (UUID v4) of screen that was viewed | Yes | | `type` | The type of screen that was viewed | No | | `previousName` | The name of the previous screen that was viewed | No | | `previousType` | The type of screen that was viewed | No | | `previousId` | The id (UUID v4) of the previous screen that was viewed | No | | `transitionType` | The type of transition that led to the screen being viewed | No | | `context` | List of context entities as self-describing JSON | No | **Installed using npm:** ```javascript trackScreenView({ id: '2c295365-eae9-4243-a3ee-5c4b7baccc8f', name: 'home', type: 'full', transitionType: 'none' }); ``` **Installed using Snowplow tag:** ```javascript window.snowplow('trackScreenView', { id: '2c295365-eae9-4243-a3ee-5c4b7baccc8f', name: 'home', type: 'full', transitionType: 'none' }); ``` *** ### Page views The `PageViewEvent` may be used to track page views on the web. The event is designed to track web page views and automatically captures page title, referrer, and URL. | Argument | Description | Required? | | --------- | ------------------------------------------------ | --------- | | `title` | Override the page title | No | | `context` | List of context entities as self-describing JSON | No | **Installed using npm:** ```javascript trackPageView(); ``` **Installed using Snowplow tag:** ```javascript window.snowplow('trackPageView', {}); ``` *** ### Structured Track a semi-custom Structured event. | Argument | Description | Required? | | ---------- | -------------------------------------------------------------- | --------- | | `category` | The grouping of structured events which this action belongs to | Yes | | `action` | Defines the type of user interaction which this event involves | Yes | | `label` | Often used to refer to the "object" the action is performed on | No | | `property` | Describing the "object", or the action performed on it | No | | `value` | Provides numerical data about the event | No | | `context` | List of context entities as self-describing JSON | No | **Installed using npm:** ```javascript trackStructEvent({ category: 'shop', action: 'add-to-basket', label: 'Add To Basket', property: 'pcs', value: 2.00, }); ``` **Installed using Snowplow tag:** ```javascript window.snowplow('trackStructEvent', { category: 'shop', action: 'add-to-basket', label: 'Add To Basket', property: 'pcs', value: 2.00, }); ``` *** ### Web view > **Info:** This method is available since v0.3.0. This method is used internally by the WebView plugin. We recommend implementing the web tracker with WebView plugin rather than using this directly. Use this method to track any kind of Snowplow event e.g. a page ping. You will need to define the event name yourself, e.g. "pp" for page ping. It also allows you to set a tracker version, to help distinguish between native and WebView events (e.g. "webview-0.3.0" while the native tracker version might be something like "ios-6.1.0"). | Argument | Description | Required? | | ------------ | ----------------------------------------------------------------------------------------- | --------- | | `properties` | Event properties that are ["baked-in"](/docs/fundamentals/canonical-event/#common-fields) | Yes | | `event` | An optional self-describing event, with `schema` and `data` properties | No | | `context` | List of context entities as self-describing JSON | No | | Event type | `eventName` | | -------------------------- | -------------- | | Page view | `pv` | | Page ping | `pp` | | Structured | `se` | | Ecommerce transaction | `tr` | | Ecommerce transaction item | `ti` | | All other events | `ue` (default) | **Installed using npm:** ```javascript trackWebViewEvent({ properties: { eventName: 'pp', trackerVersion: 'webview-0.3.0', url: 'https://test.com/test', referrer: 'https://test.com', }, }); ``` **Installed using Snowplow tag:** ```javascript window.snowplow('trackWebViewEvent', { properties: { eventName: 'pp', trackerVersion: 'webview-0.3.0', url: 'https://test.com/test', referrer: 'https://test.com', }, }); ``` *** ## Adding context entities You can add a list of entities to any event. In this example, two entities are added. **Installed using npm:** ```javascript trackScreenView({ id: '2c295365-eae9-4243-a3ee-5c4b7baccc8f', context: [ { schema: 'iglu:com.my_company/movie_poster/jsonschema/1-0-0', data: { movie_name: 'Solaris', poster_country: 'JP', poster_date: '1978-01-01', }, }, { schema: 'iglu:com.my_company/customer/jsonschema/1-0-0', data: { p_buy: 0.23, segment: 'young_adult', }, }, ], }); ``` **Installed using Snowplow tag:** ```javascript window.snowplow('trackScreenView', { id: '2c295365-eae9-4243-a3ee-5c4b7baccc8f', context: [ { schema: 'iglu:com.my_company/movie_poster/jsonschema/1-0-0', data: { movie_name: 'Solaris', poster_country: 'JP', poster_date: '1978-01-01', }, }, { schema: 'iglu:com.my_company/customer/jsonschema/1-0-0', data: { p_buy: 0.23, segment: 'young_adult', }, }, ], }); ``` *** ## Specifying a tracker namespace You can specify tracker namespaces for the event. If not specified, the default tracker will be used. All of the `trackXYZ()` methods accept two arguments: | Argument | Description | Required? | | ---------- | ------------------------------------------------------------------------------------------- | --------- | | `event` | Event body, depends on the event being tracked | Yes | | `trackers` | Optional list of tracker namespaces to track the event with (undefined for default tracker) | No | For instance, the following tracks a page view event using a tracker initialized with the namespace `ns1`: **Installed using npm:** ```javascript trackPageView({}, ['ns1']); ``` **Installed using Snowplow tag:** ```javascript window.snowplow('trackPageView', {}, ['ns1']); ``` *** > **Warning:** If you specify namespaces in the configuration, and no mobile trackers actually exist with those namespaces, the event will be lost. --- # Verify schema dependencies with the Data Structures CI tool > Integrate Data Structures CI into your deployment pipelines to verify schema dependencies are deployed before code deployment. Use as a GitHub Action or with other CI/CD tools. > Source: https://docs.snowplow.io/docs/testing/data-structures-ci-tool/ The Data Structures CI is a command-line tool which integrates Data Structures API into your CI/CD pipelines and currently has one task which verifies that all schema dependencies for a project are already deployed into a specified environment (e.g. "DEV", "PROD"). This is available as a [Github Action](#setting-up-as-a-github-action) and as a [universal install for other deployment pipelines](#setting-up-for-other-deployment-pipelines) e.g. Travis CI, CircleCI, Gitlab, Azure Pipelines, Jenkins… ## Authorization In order to be able to perform tasks with the tool, you will need to supply both your Organization ID and an API key. You can find your Organization ID [on the _Manage organization_ page](https://console.snowplowanalytics.com/settings) in Console. An API Key can be created [here](https://console.snowplowanalytics.com/credentials). ## Create your manifest file This command allows you to verify that all schema dependencies for a project (declared in a specific "manifest") are already deployed into an environment (e.g. "DEV", "PROD"). In your application project, create a JSON file for your manifest that will store references to the schema dependencies you have for your project. During a CI build this file will be parsed, validated and used by Data Structures CI to check that each schema is correctly deployed to the appropriate environment before the code for the application gets deployed, effectively guarding against the 'Schema not found' type of [failed events](/docs/fundamentals/failed-events/). Here is an example manifest file where our application has dependencies on three schemas: - `checkout_process` version `1-0-7` - `user` version `1-0-1` - `product` version `2-0-0` ```json { "schema": "iglu:com.snowplowanalytics.insights/data_structures_dependencies/jsonschema/1-0-0", "data": { "schemas": [ { "vendor": "com.acme.marketing", "name": "checkout_process", "format": "jsonschema", "version": "1-0-7" }, { "vendor": "com.acme", "name": "user", "format": "jsonschema", "version": "1-0-1" }, { "vendor": "com.acme", "name": "product", "format": "jsonschema", "version": "2-0-0" } ] } } ``` The manifest must adhere to this [self-describing JSON Schema](http://iglucentral.com/schemas/com.snowplowanalytics.insights/data_structures_dependencies/jsonschema/1-0-0). ## Setting up as a Github Action To use the Github Action simply add this snippet as a step on your existing GitHub Actions pipeline, replacing the relevant variables: ```yaml name: Example workflow using Snowplow's Data Structures CI on: push jobs: data-structures-check: runs-on: ubuntu-latest steps: - uses: actions/checkout@master - name: Run Snowplow's Data Structures CI uses: snowplow-product/msc-schema-ci-action/check@v1 with: organization-id: ${{ secrets.SNOWPLOW_ORG_ID }} api-key: ${{ secrets.SNOWPLOW_API_KEY }} manifest-path: 'snowplow-schemas.json' environment: ${{ env.ENVIRONMENT }} ``` View the [Github Action repository](https://github.com/snowplow-product/msc-schema-ci-action). ## Setting up for other deployment pipelines ### Prerequisites - JRE 8 or above ### Download the CI tool You can download Data Structures CI from our Bintray repository, using the following command: ```bash $ curl -L https://github.com/snowplow-product/msc-schema-ci-tool/releases/download/1.0.0/data_structures_ci_1.0.0.zip | jar xv && chmod +x ./data-structures-ci ``` ### Run the task You can run the task using the following syntax: ```bash $ export ORGANIZATION_ID=<organization-id> $ export API_KEY=<api-key> $ ./data-structures-ci check \ --manifestPath /path/to/snowplow-schemas.json \ --environment DEV ``` View the repository for [integration examples](https://github.com/snowplow-product/msc-schema-api-examples/). --- # Test your Snowplow tracking and pipeline implementation > Test and validate your Snowplow tracking implementation using Snowplow Inspector or Micro. Run automated tests and QA your pipeline before deploying to production environments. > Source: https://docs.snowplow.io/docs/testing/ There are a number of ways you can test your tracking implementation and QA your pipeline to follow good data practices. When implementing new tracking, or when making changes to your schemas or enrichments, we recommend you run testing by sending events to a test environment before deploying your changes to production environments. You can use the [Snowplow Inspector](/docs/testing/snowplow-inspector/) or [Snowplow Micro](/docs/testing/snowplow-micro/) to do this. ## Test web tracking using the browser extension Use the [Snowplow Inspector](/docs/testing/snowplow-inspector/) browser extension to validate your web tracking implementation. ## Test schema and enrichment changes with Snowplow Micro [Snowplow Micro](/docs/testing/snowplow-micro/) is a lightweight version of a Snowplow pipeline that you can use to validate that your schemas, tracking code and enrichments work as expected. Micro can be deployed [through Console](/docs/testing/snowplow-micro/console/) or [locally](/docs/testing/snowplow-micro/local/) using Docker or Java. Micro can also be used for [automated testing](/docs/testing/snowplow-micro/automated-testing/) in a CI/CD environment. --- # Add custom schemas to Snowplow Inspector > Configure Snowplow Inspector to access custom schemas from local registries, Data Structures API, Iglu Server, or static repositories. Validate events against private schemas by logging into Console or manually adding registries. > Source: https://docs.snowplow.io/docs/testing/snowplow-inspector/adding-schemas/ By default, the extension is able to see [Iglu schemas](/docs/fundamentals/schemas/) for events and entities generated by the standard Snowplow [tracker](/docs/sources/) and plugin APIs, as they're publicly available on the public [Iglu Central](/docs/api-reference/iglu/iglu-repositories/iglu-central/) registry. The Iglu Central registry is a default configuration in all Snowplow Pipelines as a common base, so events validated in the Inspector against these schemas should also validate correctly in your pipeline. For custom schemas that aren't publicly available in Iglu Central, you need to configure the extension to access your registries so it can access the schemas to perform validation. If you don't log in or configure registries, the extension won't recognize or be able to validate any custom events/entities, though they will still display. ![The Snowplow Inspector extension displaying a Self Describing Event called \"example\_event\". It contains a property \"example\_field\_1\" with the value \"abc\". The event\'s schema displays as Unrecognized in any registries.](/assets/images/unrecognized-event-41abdf23415d71f1454e56092ae757df.png) Snowplow customers with access to the Snowplow Console can log in using the button in the bottom-left of the extension, which will automatically discover schemas defined for any [data structures](/docs/event-studio/data-structures/) set up in your development or production Iglu Server environments. When installed, the extension also creates a local Iglu registry, which lets you iterate on schemas without having to deploy them to a external registry. You can create any number of additional local registries to help organize your work. You can also configure the extension to access other private schema repositories (such as [Iglu Servers](/docs/api-reference/iglu/iglu-repositories/iglu-server/) or [static repositories](/docs/api-reference/iglu/iglu-repositories/static-repo/)). ## Manage schemas and registries To access the full list of registries configured and find which schemas are accessible, choose **...** > **Manage Schemas**. ![Screenshot of the Snowplow Inspector\'s \"More Options\" menu open and the \"Manage Schemas\" option selected.](/assets/images/access-manage-schemas-4bb3ef87b3c56471b369b0e932587600.png) The extension will connect to your registries and request a listing of all available schemas. ![Screenshot of the Snowplow Inspector\'s \"Manage Schemas\" page with the default Local and Iglu Central registries configured.](/assets/images/manage-schemas-5898f2f1952b3c1c724c23de0ae11093.png) At the top is a search for filtering the shown schemas, controls for managing registries and local schemas, and then a list of your configured registries. Underneath is a tree listing of all schemas configured, grouped by vendor, name, format, and version. Each version has a list of the registries that contained it. Clicking a registry will toggle selecting the registry. The schema list below will filter to only show schemas within selected registries. If there are no registries selected, schemas from all registries will display. Any search terms will also only search schemas from the selected registries. Using the **Registries...** button you can: - Add more custom registries - Import a list of registries from an existing [Iglu Resolver configuration](/docs/api-reference/iglu/iglu-resolver/) If you have some registries selected, this button will also allow you to: - Edit existing registries you've previously configured - Remove configured registries so they aren't accessible to the extension any more Selecting a local registry will activate the **Schemas...** button and allow you to manage the schemas it contains. You can clear the search and registry selections to see all schemas again using the **Clear Filters** button. ## Add custom registries You can add or edit different types of registry: - Local - Data structures API - Iglu Server - Static Configuring a registry requires a name and one of these types. You may also optionally include a priority, and list of vendor prefixes. Each type of registry may require additional configuration like API keys to function. Once configured, schemas from the registry load in the **Manage Schemas** view, and events using those schemas get validated against them. Once the extension can query your private Iglu repositories it will be able to tell you in real time whether your events and entities are failing validation from entirely within the browser. Here's an example of an event passing validation: ![The Snowplow Inspector extension displaying a Self Describing Event called \"example\_event\". The event payload contains a \"example\_field\_1\" property with the value \"abc\". The extension says the event is valid against a schema from the Local Registry, signalled by a checkmark.](/assets/images/event-passing-validation-9f31ef8956d51292f512f251bb9965c3.png) Here's an example of an event failing validation. The failure reason is that the schema defines `example_field_1` as "string" type, meaning it can't be null. ![The Snowplow Inspector extension displaying a Self Describing Event called \"example\_event\". The event payload contains a \"example\_field\_1\" property with a null value. The extension says the event is invalid against a schema from the Local Registry, signalled by a red cross.](/assets/images/event-validation-failed-8495885c08522e78ffee0b9457d09938.png) Hovering over the registry displays text describing which aspect of the payload failed the validation, which is also available in the **Errors** tab. Find the **Errors** tab next to the **Data**, **JSON**, and **Schema** tabs within the event details pane. Clicking the registry copies this error to your clipboard. ### Local Local registries are often used for testing changes of schemas still in development, or quickly including a schema when you don't have the full access required to configure a registry. The extension UI lets you manage the schemas, which it stores on your local machine. The only required configuration is a name, and the schemas themselves. You manage the schemas by selecting the registry and using the **Schemas...** button. You can manually edit the JSON in the extension, or bulk import JSON files from a directory on your machine. ### Data structures API Data structures API registries use the [data structures API](/docs/event-studio/programmatic-management/data-structures-api/) included in Console. Most users can just log in to [Console](https://console.snowplowanalytics.com) via the extension and it will import registries for any Snowplow Console organizations your account has access to. If you need to provide access for someone that doesn't have a Console account, they can manually configure one using a Console API key. In order to function, the extension requires: - Organization ID: find this under **Settings** > **Manage organization** in Console - API Key ID: go to **Settings** > **Manage organization** in Console to manage your API keys - API Key: go to **Settings** > **Manage organization** in Console to manage your API keys ### Iglu Server [Iglu Server](/docs/api-reference/iglu/iglu-repositories/iglu-server/) is a more full-featured dedicated service for hosting server that's more flexible than static registries. You may need to add one of these registries if you are accessing the registry built into [Snowplow Mini](/docs/api-reference/snowplow-mini/). To authenticate with your server, the extension will require: - Iglu API endpoint: this is the base URL the extension will use when contacting the API. If you include a path component, the API request will be `api/*` relative to this path; you may need to add or remove trailing slashes if the API isn't hosted at the root. - Iglu API key: see [API keys and the authentication service](/docs/api-reference/iglu/iglu-repositories/iglu-server/#5-api-keys-and-the-authentication-service-apiauth) for instructions on generating an Iglu API key. When logged into Console, these details should be available in **Manage organization** > **API keys for utilities**. ### Static Most other registries, such as those hosted as websites or via [S3](https://aws.amazon.com/s3/) or [GCS](https://cloud.google.com/products/storage/) buckets will use this [static](/docs/api-reference/iglu/iglu-repositories/static-repo/) type. If you can access the schemas directly via a browser, this is probably the right choice. Static registries managed with [`igluctl`](/docs/api-reference/iglu/igluctl-2/) version 8 or newer will include a manifest file listing all the schemas in a registry, e.g. [the Iglu Central manifest](https://iglucentral.com/schemas). If the registry doesn't include a manifest listing the contained schemas, the extension will attempt to request all referenced schemas from static registries to see if it's included. In this case the registries will also not have their schemas listed in **Manage Schemas** - unless they're referenced by tracked events since you last opened the extension. To connect to your static registry, the extension will need a **base URI** to use when requesting schemas. The extension will request schemas relative to this base, expecting a `schemas/{vendor}/{name}/{format/{version}` format. Optionally you can also provide a **manifest URI** to use for the schema manifest file that lists the schemas contained in the registry. If not provided, defaults to `schemas`, to match the `igluctl` default. The manifest file is also known as a "schema list", as described in the [release notes](https://github.com/snowplow/igluctl/releases/tag/0.8.0). --- # Import events into Snowplow Inspector from multiple sources > Import events into Inspector from HAR files, failed events, ElasticSearch, ngrok tunnels, or remote debugging sessions. View and debug events from mobile devices and other sources as if generated locally. > Source: https://docs.snowplow.io/docs/testing/snowplow-inspector/importing-events/ Within the Snowplow Inspector, the main **Events** view allows you to import events from other devices that you can view, as if your own browser generated them. Import events via the **Import Events** icon button at the top of the events list. The extension supports several file types. ## Importing from HAR files [HAR](https://en.wikipedia.org/wiki/HAR_$file_format$) files are JSON representations of HTTP sessions. They include a list of requests and responses made to servers by a single client, including all the metadata such as headers. You can export the current contents of the **Network** panel in your DevTools to a HAR file, and then later load them into the extension as a record that events fired and successfully validated, or failed to validate. This can be useful for troubleshooting, or as a record of the QA process. Aside from the browser, other tools like [Charles Proxy](https://www.charlesproxy.com/) or [Fiddler](https://www.telerik.com/fiddler), commonly used as proxies for verifying analytics from mobile applications, can also export to this format. ## Importing failed events You can import historical events that have already failed the enrichment process into the extension to allow you to easily find the errors with the events. The extension will accept an uncompressed file, which you can paste a selection from, or in total straight from your clipboard. ## Importing events from ElasticSearch / OpenSearch If you use [ElasticSearch](https://www.elastic.co/) / [OpenSearch](https://opensearch.org/) as a destination for your events (or as used in [Snowplow Mini](/docs/api-reference/snowplow-mini/)) you can specify a query to use and the extension will load events as they're indexed. This can be useful for testing many devices at once, e.g. multiple mobile devices that are all sending events to your Snowplow Mini instance. ## Importing events from an ngrok tunnel [ngrok](https://ngrok.com/) is a service for creating ad-hoc network endpoints that can accept / tunnel requests and offers an API for other services to introspect and act on the requests that it received. Using ngrok, you can create an endpoint, use that endpoint as a Collector destination for your tracking, and then examine any events sent to it via the extension. When you attempt to import from ngrok, the extension will attempt to connect to the ngrok tool running on your local machine on port 4040 (`localhost:4040`) to access the [ngrok Agent API](https://ngrok.com/docs/ngrok-agent/api/). If successful, any Snowplow events sent to the corresponding tunnel endpoint will appear in the extension. ## Remote debugging with Chrome DevTools You can [remotely inspect](https://developer.chrome.com/docs/devtools/remote-debugging/) some non-web applications using the [Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/). This includes some mobile browsers, some mobile applications, WebViews on Android, [Chromium Embedded Framework](https://github.com/chromiumembedded/cef) applications, and more. You may need to access the remote DevTools via the URL `chrome://inspect/#devices` to access DevTools with Snowplow Inspector and other browser extensions active. Network requests made on the remote device should appear in the extension as usual. --- # Debug web tracking with the Snowplow Inspector browser extension > Validate and debug Snowplow web tracking using the Inspector browser extension. View events, attributes, and interventions in real-time with schema validation support for Chrome, Edge, and Firefox. > Source: https://docs.snowplow.io/docs/testing/snowplow-inspector/ Snowplow recommends using the Snowplow Inspector browser extension for validating your tracking code. To install in Chrome, Edge and other Chromium-based browsers, find [Snowplow Inspector on the chrome web store](https://chromewebstore.google.com/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm). > **Note:** Firefox users can install the extension manually using an `.xpi` file from the [GitHub Releases page](https://github.com/snowplow/chrome-snowplow-inspector/releases). > > The Snowplow Inspector is not supported on Safari at this time. The code for the Snowplow Inspector is available on [GitHub](https://github.com/snowplow/chrome-snowplow-inspector). Once you add the extension to your browser, you can view it by [opening Developer Tools](https://developer.chrome.com/docs/devtools/open/), where it has its own tab named **Snowplow** (look for the tab adjacent to **Console**, **Network**, etc. It may be necessary to expand the list of tabs using the `»` button). > **Tip:** Common keyboard shortcuts to open Developer Tools include: > > - PC: `Ctrl`+`Shift`+`I` > - Mac: `Cmd`+`Option`+`I` ## How it works As you browse a webpage you can perform different actions that you can track as Snowplow events. These events get sent as HTTP requests to the Snowplow Collector that's configured for the tracker on each webpage. The Snowplow Inspector extension observes and parses these HTTP requests for display in a more easily readable format. This allows anyone with the extension to more easily see which actions on a webpage trigger Snowplow events to send to a Snowplow pipeline. You can debug standard Snowplow events, and [Signals attributes and interventions](/docs/testing/snowplow-inspector/signals-integration/) using the Inspector. Choose which data type you want to look at using the vertical tabs on the left-hand side of the extension: **Events**, **Attributes**, and **Interventions**. The **Events** tab shows the usual event debugging functionality. The inspector shows a list of all the events received in the order they were fired, with more recent events at the top. Click on each event to see its details. In the example below, events fired as HTTP POST requests from the Snowplow homepage: - An automatic [self-describing](/docs/fundamentals/events/#self-describing-events) [`web_vitals`](/docs/events/ootb-data/app-performance/#web-vitals) event. - A [`page_view`](/docs/fundamentals/canonical-event/#page-views) event. - A second `page_view` event from an embedded iframe on the page. It has a blue dot because its App ID and Collector combination are distinct from the other events, which have red dots because they share an App ID/Collector. - A self-describing [`link_click`](/docs/sources/web-trackers/tracking-events/link-click/) event. - Two [`page_ping`](/docs/events/ootb-data/page-activity-tracking/#page-engagement) events. ![A screenshot of the Snowplow Inspector extension listing several events observed on the current pageview. A selected page-view event displays, detailing the properties collected as a part of that event.](/assets/images/using-poplin-chrome-extension-d7c9abdb27e54813634c7b5985a54cf1.png) Within the parsed HTTP request bodies that contain event payloads are a variety of [Snowplow canonical event](/docs/fundamentals/canonical-event/) fields, such as unique `event_id`, timestamps, user and session identifiers, as well as any custom [event](/docs/fundamentals/events/#self-describing-events) or [entity](/docs/fundamentals/entities/) fields. ## Try it out Once you've installed the extension, you can try it out immediately by visiting a webpage with Snowplow tracking. For example, the [Snowplow website](https://snowplow.io/), or this documentation site. You don't need a Snowplow account to use the extension. ## Use in event QA The Snowplow Inspector extension is useful for debugging any web tracking implementation. For questions such as "why is X event not appearing in the data warehouse", the extension allows you to see if the event is actually firing. Additionally, you can configure the extension to show whether or not an event has passed validation according to any event validation rules codified in the corresponding [schema](/docs/fundamentals/schemas/). For events that failed validation in production historically that you are unable to replicate in your own browser, see the guides on [how to query failed events](/docs/monitoring/exploring-failed-events/file-storage/). These failed events have a [specific format](/docs/fundamentals/failed-events/) that includes an array of helpful, detailed error messages that explain the exact reasons why the event failed validation. You can also [import bad events](/docs/testing/snowplow-inspector/importing-events/#importing-failed-events) into the extension to view as if your browser had generated them itself. --- # Debug Signals attributes and interventions in Snowplow Inspector > Connect Snowplow Inspector to Signals to validate attribute and intervention definitions. View real-time attribute values and monitor triggered interventions using the browser extension. > Source: https://docs.snowplow.io/docs/testing/snowplow-inspector/signals-integration/ The Snowplow Inspector optionally integrates with [Snowplow Signals](/docs/signals/introduction/) to help validate your [attribute](/docs/signals/attributes/attributes/) and [intervention](/docs/signals/concepts/#interventions) definitions. There are two ways to connect Inspector to Signals: - Automatic: log into [Snowplow Console](https://console.snowplowanalytics.com) via the extension. The extension will detect Signals instances for any organizations you have access to. For full functionality you will need to add API credentials for each organization in the extension options. - Signals Sandbox: if you're trialing Signals using the [Signals Sandbox](/tutorials/signals-interventions/start/#signals-sandbox), you can enter your Profiles API and credentials information in the extension options. To access the extension options, use your browser toolbar to find the extensions menu, and access the **Options** for the Snowplow Inspector extension. ![Screenshot of the Chromium browser with the extension menu open. The installed Snowplow Inspector has its Options menu item selected.](/assets/images/extension-options-28e909a5bd99292f37454eb2683846fe.png) After connecting to Signals, the Inspector will use the Signals API to discover any defined [attribute keys](/docs/signals/concepts/#attribute-keys), and the definitions for your attribute groups. It uses these definitions with any events it has observed in the debugger to build a set of attribute keys. By default Signals includes `domain_userid` as an attribute key. Once the extension is aware of this, it will look at any events it observes, and build a list of unique `domain_userid` values it finds. It then uses this list of attribute keys for requesting [attributes](#attributes) and subscribing to [interventions](#interventions). ## Signals functionality You can access Signals features by switching the vertical tabs down the left-hand side of the extension. The three tabs are: **Events**, **Attributes**, and **Interventions**. **Events** is the usual event debugging functionality, and the other two interact with the Signals APIs. ### Attributes When you switch to the **Attributes** page in the extension, it will combine the attribute group definitions from the Signals API with the observed attribute keys, and then request the current profile values for each attribute key. ![Screenshot of the Snowplow Inspector viewing attribute values from a \"demo\_agentic\" attribute group. The values are for a specific \"domain\_userid\" attribute key. The tabs show 20 events counted as well as 1 intervention.](/assets/images/signals-attributes-450c951be5e6fbab765dc2f12c16d495.png) While you stay on the **Attributes** page, any new events that occur in the background (and increase the event counter on the **Events** tab) will automatically trigger a refresh. If there are no new events after 5 seconds, it will also do a final refresh, in case there was any latency in processing events and updating the values. Switching to the **Events** or **Interventions** tab and then back to the **Attributes** tab will initially show the most recently fetched values, while requesting any new updates in the background. If the attribute values have still not updated or look out of date, you can use the refresh button at the top of the screen to manually trigger a refresh. > **Note:** Since Signals Sandbox requests are free, attribute values get requested much more frequently than described here, to update as quickly as possible. All attribute groups from all Signals instances get fetched if their configured attribute key has any values extracted from events. If you have access to many organizations, this may slow down the retrieval process. You can filter by environment, organization, or attribute source type to limit what's requested and displayed. The controls at the top allow you to filter or search to filter what's displayed. Each attribute group listing includes its organization, environment, version, and source in the top-right for reference. > **Note:** If you only have access to one organization, organization filters don't display. Similarly, if there is only a single version of an attribute group, there will be no option to change it. > > If configured, Signals Sandbox counts as its own organization. By default, the Inspector only fetches the highest version of each attribute group. If an attribute group has more than one version available, you can switch to an older version you're interested in to request that instead. You can click a displayed attribute row to show the JSON definition for that attribute. Click the same row again to hide the definition. ### Interventions As the extension builds its list of attribute keys based on observed events, it will automatically start intervention subscriptions for each attribute key found. If an intervention triggers on a subscription, the interventions counter will increase and you can switch to this tab to view the intervention content of each observed intervention. Clicking an intervention from the list of received interventions on the will display the contents, including the targeted attribute key and any associated attributes at the time of the intervention. For rule-based interventions, you can also expand the definition of the intervention to see the logic used to trigger it. ![Screenshot of the Snowplow Inspector viewing interventions. The most recent of two triggered interventions is displayed.](/assets/images/signals-attributes-450c951be5e6fbab765dc2f12c16d495.png) --- # Set up automated testing with Snowplow Micro > Integrate Snowplow Micro with automated testing frameworks like Nightwatch and Cypress. Build end-to-end GitHub Actions workflows to validate tracking implementations with custom commands and assertions. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/automated-testing/ The basic approach for using Snowplow Micro in automated testing is this: - Run it alongside your tests, either via Docker or Java - In each test, send some events then validate the results using the [Micro REST API](/docs/api-reference/snowplow-micro/api/) > **Tip:** The instructions for running and configuring Micro in CI/CD are identical to the ones for [running locally](/docs/testing/snowplow-micro/local/). The [snowplow-micro-examples](https://github.com/snowplow-incubator/snowplow-micro-examples) repository shows how to set up automated tests using Nightwatch and Cypress as examples of test frameworks and how to build end-to-end GitHub Actions testing workflows. ## Local setup ### Prerequisites We recommend setting up the following tools before starting: - Git - Docker and Docker-compose - Npm ### Clone the repo, start Snowplow Micro and serve the app ```bash git clone https://github.com/snowplow-incubator/snowplow-micro-examples.git cd snowplow-micro-examples docker-compose up ``` This will: 1. Start serving the app on `localhost:8000` 2. Launch Snowplow Micro, mounting the `micro` directory and setting the port 9090 for accessing Micro's [REST API](/docs/api-reference/snowplow-micro/api/) endpoints. Inside the `micro` directory are: 1. The [configuration for Snowplow Micro](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/micro/micro.conf). 2. The [configuration for Iglu resolvers](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/micro/iglu.json). 3. The `iglu-client-embedded` directory containing the custom schemas used for tracking. ### Install npm dependencies ```bash npm install ``` This step will install Nightwatch and Cypress. ### Run the tests Our demo web app uses snowplow to track user activity. We will be testing that tracking is properly configured on our site. To run all tests: ```bash npm test ``` For running only [Nightwatch](https://nightwatchjs.org/) tests: ```bash npm run test:nightwatch ``` For running only [Cypress](https://www.cypress.io/) tests: ```bash # From the command line npm run cypress:run # To launch the Test Runner npm run cypress:open ``` ## Github Actions Inside the `.github/workflows/` directory you can find the `.yml` files we use to test this exaple app with Micro and Nightwatch/Cypress. A general workflow file would definitely use the [Snowplow Micro](https://github.com/snowplow/snowplow-micro) step, which for our example is: ```yaml - name: Start Micro run: docker-compose up -d working-directory: snowplow-micro-examples ``` In order to use it, just make sure that: 1. Your `working-directory` contains the `micro/` directory that Micro will mount, containing a `micro.conf` and a `iglu.json` configuration for Micro and Iglu respectively. 2. Your `docker-compose.yml` file is adjusted accordingly If you wanted to use `docker run` instead of `docker-compose`, the same step would be: ```yaml - name: Start Micro run: docker run --mount type=bind,source=$(pwd)/micro,destination=/config -p 9090:9090 snowplow/snowplow-micro:4.1.1 --collector-config /config/micro.conf --iglu /config/iglu.json & \ working-directory: snowplow-micro-examples ``` ## Tracking design Our demo web app uses Snowplow to track user activity. Then, using Snowplow Micro we will be testing that tracking is properly configured on our site using, as examples, two popular web test tools, Nightwatch and Cypress. ### Overview of the demo app The example application is a simple ecommerce app consisting of just 3 pages: 1. A start-up page with a sample login form (no authentication involved - see also the **Note** below) 2. The shop page including a same-page cart 3. The purchase-confirmation page > **Note:** The form of the login-page is only for demonstrating the tracking of form events (which can also show that password fields are not being tracked). There is no authentication involved. While you can type anything and continue to the shop-page, we recommend that you do not use any personal data. Each page serves the purpose of demonstrating possible event-tracking, which will then be tested using Snowplow Micro and a test tool. ### Event dictionary Using the [JavaScript Tracker](/docs/sources/web-trackers/): ```html  <script type="text/javascript"> ;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[]; p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments) };p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1; n.src=w;g.parentNode.insertBefore(n,g)}}(window, document, "script", "{% static 'ecommerce/js/sp.js' %}", "snowplow")); ``` The tracking implemented consists of: - Pageview tracking - This event happens when a user visits a page. - It is a predefined Snowplow event, that automatically captures the URL, referrer and page title. ```javascript window.snowplow('trackPageView'); ``` - Activity Tracking - This event happens when a user engages with a page (e.g. user scrolls). - It is a predefined Snowplow event, that automatically records the maximum scroll left-right, up-down in the last ping period. - Method call - before the trackPageView method call: ```javascript // login-page window.snowplow('enableActivityTracking', { minimumVisitLength: 20, heartbeatDelay: 20 }); // shop-page: window.snowplow('enableActivityTracking', { minimumVisitLength: 10, heartbeatDelay: 10 }); ``` - Form Tracking - This set of events happens when a user interacts with a web form (e.g. focus on an form element, change a value of input-textarea-select element form, submit a form) - These are predefined Snowplow events that are of unstructured eventType and can be customized. Data captured: - **focus\_form**: id, classes of the form and name, type, value of the form element that received focus. - **change\_form**: name, type, new value of the element, id of the parent form - **submit\_form**: id, classes of the form and name, type, value of all form elements - **Note**: _By default, Form Tracking does not track password fields. We used the `options` to demonstrate how you can ensure the Non-tracking of sensitive fields. With Snowplow Micro we can also later test that any denylisting of forms is implemented correctly and that no sensitive fields are tracked._ ```javascript var options = { forms: { denylist: [] }, fields: { denylist: ['user_password'] } }; window.snowplow('enableFormTracking', { options: options }); ``` - Tracking custom self-describing(unstructured) events 1. cart-events ([schema](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/micro/iglu-client-embedded/schemas/test.example.iglu/cart_action_event/jsonschema/1-0-0)) - These events happen when a user interacts with the cart, adding or removing items, using the Add-to-cart or Remove buttons. - This is a self-describing event that captures the type of cart interaction: "add" versus "remove". - We also want to add as [custom context](/docs/sources/web-trackers/custom-tracking-using-schemas/#track-a-custom-entity) the product involved in the cart-event, which is described by the product entity ([schema](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/micro/iglu-client-embedded/schemas/test.example.iglu/product_entity/jsonschema/1-0-0), see more below) - Implemented in the shop-page (see file [shoppage.js](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/app/static/ecommerce/js/shoppage.js)): ```javascript // TRACK cart_action_event (add) window.snowplow('trackSelfDescribingEvent', { event: { schema: 'iglu:test.example.iglu/cart_action_event/jsonschema/1-0-0', data: { type: 'add', }, }, context: [ { schema: 'iglu:test.example.iglu/product_entity/jsonschema/1-0-0', data: { sku: sku, name: title, price: parseFloat(price), quantity: parseInt(quantity) } } ] }); // TRACK cart_action_event (remove) window.snowplow('trackSelfDescribingEvent', { event: { schema: 'iglu:test.example.iglu/cart_action_event/jsonschema/1-0-0', data: { type: 'remove', }, }, context: [ { schema: 'iglu:test.example.iglu/product_entity/jsonschema/1-0-0', data: { sku: sku, name: title, price: parseFloat(price), quantity: parseInt(quantity) } } ] }); ``` - purchase-event ([schema](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/micro/iglu-client-embedded/schemas/test.example.iglu/purchase_event/jsonschema/1-0-0)) 1. - These events happen when a user completes the purchase of the products in their cart by clicking the Purchase button. - This is a self-describing event that captures the total amount of the transaction. - We also want to add as custom contexts the products involved, each of which is described by the product entity ([schema](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/micro/iglu-client-embedded/schemas/test.example.iglu/product_entity/jsonschema/1-0-0)) - Implemented in the shop-page as well ```javascript // create the contexts array let productsContext = []; userCart.forEach(function(elt) { productsContext.push({ schema: 'iglu:test.example.iglu/product_entity/jsonschema/1-0-0', data: { sku: elt.itemSku, name: elt.itemTitle, price: parseFloat(elt.itemPrice), quantity: parseInt(elt.itemQuant) } }); }); // TRACK purchase_event window.snowplow('trackSelfDescribingEvent', { event: { schema: 'iglu:test.example.iglu/purchase_event/jsonschema/1-0-0', data: { total: parseFloat(total), }, }, context: productsContext }); ``` - `webPage` [predefined Context](/docs/sources/web-trackers/tracker-setup/initialization-options/). Note: The webPage predefined context is enabled by default in JavaScript Tracker v3, as it is also a prerequisite for the official [Snowplow web data model](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/legacy/dbt-web-data-model/). ## Testing with Snowplow Micro The purpose of this repo is to show how Snowplow Micro can be used to validate that the tracking implemented in a demo app is operating in the way we expect. Simulate particular situations, and check that: - There are no bad events. - The data sent is as expected. - The right event data is sent with each event. - The right entities / contexts are attached to each event. - The right values are sent with each event. In this repo you can also find information on how to set up trackers, how to make customised (unstructured) events, and, mainly, how to configure and add tests in your test tool of choice, using as examples the popular web test tools Nightwatch and Cypress, in order to demonstate the capabilities of Snowplow Micro. ### The tests implemented 1. No bad events 2. Number of good events = number of expected good events 3. Ensuring that total number of events is right - expected number of good events + bad events (noBadEvents = True) 4. Checking proper values of structured and unstructured events are sent to Micro 5. Event With Property test - context, event type and schema - user puts in an event and we match by all 3 conditions (contexts, properties and schema - this determines whether or not the fake test event is equal to the one on micro - for both structured and unstructured) 6. Race condition test - to ensure that event x is always sent to Micro before event y (in our case we wanted to ensure cart action occurred before purchase) 7. Form tracking test to ensure blacklisted form fields are not tracked, and to ensure the right properties are sent for each field ## Snowplow Micro and Nightwatch Powered by Node.js, [Nightwatch.js](https://nightwatchjs.org/) is an open-source automated testing framework that aims at providing complete E2E (end to end) solutions to automate testing with Selenium Javascript for web-based applications, browser applications, and websites. This framework relies on Selenium and provides several commands and assertions within the framework to perform operations on the DOM elements. ### How does Nightwatch JS work? Nightwatch communicates over a restful API protocol that is defined by the W3C WebDriver API. It needs a restful HTTP API with a Selenium JavaScript WebDriver server. In order to perform any operation i.e. either a command or assertion, Nightwatch usually requires sending a minimum of two requests. It works as follows: - The first request locates the required element with the given XPath expression or CSS selector. - The second request takes the element and performs the actual operation of command or assertion. ### Getting Started If you want to isolate Nightwatch JS testing, follow these steps below. Install nightwatch and a chrome driver which enables nightwatch to interact with the Chrome browser: ```bash npm install nightwatch --save-dev npm install chromedriver --save-dev ``` 1. Create default json package 2. Create node modules folder and add nightwatch to dev dependencies 3. Add nightwatch.conf.js file (must contain basic configuration file) 4. Add a chrome driver binary for nightwatch configs - adds it into node modules - allows us to send commands to the chrome driver 5. Add `test:nightwatch` to the scripts section of the package.json 6. Write your own tests Running nightwatch: ```bash npm run test:nightwatch ``` ### Simulating a User with Nightwatch When using Nightwatch, one testing strategy is to simluate a user interaction with your app. This is useful so that we can fire the exact events that a user would fire when using the app in the field. In Nightwatch a test involves 3 phases: 1. Preparing a state: 1. Reset Micro command - deleting the cache in Micro so that each test has independent events from other tests 2. Configure Nightwatch to interact with your app 2. Taking an action: Nightwatch interacts with the app creating events 3. Making an assertion on the resulting state: Nighwatch requests the state of Micro and based on a test performs the corresponding assertions ### Organisation of tests #### Commands ##### resetMicro The resetMicro command can be added before each test as the beforeEach hook: [Nightwatch Test Hooks](https://github.com/nightwatchjs/nightwatch-docs/blob/643e9a63a94deba6bd84bf2dd78cb27693620e20/guide/using-nightwatch/using-test-hooks.md) _Example call_: ```javascript module.exports = { beforeEach: function(browser) { browser .resetMicro(); }, "test1" :function (browser0 { // code here } } ``` _Arguments_: None This command utilises the ability to reset Micro through the `/micro/reset` endpoint. This is the general structure of how to create a command which aims to request information from Snowplow Micro using Nightwatch: ```javascript this.command = (callback) => { const request = require('request'); request( { url: 'http://localhost:9090/micro/all', json: true }, (err, res, body) => { if (err) { console.warn(error); return false; } callback(body); } ); }; ``` #### Assertions ##### .noBadEvents _Example call_: ```javascript browser.assert.noBadEvents(); ``` _Arguments_: None This is arguably the most important assertion, if you only implement one this is a great place to start. It ensures that all of your data is sent to Micro and ends up in good events, so all of your data is in your warehouse and you can interpret it as expected. ##### .noOfGoodEvents _Example call_: ```javascript browser.assert.noOfGoodEvents(2); ``` _Arguments_: Number of expected good events to be sent to Micro An extension of noBadEvents, asserts that you sent the correct number of good events to Micro for a given event. For example, you might expect the onClick() action to send 2 good events, then you can make sure 2 are sent to good events and call noBadEvents on the same assertion as total number of events = number of good events + number of bad events. ##### .noOfTotalEvents _Example call_: ```javascript browser.assert.noOfTotalEvents(2); ``` _Arguments_: Number of expected total events sent to Micro A further extension on noOfGoodEvents, this assertion ensures that both the correct number events are sent to Micro, and that no bad events are sent. Using all 3 of these assertions consecutively provides the best assurance that every event you send is sent to your warehouse properly. ##### .orderOfEvents _Example call_: ````javascript browser.assert.orderOfEvents(events_list); ```javascript _Arguments_: events_list Checks for when one event must be before the other so that your application works as expected. This works by retrieving the derived timestamps for each event, and asserting that events fired in the specified order. In our case we use this to test that a cart action occurs before purchasing an item. If our application didn't get this order of events correct, then the application does not act as expected. This can also be considered a race condition test. ##### .successfulEvent _Example call_: ```javascript browser.assert.successfulEvent( { "eventType": "unstruct", "schema": "iglu:test.example.iglu/cart_action_event/jsonschema/1-0-0", "values": { "type": "remove" }, "contexts": [ { "schema": "iglu:test.example.iglu/product_entity/jsonschema/1-0-0", "data": { "name": "One-size summer hat", "price": 15.5, "quantity": 1 } } ] } ); ```` _Arguments_: test\_event (with schema, eventType, values and context), number\_of\_occurences Ensures that when an event is sent to Micro the correct parameters are sent as expected. In our case we check that the schema, properties and contexts are correct: - We match what the user expects to be sent to what is received on Micro. - In the end, the number of matched events is retrieved and asserted to the expected number of occurrences. - By doing that, we can also specify which events we don't expect on Micro by setting the argument number\_of\_occurences=0. ## Snowplow Micro and Cypress [Cypress](https://www.cypress.io/) is an open [source](https://github.com/cypress-io/cypress) JavaScript End-to-End testing framework with extensive [documentation](https://docs.cypress.io/). In this section we note few things that are specifically related to using this test tool with Snowplow Micro, describe the rationale behind the tests' organization used in this example and document the commands used. ### Introduction Generally, a test involves 3 phases: 1. Prepare a state 2. Take an action 3. Assert on the resulting state While Cypress considers as state the application's state, it is still a fact that an app is rarely an isolated system without side effects. This is especially true when a tracking strategy is implemented, which means that any action can fire [events](/docs/events/) through the [trackers](/docs/sources/). There is an increasing number of reports highlighting the importance of upstream data quality and Data Ops, which means that testing your Data Collection and trackers' implementation besides your product's features is of highest priority. Tracking is as important as your shipping, and that is why it is highly recommended that you include its testing in your E2E tests. Cypress, even though it considers as [best practice](https://docs.cypress.io/guides/references/best-practices.html#Visiting-external-sites) to avoid requiring or testing 3rd party services, it still offers the ability to "talk" to 3rd party API's via [cy.request()](https://docs.cypress.io/api/commands/request.html), that: - Entirely bypasses CORS - Expects the server to exist and provide a response - Does not retry its assertions (as that could affect external state) , which makes it great to use for querying Snowplow Micro's endpoints. For example: ```javascript cy.request({ url: 'http://localhost:9090/micro/all', json:true }); ``` So, following on the 3 test's phases: 1. Preparing state: 1. Reset Micro 2. Configure Cypress to visit your app 2. Actions: Cypress interacts with the app creating events 3. Assertions: Cypress sends requests to Micro, and attempts assertions on the responses ### Tests' organization Another Cypress' recommendation for best [practices](https://docs.cypress.io/guides/references/best-practices.html#Having-tests-rely-on-the-state-of-previous-tests) is the decoupling of tests, which, for the case of testing with Snowplow Micro, would mean to run both the state-changing and the micro-requests in the same spec file. However, there were some issues in doing so. More specifically, those issues had only to do with cases where links (or submit buttons) were clicked, in other words in cases where a window [unload event](https://developer.mozilla.org/en-US/docs/Web/API/Window/unload_event) was fired. To describe the issue, we first describe what normally happens upon an unload event: When a user clicks, for example, a link, on one hand the browser wants to navigate to the link, and on the other hand, the tracker (in our case the [Javascript Tracker](/docs/sources/web-trackers/)) tries to send the [link click](/docs/sources/web-trackers/tracking-events/link-click/) or the [submit form](/docs/sources/web-trackers/tracking-events/form-tracking/) events, while also storing them in local storage, just in case the events don't get sent before the page unloads. While it is normal for browsers to cancel all requests, a cancelled request does not necessarily mean that the request did not reach the server, but that the client sending it, does not wait for an answer anymore. So, there is no way to know from client side whether the request (be it POST or GET) succeeded. That problem was especially apparent when Micro was being queried in the same spec file with the app's actions. For example, POST requests appeared as cancelled in Cypress' test runner, but the events may have reached Micro. Taking advantage of the fact that Cypress also clears browser cache when it changes spec file, we decided to move the testing part of Micro into separate spec files. The consequences of this decision are: 1. You need to ensure a naming strategy, and the reason for this is the fact that Cypress decides the order of execution for test files based on alphabetical order. In this example, we use this naming strategy: - `xx_app_spec.js` for the spec files that visit the app and create events - `xx_micro_spec.js` for the spec files that query Micro and make the assertions on those events , where `xx` stands for numbering (but it could be anything as long as it matches uniquely). ```bash $ tree testing/cypress/integration testing/cypress/integration/ ├── 01_app_spec.js ├── 01_micro_spec.js ├── 02_app_spec.js ├── 02_micro_spec.js ├── 03_app_spec.js ├── 03_micro_spec.js ``` 2. You need to consider how and when you reset Micro. We chose to use a Before [hook](https://docs.cypress.io/guides/core-concepts/writing-and-organizing-tests.html#Hooks) in the start of every `xx_app_spec.js` file, since the naming strategy ensures that the tests will run in `app`-`micro` pairs. 3. If you just want to run only a particular app test file (and not all of them), you will also need its corresponding Micro test file. Since this is a usual case, for example, when a particular spec file fails, we added another npm script `cy-micro:pair-run`, which will match a naming pattern and run the corresponding micro-spec file after the matching app spec. The script can be seen in [package.json](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/package.json). Example usage: 1. To run all spec files in your `cypress/integration` directory (in alphabetical order) ```bash npm run cypress:run ``` 2. To run an app-micro pair of spec files given a name pattern ```bash PAIR_PATTERN=03 npm run cy-micro:pair-run # will run first the 03_app_spec.js and then the 03_micro_spec.js ``` Just make sure that this name pattern is unique for this pair. This kind of organization also has the benefit, that you can keep having the tests you normally had for your app (just adding the before-hook to reset Micro), and just add a corresponding micro-spec file, to test the events emitted from the original run of that app test. That way, you can test your app's features in the `app_spec` files and your tracking implementation in the corresponding `micro_spec` files. ### Commands Since Cypress allows to define your own [custom commands](https://docs.cypress.io/api/cypress-api/custom-commands.html), in this repo you can find commands specifically for use with Snowplow Micro and assertions of events. You can see them all in [commands.js](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/testing/cypress/support/commands.js). #### cy.noBadEvents _Example call_: ```javascript cy.noBadEvents(); ``` Even if this is the only thing that you check in your tests, you are already brilliant. It is going to ensure that your app is not sending any bad events, in other words you ensure that all your events end up in your warehouse. There are no more gaps in your data or in your analytics and no recovery jobs to get those bad events back, jobs that are not going to be trivial, especially if you are dealing with high volume of events. #### cy.numGoodEvents _Example call_: ```javascript cy.numGoodEvents( 19 ); ``` This command ensures that all the events you want to track, actually get tracked. Because there may be not bad events, but maybe your tracker is not implemented correctly into the app's logic. #### cy.eventsWithEventType _Example call_: ```javascript cy.eventsWithEventType( "page_view", 8 ); cy.eventsWithEventType( "struct", 45 ); ``` This command is useful when you want to ensure that a particular type of events got tracked as many times as it should. #### cy.eventsWithParams _Example call_: ```javascript cy.eventsWithParams( { "event": "struct", "se_category": "Media", "se_action": "Play video", "se_label": "Surfing" }, 3 ); ``` This command accepts as first argument an object with the expected event's field-value pairs. You can read about all the fields in Snowplow docs [here](/docs/fundamentals/canonical-event/). This command is particularly useful when checking on [structured events](/docs/sources/web-trackers/custom-tracking-using-schemas/#track-a-structured-event). #### cy.eventsWithSchema _Example call_: ```javascript cy.eventsWithSchema( "iglu:com.snowplowanalytics.snowplow/submit_form/jsonschema/1-0-0", 5 ); ``` With this command you can look specifically for [self-describing events](/docs/fundamentals/events/#self-describing-events), which include both custom self-describing events and all other out-of-the-box Snowplow events that are of "unstruct" eventType (link-click, submit-form, ad-impression etc.) #### cy.eventsWithContexts _Example call_: ```javascript cy.eventsWithContexts( [ { "schema": "iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0" } ], 10 ); cy.eventsWithContexts( [ { "schema": "iglu:com.example.eg/article_context/jsonschema/1-0-0", "data": { "writer": "John Doe", "category": "Sports", "title": "The match of the year" } }, { "schema": "iglu:com.example.eg/writer_entity/jsonschema/1-0-0", "data": { "name": "John Doe", "age": 34, "numOfArticles": 50, "categories": ["sports", "history", "food"] } } ], 2 ); ``` With this command you can check whether the predefined(e.g. webpage, geolocation) or custom contexts/entities got properly attached to events. You can not only check by the schema of the entities but also by their data. Note that the first argument to this command should be an array of objects, like the contexts' array that can be attached to any Snowplow event. The keys of these objects can be either "schema" or "data". For "schema" the value should be a string (the schema). For "data" the value should be an object of key-value pairs, depending on the context. #### cy.eventsWithProperties _Example call_: ```javascript cy.eventsWithProperties( { "parameters": { "event": "unstruct", "v_tracker": "js-3.1.3" }, "schema": "iglu:com.example.eg/custom_cart_event/jsonschema/1-0-0", "values": { "type": "add", "productSku": "12345", "quantity": 1 }, "contexts": [ { "schema": "iglu:com.example.eg/custom_product_context/jsonschema/1-0-0", "data": { "sku": "12345", "name": "Laptop", "onOffer": false, }, } ] }, 1 ); ``` This is a command that combines some of the above (eventsWithSchema, eventsWithParams, eventsWithContexts), and also adds the ability to look into the data of unstructured events. The object that gets passed as the first argument, can have as keys: - "schema" : matches by schema - "values" : matches by the data of an unstructured event - "contexts" : matches by contexts - "parameters" matches by the event's fields It will return the events that have all those properties. As shown in the examples above, you do not have to use all the properties, and the command works accordingly. #### cy.eventsWithOrder _Example call_: ```javascript cy.eventsWithOrder([ { "schema": "iglu:com.snowplowanalytics.snowplow/focus_form/jsonschema/1-0-0", "values": {"elementId": "user_email"} }, { "schema": "iglu:com.snowplowanalytics.snowplow/change_form/jsonschema/1-0-0", "values": {"elementId": "user_email"} }, { "schema": "iglu:com.snowplowanalytics.snowplow/submit_form/jsonschema/1-0-0" } ]); ``` With this command you can assert that events happened in a specified (ascending) order. For example, in the call above, we can assert that the focus\_form event happened before the corresponding change\_form event, which in turn happened before the submit\_form event. The argument to this command is an array of at least 2 event "descriptions", which are exactly like the properties' object argument of `eventsWithProperties`. Those event descriptions need to uniquely identify exactly one Snowplow event. Internally, this command compares events' `derived_tstamp`. ### Some further notes ```bash $ tree testing/cypress/ testing/cypress/ ├── integration │ ├── 01_app_spec.js │ ├── 01_micro_spec.js │ ├── 02_app_spec.js │ ├── 02_micro_spec.js │ ├── 03_app_spec.js │ ├── 03_micro_spec.js │ └── helpers_spec.js ├── plugins │ └── index.js └── support ├── commands.js └── index.js ``` 1. Helpers - In the `commands.js` file, the commands are defined based on the `Micro` [helper module](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/testing/jsm/helpers.js) - The `integration/helpers_spec.js` [file](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/testing/cypress/integration/helpers_spec.js) tests the helper functions that define the matching logic. For a different custom matching logic, you can tweak the helper functions, and then test them. 2. Environment variables. - Cypress allows for [many ways](https://docs.cypress.io/guides/guides/environment-variables.html) to set environment variables. In this example we set them in the `plugins/index.js` [file](https://github.com/snowplow-incubator/snowplow-micro-examples/blob/main/testing/cypress/plugins/index.js). --- # Run Snowplow Micro through Console > Run Snowplow Micro through Snowplow Console to validate and debug tracking implementations. Send events to the collector endpoint and view results through the dashboard. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/console/ Snowplow Micro is fully integrated in Snowplow Console: - You can deploy one or more Micro-based **development environments** - Each environment is automatically connected to your development [data structures](/docs/event-studio/data-structures/) - You can manage [enrichments](/docs/pipeline/enrichments/available-enrichments/) the same way as for regular pipelines, and copy enrichment configuration from a development environment to a pipeline > **Tip:** Using a development environment is a great way to test your changes before applying them to a real pipeline. ## Setup To create a development environment, navigate to **Settings > Workspaces**, select your workspace, scroll to the **Development environments** section and click **Create environment**. > **Note:** You will need _Edit environments_ permissions to do this. You can add multiple environments, or create and delete them as you see necessary. For example, it might be convenient to create separate environments for different teams testing different code in parallel. Under the hood, an instance of Snowplow Micro will be deployed in your cloud account along with a database to store events. (This account is managed by you, in the case of Private Managed Cloud, or by Snowplow otherwise.) ## Usage Once your development environment is ready, you can access it from the **Pipelines** section in the Console sidebar. Select your environment and you will see the Collector endpoint URL you can use in your tracking code to send events to this environment. Events will be stored for a (rolling) 7 day period. To view events in the [Micro dashboard](/docs/testing/snowplow-micro/ui/), select your environment and then click **Open dashboard**. > **Warning:** Do not send production data to development environments. Anyone with the _View environments_ permission can access the dashboard and see the events. Also, development environments are not configured to withstand high volumes of events. To enable or disable enrichments and edit their configurations, select the **Enrichments** tab. The process is the same as for regular pipelines. We highly recommend testing enrichments in a development environment before deploying them to production. --- # Test and debug tracking with Snowplow Micro > Snowplow Micro is a lightweight test pipeline for debugging and automated testing. It receives, validates, and enriches events with a UI and API for inspection. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/ [Snowplow Micro](https://github.com/snowplow/snowplow-micro) is a lightweight version of the Snowplow pipeline. It's great for: - Getting familiar with Snowplow - Debugging and testing, including [automated testing](/docs/testing/snowplow-micro/automated-testing/) ![Snowplow Micro dashboard](/assets/images/overview-1728998b503b68a664fe3b2d8c0e36b8.png) Just like a real Snowplow pipeline, Micro receives, validates and enriches events sent by your [tracking code](/docs/sources/). > **Warning:** Micro is not designed for production traffic. To get started in minutes, [deploy Micro through Console](/docs/testing/snowplow-micro/console/) or [run it locally](/docs/testing/snowplow-micro/local/). --- # Configure additional Snowplow Micro settings > Enable HTTPS with SSL/TLS certificates, add custom Iglu resolver configurations, and modify collector settings. Advanced configuration options for Snowplow Micro using bind mounts and environment variables. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/local/advanced-usage/ This page describes additional configuration options for Snowplow Micro. ## Enabling HTTPS While in most cases HTTP is sufficient, you may want to enable HTTPS in Micro (for an example of when that’s useful, see [Locally resolving an existing domain name to Micro](/docs/testing/snowplow-micro/local/remote-usage/#locally-resolving-an-existing-domain-name-to-micro)). You will need an SSL/TLS certificate in [PKCS 12](https://en.wikipedia.org/wiki/PKCS_12) format (`.p12`). Pass your certificate file and its password to the container (using a [bind mount](https://docs.docker.com/storage/bind-mounts/) and an [environment variable](https://docs.docker.com/compose/environment-variables/)). Don’t forget to expose the HTTPS port (by default, 9543): ```bash docker run -p 9090:9090 -p 9543:9543 \ --mount type=bind,source=$(pwd)/my-certificate.p12,destination=/config/ssl-certificate.p12 \ -e MICRO_SSL_CERT_PASSWORD=... \ snowplow/snowplow-micro:4.1.1 ``` > **Note:** For the certificate, the path inside the container must be exactly `/config/ssl-certificate.p12`. You should see a message like this in the logs: ```text [INFO] com.snowplowanalytics.snowplow.micro.Main$ - HTTPS REST interface bound to /0.0.0.0:9543 ``` As usual, you can change the ports to your liking (see [Running Micro](/docs/testing/snowplow-micro/local/)). ## Adding custom Iglu resolver configuration If you’d like to tweak the Iglu registries Micro uses, the priority between them, the cache sizes, etc, you can provide your own [Iglu resolver configuration](/docs/api-reference/iglu/iglu-resolver/) (`iglu.json`). > **Tip:** If you are just looking to add custom schemas or connect to your private Iglu registry, check out [Adding custom schemas](/docs/testing/snowplow-micro/local/schemas/) for simpler ways to achieve that. Pass your configuration file to the container (using a [bind mount](https://docs.docker.com/storage/bind-mounts/)) and instruct Micro to use it: ```bash docker run -p 9090:9090 \ --mount type=bind,source=$(pwd)/iglu.json,destination=/config/iglu.json \ snowplow/snowplow-micro:4.1.1 \ --iglu /config/iglu.json ``` That’s it. You can use the [the API](/docs/api-reference/snowplow-micro/api/#microiglu) to check if Micro is able to reach your schemas (replace `com.example` and `my-schema` as appropriate). ```bash curl localhost:9090/micro/iglu/com.example/my-schema/jsonschema/1-0-0 ``` ## Adding custom collector configuration If you’d like to tweak the [collector configuration](/docs/api-reference/stream-collector/configure/) inside Micro, the simplest approach is to override individual settings. For example, to change the cookie name: ```bash docker run -p 9090:9090 \ snowplow/snowplow-micro:4.1.1 \ -Dcollector.cookie.name=sp ``` For more extensive changes, you can also bring your own configuration file (`micro.conf`). **Example** ```hcl loading... ``` [View on GitHub](https://github.com/snowplow/snowplow-micro/blob/master/example/micro.conf) Pass your configuration file to the container (using a [bind mount](https://docs.docker.com/storage/bind-mounts/)) and instruct Micro to use it: ```bash docker run -p 9090:9090 \ --mount type=bind,source=$(pwd)/micro.conf,destination=/config/micro.conf \ snowplow/snowplow-micro:4.1.1 \ --collector-config /config/micro.conf ``` ## Persisting events across restarts By default, Micro only stores events in memory. Since version 4.0.0, you can connect it to a PostgreSQL database to persist events across restarts. To enable this, first create a storage configuration file: ```hcl host = "localhost" port = 5432 database = "micro_test" user = "test_user" // How long to keep the events for (minimum: 5m) ttl = "7d" // How often to clean up expired events (minimum: 1m) cleanupInterval = "1h" ``` Next, place the database password into an [environment variable](https://en.wikipedia.org/wiki/Environment_variable) named `MICRO_POSTGRESQL_PASSWORD`. Now you can run Micro and pass the configuration, as well as the password: ```bash docker run -p 9090:9090 \ --mount type=bind,source=$(pwd)/storage.conf,destination=/config/storage.conf \ -e MICRO_POSTGRESQL_PASSWORD \ snowplow/snowplow-micro:4.1.1 \ --storage /config/storage.conf ``` --- # Enable enrichments in Snowplow Micro > Add enrichments to Snowplow Micro including YAUAA, IP Lookup, and JavaScript enrichment. Mount enrichment configuration files to test pipeline behavior locally. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/local/enrichments/ By default, when running locally, Micro does not have any [enrichments](/docs/pipeline/enrichments/available-enrichments/) enabled. This page shows how you can enable them. > **Tip:** When [running Micro through Console](/docs/testing/snowplow-micro/console/), you can use the UI to configure enrichments. ## YAUAA (Yet Another User Agent Analyzer) One of the most useful enrichments, [YAUAA](/docs/pipeline/enrichments/available-enrichments/yauaa-enrichment/), requires no configuration, and can be turned on simply with the `--yauaa` flag (since version 3.0.1): ```bash docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1 --yauaa ``` > **Tip:** We recommend not enabling this enrichment in a CI setting unless necessary, because it consumes a few extra hundred MB of RAM. ## Other enrichments You can enable any enrichments you like by passing corresponding configuration files to Micro. **Limitations for enrichments that rely on data files** Some enrichments require data files (e.g. a database of IPs). The Enrich application in a full Snowplow pipeline will automatically download and periodically update these files. However, Micro will only download them once. You can always restart Micro to get a fresher copy of the files. You might also face access limitations where the database files are hosted on S3, GCS, or ADLS and are authenticated via the cloud environment roles and permissions. For example, let’s say that you want to configure the [IP Lookup enrichment](/docs/pipeline/enrichments/available-enrichments/ip-lookup-enrichment/). The default configuration file looks like this: ```json loading... ``` [View on GitHub](https://github.com/snowplow/enrich/blob/master/config/enrichments/ip_lookups.json) Put this file somewhere on the machine where you are running Micro, let’s say `my-enrichments/ip_lookups.json`. (Feel free to add any other configurations to `my-enrichments`). Now you will need to pass this directory to the Docker container (using a [bind mount](https://docs.docker.com/storage/bind-mounts/)): ```bash docker run -p 9090:9090 \ --mount type=bind,source=$(pwd)/my-enrichments,destination=/config/enrichments \ snowplow/snowplow-micro:4.1.1 ``` > **Note:** The directory _inside_ the container (what goes after `destination=`) must be exactly `/config/enrichments`. Alternatively, if you are running Micro as a Java application, put your enrichment configurations in `some-directory/enrichments` on your machine (`enrichments` must be called exactly that) and use the following command: ```bash java -cp micro-4.1.1.jar:some-directory com.snowplowanalytics.snowplow.micro.Main ``` Once Micro starts, you should see messages like these: ```text [INFO] com.snowplowanalytics.snowplow.micro.Run - Downloading http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind/GeoLite2-City.mmdb... [INFO] com.snowplowanalytics.snowplow.micro.Run - Enabled enrichments: IpLookupsEnrichment ``` > **Tip:** Micro is especially great for [testing the JavaScript enrichment](/docs/pipeline/enrichments/available-enrichments/custom-javascript-enrichment/testing/). --- # Run Snowplow Micro locally > Run Snowplow Micro locally using Docker to validate and debug tracking implementations. Send events to port 9090 and view results through the dashboard, API, or exported TSV/JSON format. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/local/ > **Tip:** For general use, we recommend [running Micro through Snowplow Console](/docs/testing/snowplow-micro/console/). However, running locally can be useful if you don’t have access to Console or need an advanced configuration. The easiest way to run Micro locally is through [Docker](https://www.docker.com/). Run the following command: ```bash docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1 ``` You should see output like this: ```text [INFO] com.snowplowanalytics.snowplow.micro.Run - No enrichments enabled [INFO] com.snowplowanalytics.snowplow.micro.MicroHttpServer - Building blaze server [INFO] org.http4s.blaze.channel.nio1.NIO1SocketServerGroup - Service bound to address /[0:0:0:0:0:0:0:0]:9090 ``` **A note on ports...** The command above will route port `9090` on your machine to Micro. If that port is already taken, you will want to change it, like so: ```bash docker run -p 5000:9090 snowplow/snowplow-micro:4.1.1 ↑↑↑↑ ``` Note that Micro will still log `REST interface bound to /0.0.0.0:9090` — `9090` here refers to the port _inside_ the container. We will use `9090` in the examples below, but remember to substitute the port of your choosing. > **Tip:** We also provide a _distroless_ image of Micro. Because it only includes the bare minimum, it’s smaller and more secure. However, the downside of using the distroless image is that basic utilities (such as a shell) are not available. > > To use this image, add `-distroless` to the tag: > > ```bash > docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1-distroless > ``` If Docker is not available on your system, you can also run Micro as a Java application (since version 2.3.1). The `jar` files are availble via [GitHub releases](https://github.com/snowplow/snowplow-micro/releases). ```bash java -jar micro-4.1.1.jar ``` Most of the examples on this and the following pages will show the Docker option, but feel free to adapt them accordingly. ## Sending events to Micro Follow the documentation for one of [our trackers](/docs/sources/) to implement some tracking code on your website or application. You can then point it to `localhost:9090` where Micro is listening. For example, using the [Browser tracker](/docs/sources/web-trackers/): ```js import { newTracker, trackPageView, enableActivityTracking } from '@snowplow/browser-tracker'; newTracker('snowplow', 'localhost:9090', { appId: 'my-app-id', plugins: [], }); trackPageView(); enableActivityTracking({ heartbeatDelay: 10, minimumVisitLength: 10, }); ``` ## Checking the results Once you have the tracking code and the events are flowing in, you should see something like this in the Micro logs: ```text [INFO] EventLog - GOOD id:4bfd5b32-d02a-4f83-a731-4339898437e1 app_id:test type:page_view (iglu:com.snowplowanalytics.snowplow/page_view/jsonschema/1-0-0) [INFO] EventLog - GOOD id:e7a5c64d-d0f7-48d9-9a50-be044db7d2f6 app_id:test type:page_ping (iglu:com.snowplowanalytics.snowplow/page_ping/jsonschema/1-0-0) [INFO] EventLog - GOOD id:1608ca85-f5f9-4948-898a-728aa8f1131b app_id:test type:page_ping (iglu:com.snowplowanalytics.snowplow/page_ping/jsonschema/1-0-0) ``` This means your tracking is set up correctly and your events are valid (`GOOD`). Would you rather see the events visually? Open <http://localhost:9090/micro/ui> in your browser. You might want to check [a few tips on how to use the dashboard](/docs/testing/snowplow-micro/ui/). ![Micro dashboard overview](/assets/images/overview-1728998b503b68a664fe3b2d8c0e36b8.png) Alternatively, you can inspect the events via [the API](/docs/api-reference/snowplow-micro/api/). For example, try: ```bash curl localhost:9090/micro/good ``` ## Exporting events Snowplow pipelines output data in the [_enriched TSV format_](/docs/pipeline/enriched-tsv-format/). Typically, this is picked up by one of our [loaders](/docs/destinations/warehouses-lakes/) or by tools such as [Snowbridge](/docs/api-reference/snowbridge/). With Micro, you can see what your data would look like in this format — useful if you want to test any logic that is parsing this data. ```bash docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1 --output-tsv ``` Since version 2.4.0, you can alternatively output the data in JSON format (the same as provided in [Snowplow Analytics SDKs](/docs/api-reference/analytics-sdk/)): ```bash docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1 --output-json ``` Also since version 2.4.0 you can send the data in either format to an HTTP endpoint, rather then standard output: ```bash docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1 --output-tsv \ --destination http://some-url.com ``` **Output vs logs** The TSV or JSON data will be printed to the [standard output](https://en.wikipedia.org/wiki/Standard_streams#Standard_output_$stdout$). As you saw above, Micro also prints logs, which go into the [standard error stream](https://en.wikipedia.org/wiki/Standard_streams#Standard_error_$stderr$). Depending on how you are running Micro, you might find the logs distracting. If so, you can turn off event logs with an extra option: ```bash docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1 --output-tsv \ -Dorg.slf4j.simpleLogger.log.EventLog=off ``` Or just discard the standard error stream entirely using the syntax appropriate for your shell: ```bash # for bash docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1 --output-tsv \ 2>/dev/null ``` Finally, you can save the output to a file: ```bash docker run -p 9090:9090 snowplow/snowplow-micro:4.1.1 --output-tsv > output.tsv ``` --- # Connect remote sites and apps to Snowplow Micro > Test remote websites and apps with Micro using ngrok or localtunnel for public access, or use local domain resolution to receive first-party cookies. Configure HTTPS and match collector settings. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/local/remote-usage/ If you are not running your website or app locally but would still like to use Micro for local testing and debugging, there are two options: | | Option 1 [Exposing Micro via a public domain name](#exposing-micro-via-a-public-domain-name) | Option 2 [Locally resolving an existing domain name to Micro](#locally-resolving-an-existing-domain-name-to-micro) | | --------------------------------------- | -------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | | Requires control over the tracking code | Yes | No | | Accessible from multiple devices | Yes | No | | Micro receives first-party cookies | No | Yes | | Micro receives external IP addresses | Yes | No | ## Exposing Micro via a public domain name > **Tip:** This method is for running Micro locally. You can also [run Micro through Console](/docs/testing/snowplow-micro/console/), which automatically gives you a public endpoint. This method allows you to get a publicly accessible URL for your Micro, which you can point your tracking code to. You can also send this URL to colleagues for them to inspect the results in the [dashboard](/docs/testing/snowplow-micro/ui/) (since version 2.0.0) or via the [API](/docs/api-reference/snowplow-micro/api/). ```mermaid flowchart LR website(Website or app) tracking(Tracking code) url(URL like <code>https://22d8-....ngrok.app</code>) tunnel{{"<i>Tunneling magic</i> 🪄"}} micro(Micro on your machine) website --> tracking --> url --> tunnel --> micro ``` The easiest way to achieve this is with a tool like [ngrok](https://ngrok.com/) or [localtunnel](https://theboroer.github.io/localtunnel-www/). > **Note:** Either tool will take care of HTTPS and the required certificates. However, the resulting URL will not be under your domain, so you will not be able to set or receive first-party cookies this way. > > You will, however, receive external IP addresses (not something like `192.168.0.42`) when you or others browse the website or app pointing to Micro. This might be useful for testing certain enrichments. After running Micro as usual, you just need to expose the port (by default, `9090`): **ngrok:** [Sign up](https://dashboard.ngrok.com/signup), [download](https://ngrok.com/download) ngrok and follow the instructions to authenticate your client. Then run this command: ```bash ngrok http 9090 ``` You will see the publicly available URL in the output. **localtunnel:** Install: ```bash npm install -g localtunnel ``` Then run this command: ```bash lt --port 9090 ``` You will see the publicly available URL in the output. Before use, visit this URL in your web browser and click “Continue”. *** ## Locally resolving an existing domain name to Micro This method only works on your machine, but it allows you to connect any website or app to your Micro, even if you don’t control the tracking code. ```mermaid flowchart LR website(Website or app) tracking(Tracking code) url(Existing collector URL like <code>c.example.com</code>) collector(Actual collector) micro(Micro on your machine) url -.-x collector website --> tracking --> url -- "<i>Custom local domain<br/>resolution rule</i>" --> micro ``` Let’s say you have a website `example.com` with Snowplow tracking that points to a Collector hosted at `c.example.com`. > **Note:** With this approach, because Micro is “pretending” to be behind the actual Collector domain (`c.example.com`), it will receive first-party cookies set for that domain. > > However, you will only get local IP addresses (like `192.168.0.42`) in your data, because the traffic never leaves your machine. This might be relevant for testing some enrichments. > > If you are using a web browser to test your site or app, you can spoof a specific IP address using a browser plugin that sets an `X-Forwarded-For` header. For example, here are plugins for [Chrome](https://chrome.google.com/webstore/detail/x-forwarded-for-header/hkghghbnihliadkabmlcmcgmffllglin/) and [Firefox](https://addons.mozilla.org/en-US/firefox/addon/x-forwarded-for-injector/). Install the plugin and set the IP address to your liking. ### Generate a local SSL/TLS certificate Chances are that the website in question uses HTTPS. If so, you will need to configure Micro to [enable HTTPS](/docs/testing/snowplow-micro/local/advanced-usage/#enabling-https). Otherwise, feel free to skip this step. First, install [`mkcert`](https://github.com/FiloSottile/mkcert). This tool allows you to easily generate SSL/TLS certificates that your machine trusts. Now, run the following commands in a terminal (make sure to substitute your Collector domain for `c.example.com`): ```bash # one-time setup mkcert -install # generate a certificate mkcert -pkcs12 c.example.com ``` You should now have a local file called `c.example.com.p12` with the default password `changeit`. ### Match the Collector configuration To make sure your Micro behaves the same way as the Collector it’s “pretending” to be, copy the relevant parts of your Collector configuration and [pass them to Micro](/docs/testing/snowplow-micro/local/advanced-usage/#adding-custom-collector-configuration). The two most important settings are the cookie name (you can set it with `-Dcollector.cookie.name` as shown on the page linked above) and any custom paths (via a configuration file). ### Run Micro You will need to run Micro as usual, with a few changes: - [Enable HTTPS](/docs/testing/snowplow-micro/local/advanced-usage/#enabling-https), if needed - Use port `80` instead of `9090` - If HTTPS is enabled, also use port `443` instead of `9543` If you don’t need HTTPS: ```bash docker run -p 80:9090 snowplow/snowplow-micro:4.1.1 ``` If you need HTTPS (substitute your Collector domain for `c.example.com`): ```bash docker run -p 80:9090 -p 443:9543 \ --mount type=bind,source=$(pwd)/c.example.com.p12,destination=/config/ssl-certificate.p12 \ -e MICRO_SSL_CERT_PASSWORD=changeit \ snowplow/snowplow-micro:4.1.1 ``` ### Modify the hosts file To conclude the setup, you will have to modify your [hosts file](https://en.wikipedia.org/wiki/Hosts_$file$) (`/etc/hosts` on Linux/macOS, `C:\Windows\System32\drivers\etc\hosts` on Windows) to point `c.example.com` — the Collector domain — to your local machine. Add these lines: ```text 127.0.0.1 c.example.com ``` _With this change, whenever your system tries to send data to `c.example.com`, it will send it to your Micro instead._ Done! > **Tip:** Don’t forget to revert the hosts file change once no longer necessary. --- # Add custom schemas to Snowplow Micro > Connect Snowplow Micro to Iglu registries or add schemas directly. Point Micro to Console registries, Iglu Server, or mount local schema files for custom event validation. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/local/schemas/ One of the benefits of using Snowplow is that you can design your own schemas for your events. > **Tip:** See [our explanation](/docs/fundamentals/schemas/) on what schemas are for and what they look like. To track an event with a custom schema, you would need code like this (using the [Browser tracker](/docs/sources/web-trackers/custom-tracking-using-schemas/) as an example): ```js import { trackSelfDescribingEvent } from '@snowplow/browser-tracker'; trackSelfDescribingEvent({ event: { schema: 'iglu:com.example/my-schema/jsonschema/1-0-0', data: { ... } } }); ``` For Micro to understand this event, it will need to know about `com.example/my-schema/jsonschema/1-0-0` or any other relevant schemas. There are two ways you can achieve this: - **Point Micro to an Iglu registry that contains your schemas.** This is a good option if you use Snowplow [Console](/docs/event-studio/data-structures/) UI or [API](/docs/event-studio/programmatic-management/data-structures-api/) to create schemas, or if you have deployed your own Iglu registry. - **Add schemas to Micro directly.** This can be handy for quickly testing a schema. Whichever approach you choose, you can use the [the API](/docs/api-reference/snowplow-micro/api/#microiglu) to check if Micro is able to reach your schemas (replace `com.example` and `my-schema` as appropriate). ```bash curl localhost:9090/micro/iglu/com.example/my-schema/jsonschema/1-0-0 ``` > **Tip:** If you are [running Micro through Console](/docs/testing/snowplow-micro/console/), it automatically integrates with your development Iglu Server, so you don’t need to follow these steps. ## Pointing Micro to an Iglu registry Place your Iglu registry URL and API key (if any) into two [environment variables](https://en.wikipedia.org/wiki/Environment_variable): `MICRO_IGLU_REGISTRY_URL` and `MICRO_IGLU_API_KEY`. Make sure to fully spell out the URL, including the protocol (`http://` or `https://`). For most Iglu registries, including those provided by Snowplow CDI, the URL will end with `/api` — make sure to include that part too, for example: `https://com-example.iglu.snplow.net/api`. [Static registries](/docs/api-reference/iglu/iglu-repositories/static-repo/), such as `https://iglucentral.com`, are an exception — you don’t need to append `/api` to the URL. > **Tip:** You can find your Iglu registry URLs and generate API keys [via the console](https://console.snowplowanalytics.com/iglu-keys). The following Docker command will pick up the environment variables and pass them to Micro: ```bash export MICRO_IGLU_REGISTRY_URL=https://com-example.iglu.snplow.net/api export MICRO_IGLU_API_KEY=abcdef123456 docker run -p 9090:9090 \ -e MICRO_IGLU_REGISTRY_URL \ -e MICRO_IGLU_API_KEY \ snowplow/snowplow-micro:4.1.1 ``` This will ensure Micro uses your Iglu registry, in addition to [Iglu Central](/docs/api-reference/iglu/iglu-repositories/iglu-central/). For more flexibility, see [Advanced usage](/docs/testing/snowplow-micro/local/advanced-usage/#adding-custom-iglu-resolver-configuration). ## Adding schemas directly to Micro > **Note:** Currently, this method does not work for [marking schemas as superseded](/docs/fundamentals/schemas/versioning/#mark-a-schema-as-superseded). Structure your schema file or files like so: ```text schemas └── com.example └── my-schema └── jsonschema ├── 1-0-0 └── 1-0-1 ``` > **Note:** This folder structure is significant. Also note that the schema files must be named `1-0-0`, `1-0-1`, and so on, **not** `1-0-0.json` or `1-0-1.json`. Next, you will need to place the schemas in `/config/iglu-client-embedded/` inside the container. ```bash docker run -p 9090:9090 \ --mount type=bind,source=$(pwd)/schemas,destination=/config/iglu-client-embedded/schemas \ snowplow/snowplow-micro:4.1.1 ``` > **Tip:** You can read more about bind mounts in the [Docker documentation](https://docs.docker.com/storage/bind-mounts/). Alternatively, if you are running Micro as a Java application, place your schemas — using the same structure as above — in `some-directory/iglu-client-embedded` on your machine (`iglu-client-embedded` must be called exactly that) and use the following command: ```bash java -cp micro-4.1.1.jar:some-directory com.snowplowanalytics.snowplow.micro.Main ``` --- # View and filter events in the Snowplow Micro dashboard > Access the Micro dashboard to view, filter, and investigate good and bad events. Share the dashboard with colleagues through public domain names. > Source: https://docs.snowplow.io/docs/testing/snowplow-micro/ui/ Micro provides a user interface that helps you explore events sent to it, including [failed events](/docs/fundamentals/failed-events/). If you are [running Micro through Console](/docs/testing/snowplow-micro/console/), navigate to the **Pipelines** side menu and select your Micro. Then click **Open dashboard**. You will need the _View environments_ permission to access the dashboard. If you are [running Micro locally](/docs/testing/snowplow-micro/local/), head to <http://localhost:9090/micro/ui> in your web browser. Here’s an overview of the dashboard functionality: --- # Set up an abandoned browse campaign in Braze > Learn how to create and configure an abandoned browse email campaign in Braze using product view data synced from Census. Set up triggers, templates, and test your campaign to re-engage users who viewed products but didn't purchase. > Source: https://docs.snowplow.io/tutorials/abandoned-browse-ccdp/braze-campaign/ This guide will walk you through setting up and testing an abandoned browse campaign in Braze using the product view data synced from Census. ## Setting up the campaign 1. Log into your Braze dashboard 2. Navigate to **Campaigns** and click **Create campaign** 3. Select **Email** as the channel 4. Name your campaign (e.g. "Abandoned browse - product reminder") ![Create Campaign](/assets/images/retl-braze-create-campaign-b03594aa1c9aa2f02db4bdecb1243c07.png) ## Configure campaign trigger 1. In the **Delivery** section, select **Action-Based** 2. Set up the trigger criteria ## Create email template ![Braze Campaign Builder](/assets/images/retl-braze-preview-8b56ef2f1fd9bb457740995c290770a8.png) 1. Click **Edit Campaign** in the campaign builder and click **Edit Email Body** 2. Design your email using Braze's visual editor: ```html Subject: Don't miss out on {{custom_attribute.${product}}}! Hey {{${first_name}}} we saw you were interested in {{custom_attribute.${PRODUCT}}} <a href="{{custom_attribute.${product_url}}}?abandonedEmail=true"> You spent {{custom_attribute.${TIME_ENGAGED_IN_S}}} seconds checking it out.... Why not check it out once more before it sells out?!! </a> ``` 3. Add personalization: - product name using `{{custom_attribute.${product}}}` - product URL using `{{custom_attribute.${product_url}}}` - add dynamic product images if available > **Tip:*** The `abandonedEmail=true` parameter in the URL helps track when users click through from abandoned browse emails > > * You can use this parameter to: > > - track email campaign success > - remove users from the campaign audience once they've engaged. See image below for how to configure this in your Reverse ETL audience. ![Filter Winback](/assets/images/retl-winback-filtered-2091ae0f18dc04133a5875f4e77163c4.png) ## Campaign settings 1. Configure timing: - set delay after trigger e.g. 1 hour - set quiet hours e.g. 9 PM - 9 AM local time - set frequency capping e.g. max 1 email per user per 24 hours 2. Set conversion tracking: - primary conversion: "Purchase" - secondary conversion: "Add to Cart" ## Testing the campaign 1. Return to **Edit Campaign** and click **Preview** at the bottom of the page - select **Search User** under preview message as user - if the user can't be found, ensure Census has synced product view data for this user - verify custom attributes are populated correctly in the email - click **Send Test** to send the test email 2. Verify email content: - check all personalization renders correctly - verify product links work - test on multiple email clients 3. Verify winback success: - click on the link you receive in the test email with the `abandonedEmail=true` parameter - in Snowflake or Census, check the `winback_successful` column for the user has been set to true *** ![Braze Test Email](/assets/images/retl-email-0888d05d2f92d2d747eca5c1bf5a1083.png) *** ## Best practices - keep email content focused on one product - include social proof (reviews, ratings) - add sense of urgency (limited time offer) - ensure mobile responsiveness - include clear unsubscribe option ## Congratulations Thanks for your effort! You have now set up an abandoned browse campaign in Braze, sent an email, and verified winback success. Let's review what we have achieved in the [conclusion](/tutorials/abandoned-browse-ccdp/conclusion/). --- # Conclusions and next steps from the abandoned browse composable CDP accelerator > Review what you've accomplished in this abandoned browse tutorial and explore recommended next steps including advanced segmentation, A/B testing, and personalization strategies to improve conversion rates. > Source: https://docs.snowplow.io/tutorials/abandoned-browse-ccdp/conclusion/ Congratulations on completing the tutorial on setting up an abandoned browse tracking and re-engagement system using Snowplow, Snowflake, Braze, and Census. Here's a summary of what you've learned and achieved. ## Summary of achievements 1. **Tracking setup**: You have successfully implemented product view and engagement tracking using Snowplow's JavaScript tracker. This includes setting up page view, time spent, and add-to-cart tracking. 2. **Data modeling**: You have learned how to model your data to identify users who have shown interest in products but have not completed a purchase. This involved writing SQL queries to aggregate and analyze user behavior data, including tracking successful winback campaigns through URL parameters. 3. **Reverse ETL**: You have set up a reverse ETL workflow using Census to sync your modeled data to a marketing automation platform like Braze. This enables you to create targeted re-engagement campaigns. 4. **Campaign creation**: You have created and configured an abandoned browse email campaign in Braze, utilizing personalized content and tracking parameters to re-engage users, drive conversions, and measure campaign success. ## Recommended next steps - **Expand tracking**: extend your tracking setup to include additional user interactions, checkout steps, and campaign success metrics to create a more comprehensive view of user behavior and marketing effectiveness. - **Advanced segmentation**: use the data collected to create more advanced audience segments, such as users with high engagement but no purchase, or users who frequently view specific product categories. - **A/B testing**: implement A/B testing for different campaign elements, such as subject lines, send times, and content layouts, to identify what resonates best with your audience. - **Personalization**: enhance your campaigns with more personalized content, such as product recommendations based on browsing history or dynamic pricing offers. - **Integration with other tools**: consider integrating additional tools and platforms, such as CRM systems, to enrich your data with the customer's name and other data to improve campaign effectiveness. By following these steps, you can continue to refine and expand your abandoned browse tracking and re-engagement system, ultimately driving higher conversion rates and improving customer engagement. Thank you for following this tutorial. We hope it has been informative and helpful. --- # Model the data to identify unfinished purchases > Write SQL queries in Snowflake to identify users who viewed products but didn't complete purchases. Aggregate product view data and calculate engagement metrics for abandoned browse campaigns. > Source: https://docs.snowplow.io/tutorials/abandoned-browse-ccdp/data-modeling/ After implementing the tracking setup, send some events and test that they are arriving in your data warehouse. If you don't have a data warehouse, you can sign up for a [free trial of Snowflake](https://www.snowflake.com/). Once you have sent data to Snowflake, you can run the following query to verify the events are coming through. The [Snowflake Streaming loader](http://localhost:3000/docs/api-reference/loaders-storage-targets/snowflake-streaming-loader/) has a latency of several seconds so you should see results immediately. ```sql SELECT domain_userid, user_id, page_urlpath AS product_id, event_name, page_title, unstruct_event_com_snowplowanalytics_snowplow_ecommerce_snowplow_ecommerce_action_1, contexts_com_snowplowanalytics_snowplow_ecommerce_product_1, contexts_com_snowplowanalytics_snowplow_web_page_1 FROM DATABASE.ATOMIC.EVENTS WHERE DATE(load_tstamp) = CURRENT_DATE() ORDER BY load_tstamp DESC; ``` The output of this query should be similar to below: ![Atomic Events](/assets/images/retl-atomic-events-215e6baee1010e1b127281382d11c76d.png) ## Identifying most viewed but not added-to-cart products Once events are confirmed, use the following query to check that we can aggregate the data correctly. This query will be used in the next step to build an abandoned browse audience in Census. The SQL query identifies users who have viewed product pages today but have not completed a purchase. It aggregates data from Snowplow events to calculate the total time a user has engaged with product pages, checks if they have added items to their cart, and determines if they have previously received a win-back email. The second part of the query then selects the most engaged product for each user and ensures that only users who are logged in are included. Snowplow loads to most data warehouses / lakes in real-time so this query will include users that have been active within the last couple seconds! ```sql WITH productsViewedToday AS ( SELECT domain_userid, page_urlpath AS product_id, MAX(user_id) AS email, MAX(product.value:name::STRING) AS product, 5 * SUM(CASE WHEN event_name = 'page_ping' THEN 1 ELSE 0 END) AS time_engaged_in_s, MAX( CASE WHEN ecom_action.value:type = 'add_to_cart' THEN TRUE ELSE FALSE END ) AS add_to_cart, MAX( CASE WHEN page_urlquery = 'abandonedEmail=true' THEN TRUE ELSE FALSE END ) AS winback_successful, MAX(page_url) AS product_url FROM SNOWPLOW_SALES_AWS_PROD1_DB.ATOMIC.EVENTS, LATERAL FLATTEN(input => contexts_com_snowplowanalytics_snowplow_ecommerce_product_1) product, LATERAL FLATTEN(input => contexts_com_snowplowanalytics_snowplow_web_page_1) page, LATERAL FLATTEN(input => unstruct_event_com_snowplowanalytics_snowplow_ecommerce_snowplow_ecommerce_action_1) ecom_action WHERE DATE(load_tstamp) = CURRENT_DATE() AND page_urlpath LIKE '/product%' GROUP BY 1, 2 ORDER BY time_engaged_in_s DESC ) SELECT a.* FROM productsViewedToday a LEFT JOIN productsViewedToday b ON a.email = b.email AND a.time_engaged_in_s < b.time_engaged_in_s WHERE b.time_engaged_in_s IS NULL AND a.email IS NOT NULL; ``` The output of this query should be similar to below: ![Aggregated Query](/assets/images/retl-aggregated-query-1878273d3c24faa254ac5c89f49f66bd.png) ## Next step Proceed to the [reverse ETL setup](/tutorials/abandoned-browse-ccdp/reverse-etl/) to sync this data to your marketing platform. --- # Learn how to implement an abandoned browse system with a composable CDP > Build an abandoned browse tracking and re-engagement system using Snowplow, Snowflake, Census, and Braze. Learn how a composable CDP approach helps recover lost ecommerce conversions through personalized campaigns. > Source: https://docs.snowplow.io/tutorials/abandoned-browse-ccdp/introduction/ Abandoned browse is a common ecommerce problem where users show interest in products but don't complete a purchase. It is also referred to as "shopping cart abandonment," "abandoned basket," or "abandoned cart." Despite how common it is, it is still a challenge to implement a successful re-engagement campaign when using traditional marketing tools because they lack all the context needed to create a compelling personalized message. At Snowplow, we have found that a composable CDP approach is the best way to solve this problem. This tutorial has been written to show that it is straightforward to get started. *** ![Abandoned Browse](/assets/images/retl-email-0888d05d2f92d2d747eca5c1bf5a1083.png) *** This tutorial demonstrates how to implement an abandoned browse tracking and re-engagement system using [Snowplow](https://snowplow.io/), [Snowflake](https://www.snowflake.com/), and [Census](https://www.getcensus.com/). This solution helps ecommerce businesses identify and re-engage users who have shown interest in a product (e.g., viewed something for 10+ seconds) but haven't proceeded further. *** ![Composable CDP](/assets/images/retl-snowplow-composable-cdp-20c8ca5ec4dcbc667e2ad13607fbb80d.png) *** ## Prerequisites - An ecommerce website with a product catalog to track events from - **Snowplow instance**: - [Localstack](https://github.com/snowplow-incubator/snowplow-local) (recommended) - [Community edition](/docs/get-started/self-hosted/) - Snowplow CDI if you're already a customer - **Access to a data warehouse**: e.g., [Snowflake](https://www.snowflake.com) - **Reverse ETL**: [Census Reverse ETL](https://www.getcensus.com) or Snowplow Reverse ETL - **Marketing automation platform**: e.g., [Braze](https://www.braze.com) ## What you'll learn - How to implement product view tracking using Snowplow's JavaScript tracker - Setting up time-on-page tracking to measure user engagement - Creating SQL queries to identify abandoned browse behavior - Implementing Reverse ETL workflows to sync data to marketing platforms - Building automated re-engagement campaigns ## Business outcomes - Identify users showing genuine interest in products - Measure product engagement through view time and interaction - Create targeted re-engagement campaigns - Increase conversion rates through personalized messaging - Track campaign effectiveness and ROI ## Similar use cases This solution can be adapted for: 1. **Abandoned cart recovery**: extend the tracking to include cart additions and checkout steps 2. **Product recommendations**: use viewing patterns to suggest related items 3. **Category affinity analysis**: understand user preferences across product categories 4. **Price drop alerts**: notify users when viewed items go on sale 5. **Inventory alerts**: alert users when viewed out-of-stock items become available By following this tutorial, you'll establish a complete abandoned browse tracking and re-engagement system that can be expanded to support various ecommerce marketing initiatives. ## Next step Proceed to the [tracking setup](/tutorials/abandoned-browse-ccdp/tracking-setup/) step to implement the tracking setup. --- # Integrate with with Census using reverse ETL > Set up Census reverse ETL to sync abandoned browse audiences from Snowflake to Braze. Configure data sources, create audience segments, and automate marketing campaign triggers. > Source: https://docs.snowplow.io/tutorials/abandoned-browse-ccdp/reverse-etl/ Next we will set up a Census sync to build an audience using our query from Snowflake, filter the audience to focus on users who have shown interest in products but haven't purchased, and sync the data to Braze. ## Connect Census to Snowflake 1. Log into your Census account ([sign up if needed](https://www.getcensus.com/)) 2. Go to **Sources** 3. Click **New Source** 4. Select **Snowflake** 5. Enter your Snowflake connection details. Refer to the [documentation](https://docs.getcensus.com/sources/available-sources/snowflake) if you have any questions 6. Test the connection and save ## Create your abandoned browse audience Use the query from the [data modeling](/tutorials/abandoned-browse-ccdp/data-modeling/#identifying-most-viewed-but-not-added-to-cart-products) step to identify users who have shown interest in products but haven't purchased. Here it is again: ```sql WITH productsViewedToday AS ( SELECT domain_userid, page_urlpath AS product_id, MAX(user_id) AS email, MAX(product.value:name::STRING) AS product, 5 * SUM(CASE WHEN event_name = 'page_ping' THEN 1 ELSE 0 END) AS time_engaged_in_s, MAX( CASE WHEN ecom_action.value:type = 'add_to_cart' THEN TRUE ELSE FALSE END ) AS add_to_cart, MAX( CASE WHEN page_urlquery = 'abandonedEmail=true' THEN TRUE ELSE FALSE END ) AS winback_successful, MAX(page_url) AS product_url FROM SNOWPLOW_SALES_AWS_PROD1_DB.ATOMIC.EVENTS, LATERAL FLATTEN(input => contexts_com_snowplowanalytics_snowplow_ecommerce_product_1) product, LATERAL FLATTEN(input => contexts_com_snowplowanalytics_snowplow_web_page_1) page, LATERAL FLATTEN(input => unstruct_event_com_snowplowanalytics_snowplow_ecommerce_snowplow_ecommerce_action_1) ecom_action WHERE DATE(load_tstamp) = CURRENT_DATE() AND page_urlpath LIKE '/product%' GROUP BY 1, 2 ORDER BY time_engaged_in_s DESC ) SELECT a.* FROM productsViewedToday a LEFT JOIN productsViewedToday b ON a.email = b.email AND a.time_engaged_in_s < b.time_engaged_in_s WHERE b.time_engaged_in_s IS NULL AND a.email IS NOT NULL; ``` ## Set up the Census sync 1. In Census, go to **Syncs** and click **Create New Sync** 2. For the source: - select your Snowflake connection - configure the authentication method 3. Configure the dataset: - select **Datasets** from the menu - select the Snowflake source - enter the SQL query above - click **Preview** and ensure it returns data ![Census Dataset](/assets/images/retl-datasets-4d90c1ead6c352f1f50790443978093a.png) 4. Configure the audience segment: - select **Segments** from the menu - click **New Segment** - under **Segment Of**, select the dataset from the previous step - add filters to the segment based on the image below - click preview to confirm that the correct data is being returned ![Census Sync](/assets/images/retl-census-7819e962fba8de71543eb2613cbdf98b.png) 5. Configure the sync settings to send data from our audience to Braze: - select **Syncs** from the menu - click **New Sync** - under **Source**, select the audience from the previous step - under **Destination**, select Braze - configure the mapping as shown in the image below - click **Run Now** ![Census Mapping](/assets/images/retl-census-mapping-8c59b71665b4069e31c3d8ab3f5100c5.png) ## Monitoring and optimization - Monitor sync status in Census dashboard - Track campaign performance metrics - A/B test different message content and timing - Adjust view time thresholds based on results ## Best practices - Review view time thresholds (10+ seconds) to ensure genuine interest - Include product images and details in your messages - Add urgency elements (e.g., limited time offers) - Test different message sequences and timings - Monitor unsubscribe rates to avoid message fatigue By following these steps, you'll have an automated system that identifies users showing genuine interest in products and triggers relevant abandoned browse campaigns to encourage purchases. ## Next step Proceed to the [creating campaigns in Braze](/tutorials/abandoned-browse-ccdp/braze-campaign/) step to set up your marketing automation campaign. --- # Capture abandoned browse behavior with the JavaScript tracker > Implement Snowplow JavaScript tracker to capture product views, user engagement time, and add-to-cart events on your ecommerce site. Track abandoned browse behavior with the ecommerce accelerator plugin. > Source: https://docs.snowplow.io/tutorials/abandoned-browse-ccdp/tracking-setup/ To begin, we will set up Snowplow tracking on your ecommerce website. In this section we will capture what products a customer views, how long they view them for and if they add them to their cart. We assume that you already have a Snowplow pipeline. If you do not yet have a pipeline running, please return to the [Introduction](/tutorials/abandoned-browse-ccdp/introduction/) information on the different deployment options. ![website](/assets/images/retl-shopfront-6dc061d7fe6da3fc3370ad6992a6d3a1.png) First, [initialize the Snowplow JavaScript tracker](/docs/sources/web-trackers/quick-start-guide/). Below is an example of how to set up the tracker: ```javascript ; (function (p, l, o, w, i, n, g) { if (!p[i]) { p.GlobalSnowplowNamespace = p.GlobalSnowplowNamespace || []; p.GlobalSnowplowNamespace.push(i); p[i] = function () { (p[i].q = p[i].q || []).push(arguments) }; p[i].q = p[i].q || []; n = l.createElement(o); g = l.getElementsByTagName(o)[0]; n.async = 1; n.src = w; g.parentNode.insertBefore(n, g) } }(window, document, "script", "https://cdn.jsdelivr.net/npm/@snowplow/javascript-tracker@latest/dist/sp.lite.js", "snowplow")); // Replace with your collector URL var collector = "yourcollector.site.com"; window.snowplow('newTracker', 'trackerName', collector, { encodeBase64: false, appId: 'ecommerceDemo', platform: 'web', contexts: { webPage: true, performanceTiming: true } }); // Add ecommerce accelerator plugin window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-snowplow-ecommerce@latest/dist/index.umd.min.js', ['snowplowEcommerceAccelerator', 'SnowplowEcommercePlugin'] ); ``` ## Track page views and user engagement time We want to understand how long a product is viewed for in order to determine the which product each customer is paying the most attention to. Use the `enableActivityTracking` function to calculate the time the user is actively engaged on the page: - **minimumVisitLength**: the minimum time (in seconds) a user must stay on the page before tracking starts - **heartbeatDelay**: the interval (in seconds) at which activity pings are sent If you change these parameters, make sure to update the values in the [data modeling](/tutorials/abandoned-browse-ccdp/data-modeling/#identifying-most-viewed-but-not-added-to-cart-products) step. This event must be called before the `trackPageView` function. ```javascript snowplow('enableActivityTracking', { minimumVisitLength: 5, heartbeatDelay: 5 }); snowplow('trackPageView'); ``` ## Test your tracking To verify your tracking implementation, use the [Snowplow Chrome extension](https://chromewebstore.google.com/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm). This extension allows you to inspect Snowplow events in real-time as they are sent from your website. Navigate to your product pages and add items to cart while monitoring the extension to ensure events are firing correctly with all expected parameters. The extension will show you the full event payload including all entities and properties, making it easy to debug your implementation. ![Chrome Extension](/assets/images/retl-chrome-extension-5fae75e18e8df7e43a5dfb1359e19fa8.png) ## Track product views Implement the product view tracking when a product is viewed. This will create a column in the warehouse dedicated to storing information on viewed products. Please ensure you use the correct data type for each variable. Refer to the [ecommerce documentation](/docs/sources/web-trackers/tracking-events/ecommerce/) for further information. ```javascript snowplow('trackProductView', { id: "12345", name: "Baseball T", brand: "Snowplow", category: "apparel", price: 200, currency: "USD", }); ``` ## Track "add to cart" events Track "add to cart" events so users that perform this action can be filtered out or be placed in a different cohort. Below is an example implementation, showing a single product being added to the cart. ```javascript window.snowplow("trackAddToCart", { products: [ { id: "P125", name: "Baseball T", brand: "Snowplow", category: "Mens/Apparel", price: 200, currency: "USD", }, ], total_value: 200, currency: "USD", }); ``` Once add to cart events are tracked, the event should look like this in your Snowplow Chrome extension. ![Add to cart](/assets/images/retl-add-to-cart-320ff009a887e9e8af27245a1726b116.png) ## Explanation of parameters - **`id`**: the unique identifier for the product - **`name`**: the product's name - **`price`**: the product's price as a floating-point number - **`brand`**: the brand associated with the product - **`currency`**: the currency code (e.g., USD, EUR) - **`category`**: the product's category or taxonomy - **`total_value`**: the updated total cart value after adding the product --- # Examine your Android screen view events with Snowplow Micro > Verify and analyze Android screen view events using Snowplow Micro. View session data, screen engagement metrics, and automatically generated screen end events with foreground and background time tracking. > Source: https://docs.snowplow.io/tutorials/android-event-tracking/examining-events/ After implementing tracking, you'll want to verify and analyze the events being sent. You can use [Snowplow Micro](/docs/testing/snowplow-micro/) for testing and development — either [through Console](/docs/testing/snowplow-micro/console/) or [locally](/docs/testing/snowplow-micro/local/). Here's how to examine the events: 1. Set up a Snowplow Micro pipeline and configure your tracker to send events to its endpoint, by updating the string in the `Analytics` object. 2. Run the app and move between screens. 3. Access the [Micro dashboard](/docs/testing/snowplow-micro/ui/) to view incoming events. 4. Each event will contain several context entities by default, providing data about the: - Session - App - Device - Current screen view - Whether the app was visible when the event was tracked (foreground or background) 5. For screen view events, you'll see: - Screen name - Previous screen name - Screen view ID - Previous screen view ID 6. The tracker also automatically generates screen end events, which include a screen summary entity with engagement data: - Time spent on the screen - Foreground vs background time For more detailed information about all available Android events and their structures, refer to the comprehensive documentation provided by Snowplow Analytics. This concludes the basic tutorial for getting started with Snowplow Analytics' Android tracker. In future tutorials, we'll explore how to track custom events and utilize more advanced features of the tracker. --- # Learn how to track events on Android > Set up the Snowplow Android tracker in your Jetpack Compose app using Kotlin. Install dependencies, initialize the tracker, and begin capturing mobile analytics events. > Source: https://docs.snowplow.io/tutorials/android-event-tracking/installation/ ## Requirements - Snowplow pipeline running - Android Studio installed ## Installation and Setup To begin tracking events using Snowplow Analytics' Android tracker, you'll need to set up your project and initialize the tracker. We’ll be following the modern JetPack Compose framework using Kotlin as our langauge. 1. Clone the official Android JetSurvey example app here, or import it from Android Studio as described [here](https://developer.android.com/develop/ui/compose/setup#sample). ```bash git clone https://github.com/android/compose-samples/tree/main cd jetsurvey ``` 2. Add the Snowplow Android tracker dependency to your project-level `libs.versions.toml` file (Replace `x.x.x` with the latest version of the tracker.) ```text snowplow = "x.x.x" ``` 3. Add the dependency to your `build.gradle.kts` file ```kotlin implementation(libs.snowplow.analytics) ``` 4. Create a new package in your project for analytics-related code, and create a `Analytics.kt` file inside it. 5. Now we’ll create the `Analytics` object in that `Analytics.kt` file by adding the following code. Replace `"YOUR_NAMESPACE"` with a unique identifier for your tracker and `"YOUR_COLLECTOR_URL"` with your Snowplow collector endpoint. ```kotlin package com.example.compose.jetsurvey.analytics import android.content.Context object Analytics { fun start(context: Context) { Snowplow.createTracker( context, namespace: "YOUR_NAMESPACE", endpoint: "YOUR_COLLECTOR_URL" ) } } ``` 6. In the entrypoint for your app, you’ll now want to initialize the tracker. In this case the first composable function called is the `JetsurveyNavHost()`, inside the `Navigation.kt` file. Add this line to the start of the function, before the `NavHost` instantiation: --- # Track screen views using the Android tracker > Implement automatic screen view tracking in Jetpack Compose apps using navigation listeners. Track screen engagement, time spent, and foreground/background status with the Snowplow Android tracker. > Source: https://docs.snowplow.io/tutorials/android-event-tracking/screen-views/ In a Jetpack Compose app, screen views aren't tracked automatically. Screen view tracking can be manually added to every screen, but it’s more efficient to add a navigation listener. Here's how to track screen views using the navigation component: 1. Within the `Analytics` object, create a listener function: ```kotlin fun addScreenViewNavListener(navController: NavController){ navController.addOnDestinationChangedListener { _, destination, -> Snowplow.defaultTracker?.track(ScreenView(destination.route ?: "unknown")) ``` 2. Call this function by passing in your navigation, in your main composable: ```kotlin @Composable fun JetSurveyNavHost( navController = rememberNavController() ) { Analytics.start(LocalContext.current) Analytics.addScreenViewNavListener(navController) ... } ``` This setup will track a screen view event every time the navigation destination changes. The screen name will be set to the route of the destination. Additionally, the Snowplow Android tracker uses the screen view events to automatically track screen engagement, including how long a user spends on each screen and whether the app is in the foreground or background. --- # Optional Attribution dbt package configuration variables > Review optional configuration variables for the Attribution dbt package. Customize conversion filters, path transforms, conversion windows, and reporting options to fit your marketing attribution needs. > Source: https://docs.snowplow.io/tutorials/attribution/before-you-start/ Marketing Attribution Analysis can be done in many ways depending on your business and needs. Our aim has always been to make it flexible for users to make changes from selecting a different data source to reshaping how the customer journey to conversion will be adjusted for the analysis. We suggest taking the time to read the documentation before running the model and also readjust the configurations later on to fit your needs. In the subsequent setup steps, we will only include the absolute minimum that's needed to be able to run it for the first time, but there are many other variables that should also be considered to fine-tune for the best outcome as the default values might not be the best for your specific analysis. Below we list a few optional variables that would dictate the modeled outcome to give you some examples: - `snowplow__conversion_hosts`: url\_hosts to process, if left empty it will include all - `snowplow__conversion_clause`: a user defined sql script to filter on specific conversions if needed. Defaulted to 'cv\_value > 0' - `snowplow__path_transforms`: a dictionary of path transforms and their arguments (see [Transform Paths](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-attribution-data-model/#transform-paths) section which includes other relevant variable changes including changing the `snowplow_path_lookback_days` and `snowplow_path_lookback_steps` variables.) - `snowplow__conversion_window_days`: by default the package processes the last complete n number of days (calculated from the last processed path in the path source) to dynamically define the conversion window for each incremental run - `snowplow__enable_attribution_overview`. By default, the package creates a model called snowplow\_attribution\_overview which can be used directly for BI reporting. If you are using the Attribution Modeling Data Model Pack, this is not required to be enabled as the Data Model Pack will take care of querying this data for you > **Info:** For an in-depth guide, follow our [package documentation](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-attribution-data-model/). For a full list of variables you can use and their definitions, check out the [Configurations](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/attribution/) page. --- # Enable conversions in the Unified Digital dbt package > Configure the conversions module in Snowplow Unified dbt package for attribution modeling. Define conversion events, enable the conversions table, and set up referral field fallbacks for marketing attribution. > Source: https://docs.snowplow.io/tutorials/attribution/enabling-unified-conversions/ Snowplow’s Attribution package requires the optional conversions module to be configured within the Unified package, because the `snowplow_unified_conversions` table is used as a source by default. This is an additional incremental table that will be updated for each run of the Unified package. Enabling Conversions will also add 6 extra columns into the sessions table. See more details about that on [this page](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/conversions/). If you already have conversion events and the conversion module is enabled, you can skip this step. > **Info:** Please note: All the actions within this step are to be completed within your Snowplow **Unified Package** dbt project. ### Configuring Conversions in Unified 1. Add `snowplow__conversion_events: []` variable to dbt\_project.yml file inside your project where you the run Snowplow Unified package. 2. Add a conversion event into the array as shown below to the dbt\_project.yml file in the vars section. You can see further details on configuring the conversion event on [this page of the docs](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/conversions/) > **Info:** Please Note: The total value of conversions must be greater than 0 for the Attribution Package as it is required to calculate the proportion of value brought in by a campaign or channel. ```yml vars: snowplow_unified: snowplow__conversion_events: [ { "name": "purchase", # Required, name of the conversion, string (must be valid SQL column name) "condition": " event_name == 'purchase' ", # Required, valid SQL condition that returns true or false "value": " 1 ", # Optional but recommended, can be field name or SQL "default_value": "0", # Optional, can be field name or SQL "list_events": "true" # Optional, boolean } ] ``` 1. Enable the Conversions Module to create the `snowplow_unified_conversions` table by setting the following variable in your Unified dbt\_project.yml file. ```yml vars: snowplow_unified: snowplow__enable_conversions: true ``` > **Warning:** If you have not previously run the Unified package with the conversions module enabled by default it will cause the Snowplow package to backfill from the `snowplow__start_date` (with `snowplow__backfill_limit_days` increments for each run) until the `snowplow_unified_conversions` table becomes up-to-date with the rest of the incremental tables. During backfilling any existing derived incremental tables (e.g. sessions table) are not going to get updated. > > Alternatively, you can adjust the run configurations the first time you run the package with the enabled conversions module to only backfill the `snowplow_unified_conversions` table from a certain more recent date only. [Read more details here](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/backfilling/). ### (Optional) Enable referral fields if marketing fields are null It can be useful to use the referral fields source and medium if the marketing fields are null, by default the marketing fields are set from UTM parameters. Having referral fields as a fallback can be enabled with a variable from within the Unified Package. 1. Set the following variable to true within your `dbt_project.yml` file ```yml vars: snowplow_unified: snowplow__use_refr_if_mkt_null: true ``` ### Run the Unified Package 1. Run the Snowplow Unified package to create the `snowplow_unified_conversions` table and include your conversion events in your `snowplow_unified_sessions` table. By default, your `snowplow_unified_conversions` table will be in the same schema/dataset as your other unified model derived tables. ```bash dbt run --select snowplow_unified ``` --- # Learn how to set up the Attribution dbt package > Set up Snowplow's Attribution dbt package to analyze marketing attribution with multi-touch models. Prerequisites include Unified package, campaign enrichment, and conversion tracking. > Source: https://docs.snowplow.io/tutorials/attribution/intro/ This tutorial walks you through how to set up Snowplow’s Attribution dbt package. It assumes you have already followed our Snowplow Unified Package [tutorial](/tutorials/unified-digital/intro/), or previously configured it, as the Attribution package requires the model outputs (e.g. unified views and conversions tables). ## Prerequisites - [dbt](https://github.com/dbt-labs/dbt) installed (and dbt profile configured for Data Warehouse connection) - have the [Campaign attribution enrichment](/docs/pipeline/enrichments/available-enrichments/campaign-attribution-enrichment/) enabled to generate the marketing campaign fields such as mkt\_medium, mkt\_source, mkt\_term, mkt\_content, and mkt\_campaign which would be needed for channel and campaign classification to ultimately calculate the appropriate attribution - have the [Referer parser enrichment](/docs/pipeline/enrichments/available-enrichments/referrer-parser-enrichment/) enabled to extract attribution data from referer URLs - Snowplow Unified Digital [dbt Package](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/) configured and run (v0.4.0 or above) it will generate the following source tables for you: - `snowplow_unified_views` table as a path (touch points) source - `snowplow_unified_conversions` table as a conversions source including the revenue (generated by the optional conversions module) - `snowplow_unified_user_mapping` table as the user mapping source (needed for user stitching) - (Optional) Marketing Spend Source table. [See more information here](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-attribution-data-model/#3-channel-spend-information-optional-but-recommended). It will contain your marketing spend data by channel and or campaign with a timestamp field which denotes the period. This is needed for the ROAS calculation within the `snowplow_attribution_overview` model. --- # Optional: Integrate marketing spend for attribution ROAS > Integrate marketing spend data with the Attribution package to calculate ROAS (return on ad spend). Configure spend sources by channel and campaign for comprehensive attribution analysis. > Source: https://docs.snowplow.io/tutorials/attribution/optional-adding-marketing-spend-source/ If you have marketing spend source data in your data warehouse then the Snowplow Attribution model package provides a configuration option that allows you to integrate your spend data with the `attribution_overview` view. 1. Create a view (or regularly updated table) based on your marketing spend data that has the columns below. Your channel and spend data will be summed separately in the `attribution_overview`, so you would not need to pre-aggregate this, it is fine to have channel twice for the same period even, but make sure that the data does not include duplicates as it might lead to unexpected outcomes. | **Column Name** | **Data Type** | **Required** | | --------------- | ------------- | ------------------------------ | | spend\_tstamp | TIMESTAMP | Required | | channel | STRING | Optional (Channel or Campaign) | | campaign | STRING | Optional (Channel or Campaign) | | spend | NUMERIC | Required | 2. Update your `dbt_project.yml` file to configure the variable to the location of your spend table. ```yaml vars: snowplow_attribution: snowplow__spend_source: 'your_schema.marketing_spend_source' ``` 3. Run the dbt package ```yaml dbt run ``` 4. Congratulations, your `attribution_overview` view within your `_derived` schema should now reference your marketing spend source to provide a ROAS calculation! > **Info:** Please Note: By default in the `attribution_overview` view created by the package spend is allocated associated with a channel/campaign for the 90 days prior to the conversion. You can configure this by overriding the macro within dbt. --- # Optional: Configure custom channel group classifications > Customize channel grouping for attribution analysis by overriding dbt macros in the Unified or Attribution package. Define custom channel classifications to match your marketing taxonomy. > Source: https://docs.snowplow.io/tutorials/attribution/optional-configuring-channel-group-classification/ By default, the Attribution Package uses the channel grouping set within the Unified Package. It is recommended to set the channel grouping within the Unified Package, so you have matching channel groups across both packages. Optionally, you can choose to have differing channel groupings within the Attribution Package by customizing the macro only within the Attribution Package. > **Info:** You only need to follow **one** of the configuration guides below. ### Configuring within the Unified Package 1. Create a new file called `channel_group_query.sql` in your root dbt projects macro folder ![](/assets/images/Screenshot00-6644eabdb84ee5aa899f2ff1d594a1de.png) 2. Open [this link](https://github.com/snowplow/dbt-snowplow-unified/blob/main/macros/field_definitions/channel_group_query.sql) to find the GitHub link to the channel\_group\_query.sql macro within Unified Package. Within the file there are three macros, find the one relevant to you: 1. bigquery\_\_channel\_group\_query - BigQuery 2. redshift\_\_channel\_group\_query - Redshift 3. default\_\_channel\_group\_query - Any other Data Warehouse or Query Engine 3. Copy the relevant macro into the file you created within step 1. Ensure you copy the full macro. ```sql {% macro default__channel_group_query() %} ... {% endmacro %} ``` 4. Customize the `CASE` statement to capture your requirements. For example, you may want to set `Unassigned` to be called `Direct`. ```sql CASE ... else 'Direct' END ``` 5. Next time you run the Unified Package the macro you created will be prioritized over the built-in package macro. ### Configuring within the Attribution Package 1. Create a new file called `channel_classification.sql` in your root dbt projects macro folder ![](/assets/images/Screenshot31-f4804a0bf764df025ef3e2e429b37a14.png) 2. Paste the following into the file you created within step 1. ```sql {% macro default__channel_classification() %} default_channel_group {% endmacro %} ``` 3. Customize the macro to capture your requirements. See the example below on how you may want to set traffic to the channel `Unassigned` to be called `Direct`. The `default_channel_group` field is from the Unified Package. ```sql {% macro default__channel_classification() %} CASE WHEN default_channel_group = 'Unassigned' THEN 'Direct' ELSE default_channel_group END {% endmacro %} ``` 4. Next time you run the Attribution Package the macro you created will be prioritized over the built-in package macro. --- # Set up the Attribution dbt package locally > Install and configure the Snowplow Attribution dbt package locally. Set up conversion sources, attribution models, and generate channel and campaign attribution tables for marketing analysis. > Source: https://docs.snowplow.io/tutorials/attribution/setting-up-locally/ 1. Create a new dbt project in a new directory > **Info:** Please Note: Skip this step if you want to add the Attribution package to the same project as Unified. ```bash dbt init ``` 2. Create a `packages.yml` file in the same directory as the newly created `dbt_project.yml` > **Info:** Please Note: Skip this step if you want to add the Attribution package to the same project as Unified. 3. Add the following code to your `packages.yml` file ```yml packages: - package: snowplow/snowplow_attribution version: 0.4.0 ``` 4. Run the following command to install the package ```bash dbt deps ``` 5. Now open your `dbt_project.yml` where we will configure the package. A full list of configuration options can be seen on [this page](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/attribution/). 6. Add the following to your `dbt_project.yml` file and configure it accordingly for your use-case. ```yml vars: snowplow_attribution: # Attribution Package configuration snowplow__attribution_start_date: 'YYYY-MM-DD' # Required, the start date for conversions to be processed from. snowplow__attribution_list: ['first_touch', 'last_touch', 'linear', 'position_based'] # Optional, by default all are calculated # Configure input sources snowplow__conversion_path_source: 'my_schema_derived.snowplow_unified_views' # Location of your snowplow_unified_views table snowplow__conversions_source: 'my_schema_derived.snowplow_unified_conversions' # Location of your snowplow_unified_conversions table ``` > **Info:** Please Note: If you are running Unified in the same project as Attribution and intend to run them in the same run, set the conversion\_path\_source and conversions\_source to the following so that dbt can run the models in the correct order: ```yml vars: snowplow_attribution: snowplow__conversion_path_source: "{{ ref('snowplow_unified_views') }}" snowplow__conversions_source: "{{ ref('snowplow_unified_conversions') }}" snowplow__user_mapping_source: "{{ ref('snowplow_unified_user_mapping') }}" ``` 7. Run the dbt package ```yml dbt run --select snowplow_attribution ``` > **Info:** Please Note: If you are running Unified in the same project as Attribution and intend to run them in the same run, you can execute dbt run instead, but better handle it through a custom selectors.yml file 8. After running the package you should see the following tables in your `my_schema_derived` schema: - `SNOWPLOW_ATTRIBUTION_PATHS_TO_CONVERSION` - Row per conversion that shows the channel and campaign path, both the raw path and the transformed path. - `SNOWPLOW_ATTRIBUTION_CHANNEL_ATTRIBUTIONS` - Row for each conversion and each channel step to show how much revenue is attributed to that channel step for each attribution methodology. - `SNOWPLOW_ATTRIBUTION_CAMPAIGN_ATTRIBUTIONS` - Row for each conversion and each campaign step to show how much revenue is attributed to that campaign step for each attribution methodology. 1. Additionally, you will have a `attribution_overview` view in your `my_schema_derived` that is useful to ingest directly into a BI tool to provide a summary of attributed conversions and revenue by attribution method and touchpoint (by default it shows the last 30 days, however this can be configured). See the optional guide below on how to integrate marketing spend data to calculate ROAS within the attribution overview view. --- # Run the Attribution dbt package using Snowplow Console > Configure the Attribution data model in Snowplow Console with a fully managed service. Set attribution start dates, path lookback windows, and schedule automated model runs. > Source: https://docs.snowplow.io/tutorials/attribution/setting-up-via-console/ Once you are happy with how your local configuration is working, you will want to schedule it to run regularly. Snowplow provides a fully managed service for running data models in securely. To get started, follow these steps: 1. Commit the local configuration you built previously to a Git repository. 2. Create a Git connection to your repository. In Console, navigate to **Destinations > Connections > Set up connection > Git connection** 3. Create a warehouse connection. In Console, navigate to **Destinations > Connections > Set up connection > Data modeling connection** 4. Create a model project. In Console, select **Modeling > Model projects > Create model project** 5. Add a run configuration with a schedule to your model project. Click **Add run configuration** and **Add schedule**. Pick your preferred schedule, e.g. daily at 00:00 AM. For a full reference on running data models via Snowplow Console, see [Run data models from Console](/docs/modeling-your-data/running-data-models-via-console/). --- # Add event specifications to the todo tracking plan > Create event specifications with implementation instructions, entity cardinality rules, and trigger details. Define precise expectations for custom events and make them available through Snowtype code generation. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/add-event-specs/ Now you can create the Event Specification representing the todo addition. Click `Create new event specification`, fill the event information modal with the following inputs and click `Save and continue`. ![](/assets/images/start-add-todo-1ed91f0215ecfcc6cabad85de9f6a9c1.png) The next step is to add the event Data Structure `todo_action` you created previously to represent the todo addition interaction. Select the Event Specification and on the `Event data structure` panel click `Add existing data structure` and select the `todo_action` custom Data Structure. ![](/assets/images/todo-action-77d7ebfbdbadf4fcf7f1b68a1f7f33a2.png) To make sure your intention on this Event Specification is clear and also instructions for implementation are as precise as possible, you should go ahead and add instructions for this event. Click the `Add instructions` button and click `Edit` on the `value` attribute. Now you can fill the instructions with the following inputs, indicating exactly what is expected for this event. ![](/assets/images/implementation-instructions-667f4e45a2a336cfa426f1cb360675bb.png) > **Tip:** Information for implementation instructions, cardinality rules and trigger details will be available for the implementation engineers directly through Snowtype [instructions feature](/docs/event-studio/implement-tracking/snowtype/using-the-cli/#generating-event-specification-instructions). Now on the Entity data structures section, click `Add existing data structure`, find and select the todo entity created earlier. On the next modal step you can define the expected cardinality, of this entity on the event specification. For the todo entity, you want to have exactly 1 instance. ![](/assets/images/entity-cardinality-59314c191b771cbe52ac8f39bec15bd3.png) Finally you should add a trigger to represent exactly the conditions when this event should be fired. For this case, you expect the event to be fired when a user hits the Enter key after adding a title for the todo. ![](/assets/images/trigger-70bafef9a5e2cc969a93262e2fda6f91.png) **A similar process can be taken to create both the completion and removal Event Specifications.** --- # Create a base tracking plan from the web template in Console > How to create a base tracking plan using the Base Web template in Snowplow Console and connect it to a source application. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/create-base-dp/ After creating a Source Application, the recommended way to keep track of what we consider base events for a tracking setup is through the Base tracking plan templates. For this application you can use the [Base Web](/docs/event-studio/tracking-plans/templates/#base-web). To create a Base Web tracking plan for your application, navigate to the Tracking Plans section and after clicking the `Templates` button, select the Base Web template. ![](/assets/images/create-dp-1b8065bc8869fba27caad81fb0108b31.png) By default, a Base tracking plan is not connected to a Source Application and will show all the Base Event application IDs, so for this you need to edit the tracking plan and set the Source Application to `Todo Web Application`. ![](/assets/images/create-base-dp-inputs-b5e696d2fd1ff18edf58f0457cce57d6.png) The Base Web (_or Mobile_) tracking plan will monitor and count base events as they are sent for the selected Source Application app IDs, as we will see later on. There is no need for additional implementation. --- # Create a custom tracking plan for todo behavior > How to create a custom tracking plan in Snowplow Console with event specifications for tracking todo interactions. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/create-custom-dp/ As in every application, there are interactions that are important to the business and are measured in a custom way. For this application the interactions you need to measure are: - Adding a todo - Completing a todo - Removing a todo We are going to show how you can create the tracking plan for the goal interactions and the Event Specification for a todo addition. ### Create the custom tracking plan To start off, navigate to the Tracking Plans section and click through the `Create` button and `Start` to create a new custom tracking plan. On the basic information screen you can add the inputs as shown below and also select the `Todo Web Application` as the connected Source Application. You can name the tracking plan something similar to `Todo - Goals`. ![](/assets/images/create-todo-goals-84c7ef88e9a43363010c42af0bcbf963.png) Finally click `Create and continue`. > **Tip:** When connecting a Source Application to a tracking plan there are a few benefits you get automatically: > > - All Event Specifications inherit the expected app IDs from the Source Application. _That way you can be sure that an Event Specification is fired in all environments it needs to._ > - All Event Specifications automatically reference the Application Context Data Structures that are expected to be available with each event. _This prevents duplicate information and also understanding which context is going to be available in addition to your Event Specification entities._ --- # Create a source application for tracking plans > How to create a source application in Snowplow Console, including setting application IDs and application context entities. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/create-source-application/ The basis for every tracking setup is the Source Application. It represents the tracking estate, in a specific platform (_web, Android, iOS, etc._), for an application which in our case would be named _'Todo Web Application'_. Navigate to the Source Applications section and click the `Add a source application` button. ![](/assets/images/add-sap-992a308efd2fc2718eb7bed71f09d74f.png) The inputs on the creation screen are: - **Name**: a name used for this application which is fairly understood across the organization. - **Description**: a few words on what this Source Application represents and/or any notes. - **Primary owner**: owner email address. - **Application IDs**: the [Application IDs](https://docs.snowplow.io/docs/understanding-tracking-design/organize-data-sources-with-source-applications/#application-ids) expected to be used for this application. _These will automatically flow down to the tracking plans and Event specifications you define relating to this Source Application._ - **Application entities**: here is where you will set the [Application Contexts](https://docs.snowplow.io/docs/understanding-tracking-design/organize-data-sources-with-source-applications/#application-context) you will implement and expect to be available with every event hit coming from this application. An example of inputs can be the following which you can adjust to your case: ![](/assets/images/filled-sap-721f9176201822acaba16fe46bb84fde.png) --- # Generate tracking code from tracking plans with Snowtype > Generate type-safe tracking code and implementation instructions with Snowtype. Use generated APIs to track event specifications and verify events with Snowplow Inspector and Console volume counts. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/generate-tracking-code/ On your terminal now run `npx @snowplow/snowtype@latest generate --instructions`. If all is as expected, after a few seconds, you will have the following files generated: ![](/assets/images/fs-93f506090adab7da1c8deb07e7f41fb8.png) The `src/tracking/instructions.md` file includes detailed instructions and information about the Event Specifications to be implemented while `src/tracking/snowplow.ts` contains all the required code to be used to track the Event Specifications. > **Info:** In some editors like [Visual Studio Code](https://code.visualstudio.com/), the APIs that are available in a project are shown to the developer as they type. For Snowtype exposed APIs to track Event Specifications or event Data Structures start with `track` and then the name of the Data Structure or Event Specification. For Event Specification APIs, there is also the suffix of `Spec` or `spec` depending on the language. E.g. for our custom tracking plan, we have available the `trackAddTodoSpec`, `trackCompleteTodoSpec` and `trackRemoveTodoSpec` methods. ## Tracking interactions To track interactions such as adding a new todo, you can add the following piece of code at `src/pagesTodo/components/Header.tsx` ```diff import { useState } from "react"; import { v4 } from "uuid"; import { Todo } from "../../../types"; +import { createTodo, trackAddTodoSpec } from "../../../tracking/snowplow"; // Rest of the code... function addItem(e: React.KeyboardEvent<HTMLInputElement>) { if (e.key === "Enter" && value) { addTodo((preItems) => { return [ { id: v4(), value, completed: false, }, ...preItems, ]; }); + trackAddTodoSpec({ + action: "add", + context: [createTodo({ title: value })], + }); setValue(""); } } ``` Finally, you can go to the application, add a new todo and observe on the Snowplow Inspector extension as the addition Event Specification is being sent. The event contexts should include the `todo`, `event_specification` and any other extra context you might add. ![](/assets/images/inspector-spec-db0b4a3b905faf7badb73b525180fe05.png) After a while the Event Specification volume counts for each event will be available at the tracking plans screen. ![](/assets/images/custom-dp-volumes-0a0736408362506989e7342543aff991.png) You can checkout the completed code at the ['implemented' branch](https://github.com/snowplow-incubator/data-products-basic-tracking-recipe/tree/implemented). --- # Track page views, activity, and link clicks with the Browser tracker > Install and configure the Snowplow Browser tracker to capture page views, activity tracking with page pings, and automatic link click tracking using the link click plugin. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/setup-basic-tracking/ For this example application, we will use the [Browser tracker](/docs/sources/web-trackers/quick-start-guide/) which is distributed through npm. Switch to the project root directory and then install it by running `npm install @snowplow/browser-tracker`. Next, add the following piece of code at `src/main.tsx`. ```diff import Todo from "./pages/Todo"; import "./styles.css"; +import { + enableActivityTracking, + newTracker, + trackPageView, +} from "@snowplow/browser-tracker"; +newTracker("t1", "{{COLLECTOR_URL}}", { + appId: "todo-web-dev", +}); +enableActivityTracking({ + minimumVisitLength: 30, + heartbeatDelay: 10, +}); +trackPageView(); createRoot(document.getElementById("root")!).render( ``` _For this showcase, placing the initialization code at the main file is enough._ What this code does is: 1. Initializes the tracker with the app ID representing the Todo web application in the development environment. 2. Enables activity tracking which will send periodic page pings. 3. Sends a page view when the main application component is first rendered. You can validate this step being implemented properly using the [Snowplow Inspector](/docs/testing/snowplow-inspector/) browser extension observing Page view and Page ping events. ![](/assets/images/inspector-53e6adf2645e1fa0593a30ea2eff80c1.png) ## Add link click tracking As a next step you will implement link click tracking for the main page link pointing to the TodoMVC website. To track this and other links on your pages, you can install the [Link click](/docs/sources/web-trackers/tracking-events/link-click/) tracking plugin. The plugin provides automatic link click tracking for all links on your page. To enable it in the application, switch to the project root directory and then install it by running `npm install @snowplow/browser-plugin-link-click-tracking`. Next, add the following piece of code at `src/main.tsx`. ```diff import { enableActivityTracking, newTracker, trackPageView, } from "@snowplow/browser-tracker"; +import { + LinkClickTrackingPlugin, +} from "@snowplow/browser-plugin-link-click-tracking"; newTracker("t1", "{{COLLECTOR_URL}}", { appId: "todo-web-dev", + plugins: [LinkClickTrackingPlugin()], }); ``` And the following on `src/pages/index.tsx`. ```diff import Info from "./components/Info"; +import { enableLinkClickTracking } from "@snowplow/browser-plugin-link-click-tracking"; +import { useEffect } from "react"; const Todo = function () { + useEffect(() => { + enableLinkClickTracking(); + }, []) ``` With this code, all links that are initially rendered on the page will be tracked automatically. You can verify this using the Snowplow inspector browser extension observing link click events after clicking the TodoMVC link. ![](/assets/images/inspector-link-32d06e32417a66782f3b57096a25e659.png) --- # Create custom data structures for event specifications > Define custom data structures for event and entity schemas specific to your application. Create todo_action and todo data structures to represent user interactions with specific properties and validations. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/setup-custom-ds/ Before you create the custom tracking plan for these interactions, you need to create a couple of Data Structures, `todo` and `todo_action`, fitting the use case of the Todo web application. #### Todo action Data Structure ```json { "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#", "type": "object", "self": { "vendor": "com.your.organization", "name": "todo_action", "format": "jsonschema", "version": "1-0-0" }, "description": "Event Data Structure representing an action taken for a todo", "properties": { "action": { "type": "string", "enum": [ "add", "remove", "complete" ], "description": "The action taken for a specific todo item" } }, "required": [ "action" ], "additionalProperties": false } ``` #### Todo Data Structure ```json { "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#", "type": "object", "self": { "vendor": "com.your.organization", "name": "todo", "format": "jsonschema", "version": "1-0-0" }, "description": "Entity Data Structure representing a todo", "properties": { "title": { "type": "string", "description": "The title of the Todo" } }, "required": [ "title" ], "additionalProperties": false } ``` _Note: You might need to publish those to the production environment depending on the pipeline you are using._ --- # Install Snowtype to enable code generation from tracking plans > Install and initialize Snowtype CLI to generate tracking APIs from event specifications. Automatically attach event_specification entities and enable event count features without additional implementation work. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/snowtype/ After defining your Event Specifications, the next step is implementing those events on your app. To do that you are going to use Snowtype to generate the required APIs for tracking the Event Specifications. > **Info:** For some Event Specification features, such as event counts, it is required that events are tracked together with the `event_specification` entity. Snowtype automatically attaches the entity without any additional work required from the implementation engineers. ## Installing Snowtype After having set up a [Console API key](/docs/event-studio/implement-tracking/snowtype/using-the-cli/#authenticating-with-the-console) you can install Snowtype on this project by switching to the project root directory and running `npm install @snowplow/snowtype@latest --save-dev`. Since this is a project without previous Snowtype installation, we need to go through the [init flow](/docs/event-studio/implement-tracking/snowtype/using-the-cli/#initializing-snowtype-for-your-project). To do that, you can go to the tracking plan page and click on the `Implement tracking` button. There you can copy the second code command which relates to initializing a new Snowtype project. ![](/assets/images/sntp-init-b5daa7c6f5f4fd1508a487a3ef80b670.png) The inputs should look like the following: ![](/assets/images/sntp-init-inputs-0b17b17c6e55a3c45786ef4825a425f1.png) Next you add this tracking plan to the Snowtype project by copying the first code command. ![](/assets/images/sntp-patch-859dc88996d2b0cb3262be80ac075473.png) Now your Snowtype configuration file should include the tracking plan in the `dataProductIds` array. --- # Learn how to track web events with base tracking plans > Learn how to implement web tracking using Event Studio, source applications, and Snowtype code generation for a React TodoMVC application. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/start/ This guide will help you understand some of the basic capabilities of tracking plans and how they can be used in practice for most tracking implementation setups. ## Prerequisites - A [collector](/docs/pipeline/collector/) endpoint. - A [Console API key](/docs/event-studio/implement-tracking/snowtype/using-the-cli/#authenticating-with-the-console) for generating code using Snowtype. ## What you'll be doing This recipe will showcase how a basic tracking setup can be implemented using Event Studio capabilities such as Source Applications, tracking plans and their related Snowtype features. This basic tracking setup will include: - Page views and page pings - Link clicks - Custom events For demonstration purposes we are going to be using a [TodoMVC](https://todomvc.com/) clone built with [React.js](https://react.dev/). If you want to follow along you can: 1. Clone the repository using`git clone git@github.com:snowplow-incubator/data-products-basic-tracking-recipe.git`. 2. Change into the project directory and install the dependencies using `npm install`. 3. Run the development server using `npm run dev`. 4. Open <http://localhost:5173/> to see the app. --- # Verify that your application is sending the expected tracking plan events > Verify that your base tracking plan is receiving events from the correct environment by checking event volumes in the Event Studio Console. > Source: https://docs.snowplow.io/tutorials/data-products-base-tracking/verify-dp-events/ As mentioned previously, if you check the `Todo - Base Web` tracking plan you created, you will be able to see the events coming in from the correct environment based on the app ID. _Note: You might need to wait for a bit, up to a maximum of 2 hours, until the events are visible._ ![](/assets/images/base-counts-febf7ebeac5fdb5d0428b0cb307a35af.png) --- # Automate data structure workflows with GitHub Actions > Set up GitHub Actions workflows to automatically validate pull requests and publish data structures to development and production environments. > Source: https://docs.snowplow.io/tutorials/data-structures-in-git/automate-with-github-actions/ We'll not go into the details of creating github repositories and initial commits here, the [github docs](https://docs.github.com/) do an excellent job of that already. The next few steps will assume a working github repository containing the directory and data structure we created in the previous section. It will have two branches named `main` and `develop` which should be in sync. ## Publish to develop workflow We would like pushes to our `develop` branch to be automatically published to our [development](/docs/testing/) environment. Github workflows can be [triggered](https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/triggering-a-workflow) by all kinds of repository events. The one we are interested in here: ```yml on: push: branches: [develop] ``` With our trigger point worked out we need to complete a series of steps: 1. Configure snowplow-cli via environment variables provided as [github action secrets](https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions) 2. Checkout our repo 3. Install snowplow-cli. We'll use our [setup-snowplow-cli](https://github.com/snowplow/setup-snowplow-cli) github action here. Behind the scenes it is downloading the [latest](https://github.com/snowplow/snowplow-cli/releases/latest) snowplow-cli release and making it available via the workflow job's `path`. 4. Run the `snowplow-cli ds publish dev` command as we did earlier The full action: ```yml on: push: branches: [develop] jobs: publish: runs-on: ubuntu-latest env: SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }} SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }} SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }} steps: - uses: actions/checkout@v4 - uses: snowplow/setup-snowplow-cli@v1 - run: snowplow-cli ds publish dev --managed-from $GITHUB_REPOSITORY ``` > **Tip:** The value of the `--managed-from` flag will be displayed inside the 'This data structure is locked' banner we saw earlier in the UI. It is designed to help people track down the source of truth for this data structure. ## Publish to production workflow In the same way we want our `develop` branch to deploy to our `develop` environment we want our `main` branch to deploy to our `production` environment. As we saw earlier publishing to production is very similar to publishing to development. The only new thing we need here is a different workflow trigger. ```yml on: push: branches: [main] jobs: publish: runs-on: ubuntu-latest env: SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }} SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }} SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }} steps: - uses: actions/checkout@v4 - uses: snowplow/setup-snowplow-cli@v1 - run: snowplow-cli ds publish prod --managed-from $GITHUB_REPOSITORY ``` ## Validate on pull request workflow A core component of version control based workflows is the [pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests). For our repository we would like to ensure as best we can that any data structure changes are valid and problem free before they get merged into develop. Lucky for us there is a github workflow event for that. By combining the `snowplow-cli ds validate` command and the github workflow pull request event we arrive at this: ```yml on: pull_request: branches: [develop, main] jobs: validate: runs-on: ubuntu-latest env: SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }} SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }} SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }} steps: - uses: actions/checkout@v4 - uses: snowplow/setup-snowplow-cli@v1 - run: snowplow-cli ds validate --gh-annotate ``` > **Tip:** The `--gh-annotate` flag will make the validate command output [github workflow command](https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions) compatible output. We'll see an example of what that looks like in the next section. --- # Conclusions from the data structures in Git tutorial > Complete your Git-based data governance workflow with version control, automated validation, approval workflows, and environment promotion for data structures. > Source: https://docs.snowplow.io/tutorials/data-structures-in-git/conclusion/ - We have seen how snowplow-cli can be used to work with data structures from the command line - We have applied that knowledge to build github workflows which support automated validation and publication - We have added source applications, tracking plans and event specifications to use the same approach as data structures. This approach provides you with: - **Version control** for your data structures with full change history - **Automated validation** to catch errors before they reach production - **Approval workflows** through pull requests - **Environment promotion** from development to production - **Integration with tracking plans** for comprehensive data governance You now have a complete CI/CD pipeline for managing your Snowplow data structures using industry-standard tools and practices. --- # Add GitHub workflow support for tracking plans > Extend your Git workflow to manage source applications, tracking plans, and event specifications using Snowplow CLI and GitHub Actions automation. > Source: https://docs.snowplow.io/tutorials/data-structures-in-git/follow-up-with-data-products/ > **Info:** We originally called tracking plans "data products". You'll still find the old term used in some existing APIs and CLI commands. Now that we have our data structures set up, we can define tracking plans to organize and document how these structures are used across our applications. We'll walk through creating source applications, tracking plans, and event specifications using the CLI, then integrate them into our automated workflows. ## Create a source application First, we'll create a source application to represent our website that will send the `login` event we defined earlier. ```bash snowplow-cli dp generate --source-app website ``` > **Note:** `dp` is an alias for `data-products`, from the previous name for tracking plans. Source applications and event specifications are also managed by this command. This should provide the following output ```text INFO generate wrote kind="source app" file=data-products/source-apps/website.yaml ``` The generated file is written to the default `data-products/source-apps` directory. Help for all the arguments available to `generate` is available by running `snowplow-cli dp generate --help`. Let's examine the generated file: ```yml apiVersion: v1 resourceType: source-application resourceName: b8261a25-ee81-4c6a-a94c-7717ba835035 data: name: website appIds: [] entities: tracked: [] enriched: [] ``` - `apiVersion` should always be `v1` - `resourceType` should remain `source-application` - `resourceName` is a unique identifier of the source applications. It must be a valid uuid v4 - `data` is the contents of the source app > **Note:** For more information about available fields and values you can refer to the [source applications schema](https://raw.githubusercontent.com/snowplow/snowplow-cli/main/internal/validation/schema/source-application.json). Making your ide schema aware via a [language server](https://github.com/redhat-developer/yaml-language-server) should provide a much smoother editing experience. Now let's customize our source application. We'll configure it to handle events from our production website as well as staging and UAT environments. We'll also add an owner field and remove the unused entities section. ```yml apiVersion: v1 resourceType: source-application resourceName: b8261a25-ee81-4c6a-a94c-7717ba835035 data: name: website appIds: ["website", "website-stage", "website-ua"] owner: me@example.com ``` Before syncing, we can validate our changes and preview what will happen: ```bash snowplow-cli dp sync --dry-run ``` The command will show us the planned changes: ```text sync will create source apps file=.../data-products/source-apps/website.yaml name=website resource name=b8261a25-ee81-4c6a-a94c-7717ba835035 ``` When we're happy with the proposed changes, we can sync by removing the `--dry-run` flag: ```bash snowplow-cli dp sync ``` After syncing, you'll be able to see your new source application in the Snowplow Console UI. ## Create a tracking plan and an event specification Let's now create a tracking plan and an event specification by running the following command ```bash snowplow-cli dp generate --data-product Login ``` This should provide the following output ```text INFO generate wrote kind="data product" file=data-products/login.yaml ``` Let's see what it has created for us ```yml apiVersion: v1 resourceType: data-product resourceName: 0edb4b95-3308-40c4-b266-eae2910d5d2a data: name: Login sourceApplications: [] eventSpecifications: [] ``` > **Note:** For more information about available fields and values you can refer to the [tracking plans schema](https://raw.githubusercontent.com/snowplow/snowplow-cli/main/internal/validation/schema/data-product.json). Making your ide schema aware via a [language server](https://github.com/redhat-developer/yaml-language-server) should provide a much smoother editing experience. Let's amend it to add an event specification, and a reference to a source application: ```yml apiVersion: v1 resourceType: data-product resourceName: 0edb4b95-3308-40c4-b266-eae2910d5d2a data: name: Login owner: me@example.com description: Login page sourceApplications: - $ref: ./source-apps/website.yaml eventSpecifications: - resourceName: cfb3a227-0482-4ea9-8b0d-f5a569e5d103 name: Login success event: source: iglu:com.example/login/jsonschema/1-0-1 ``` > **Note:** You'll need to come up with a valid uuid V4 for the `resourceName` of an event specification. You can do so by using an [online generator](https://www.uuidgenerator.net), or running the `uuidgen` command in your terminal > **Warning:** The `iglu:com.example/login/jsonschema/1-0-1` data structure has to be deployed at least to a develop envinroment. Currently referencing local data structures is not supported We can run the same `sync --dry-run` command as before, to see if the output is as expected. The output should contain the following lines ```bash snowplow-cli dp sync --dry-run ``` ```text INFO sync will create data product file=.../data-products/login.yaml name=Login resource name=0edb4b95-3308-40c4-b266-eae2910d5d2a INFO sync will update event specifications file=.../data-products/login.yaml name="Login success" resource name=cfb3a227-0482-4ea9-8b0d-f5a569e5d103 in data product=0edb4b95-3308-40c4-b266-eae29 ``` We can apply the changes by using the sync command without the `--dry-run` flag ```bash snowplow-cli dp sync ``` ## Add tracking plans validation and syncing in the GitHub Actions Now that we've modeled a source application, tracking plan and event specification, let's see how we can add them to the existing GitHub Actions workflows for data structures. You can customize your setup, use a separate repository or separate actions, but in this example we'll add tracking plan syncing and releasing into the existing workflows. Let's modify the PR example, and add the following line. This command will validate and print the changes to the GitHub Actions log. ```yml on: pull_request: branches: [develop, main] jobs: validate: runs-on: ubuntu-latest env: SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }} SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }} SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }} steps: - uses: actions/checkout@v4 - uses: snowplow/setup-snowplow-cli@v1 - run: snowplow-cli ds validate --gh-annotate - run: snowplow-cli dp sync --dry-run --gh-annotate ``` Tracking plans, source applications and event specifications don't have the dev and prod environments, so it's enough to sync them once. When merging to `develop`, use `sync` to push your changes as drafts. When merging to `main`, use `release` to also mark event specifications as published. Add the `sync` command to the develop pipeline: ```yml on: push: branches: [develop] jobs: publish: runs-on: ubuntu-latest env: SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }} SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }} SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }} steps: - uses: actions/checkout@v4 - uses: snowplow/setup-snowplow-cli@v1 - run: snowplow-cli ds publish dev --managed-from $GITHUB_REPOSITORY - run: snowplow-cli dp sync ``` Add the `release` command to the production pipeline to publish event specifications when merging to `main`: ```yml on: push: branches: [main] jobs: publish: runs-on: ubuntu-latest env: SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }} SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }} SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }} steps: - uses: actions/checkout@v4 - uses: snowplow/setup-snowplow-cli@v1 - run: snowplow-cli ds publish prod --managed-from $GITHUB_REPOSITORY - run: snowplow-cli dp release ``` --- # Learn how to manage data structures with Git > Manage Snowplow data structures using Git and Snowplow CLI for version control, approval workflows, and automated validation with GitHub Actions. > Source: https://docs.snowplow.io/tutorials/data-structures-in-git/introduction/ Snowplow data structures are the artifacts defining the rules for event validation within the Snowplow data pipeline. As such, they are a description of event shapes that the pipeline will allow through, and essentially the basis for the shape of data in the warehouse. The fact that data structures formalize what the warehouse tables look like makes them a cornerstone of Snowplow's facilities for data governance. An unintended change in a data structure could result in data consumers down the line being unable to process that data (e.g. data models breaking). That is why larger organizations guard Data Structure definitions closely, and need approval workflows to allow or disallow changes. For instance, a Data Protection Officer may want to have the final say about collected events, to ensure no PII is harvested. On top of that, detailed change history can be crucial for such large organizations. It is important to be able to tell in fine detail what was changed, when, and by whom. The Snowplow Console's UI has facilities to get started quickly with data structures (either using the Builder or the direct JSON editor), and is a solid tool for smaller teams. It doesn't implement such approval processes, neither does it offer such fine-grained visibility around changes. A common solution when faced with these requirements is to move management to some form of version control platform (github/gitlab). This opens up an entire ecosystem of tools and patterns enabling all manner of custom workflows. We have built [Snowplow CLI](/docs/event-studio/programmatic-management/snowplow-cli/) to help you bridge the gap between these repository-based workflows and Snowplow Console. ## Prerequisites - A deployed Snowplow pipeline - [Snowplow CLI](/docs/event-studio/programmatic-management/snowplow-cli/) installed and configured - A familiarity with [git](https://git-scm.com/) and an understanding of [GitHub Actions](https://docs.github.com/en/actions/writing-workflows) - A sensible [terminal emulator](https://en.wikipedia.org/wiki/Terminal_emulator) and shell ## What you'll be doing This tutorial will walk through creating and deploying a data structure from the command line using [Snowplow CLI](/docs/event-studio/programmatic-management/snowplow-cli/). It will then show how it is possible to automate the validation and deployment process using [GitHub Actions](https://docs.github.com/en/actions/writing-workflows). --- # Create a local data structure with Snowplow CLI > Generate a new data structure locally using Snowplow CLI with vendor and schema information in YAML format. > Source: https://docs.snowplow.io/tutorials/data-structures-in-git/local-setup/ Firstly we'll need a place to put things. ```bash $ mkdir -p snowplow-structures/data-structures $ cd snowplow-structures ``` > **Tip:** snowplow-cli data structures commands default to looking for data structures in `./data-structures`. Now let's create our data structure. We'll create a custom event called 'login'. ```bash $ snowplow-cli ds generate login --vendor com.example ``` > **Note:** `ds` is an alias for `data-structures`. This should provide us the following output ```text 3:00PM INFO generate wrote=data-structures/com.example/login.yaml ``` The generated file is written to our default `data-structures` directory under a sub directory matching the `--vendor` we supplied with a filename that mirrors the name we gave the data structure. Help for all the arguments available to `generate` is available by running `snowplow-cli ds generate --help`. > **Note:** This directory layout and file naming scheme is also followed by the [download](/docs/event-studio/programmatic-management/snowplow-cli/data-structures/#downloading-data-structures) command. Let's see what it has created for us. ```yml apiVersion: v1 resourceType: data-structure meta: hidden: false schemaType: event customData: {} data: $schema: http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0# self: vendor: com.example name: login format: jsonschema version: 1-0-0 type: object properties: {} additionalProperties: false ``` - `apiVersion` should always be `v1` - `resourceType` should remain `data-structure` - `meta.hidden` directly relates to showing and hiding [in Console UI](/docs/event-studio/data-structures/#hide-a-data-structure) - `meta.schemaType` can be `event` or `entity` - `meta.customData` is a map of strings to strings that can be used to send across any key/value pairs you'd like to associate with the data structure - `data` is the actual [snowplow self describing schema](/docs/api-reference/iglu/common-architecture/self-describing-json-schemas/) that this data structure describes --- # Modify, validate, and publish the data structure using Snowplow CLI > Validate data structure schemas for errors and warnings, then publish to development and production environments using Snowplow CLI. > Source: https://docs.snowplow.io/tutorials/data-structures-in-git/validation-and-publishing/ Firstly we'll add a property to our data structure definition. We'd like to know if a login succeeded or failed. Our modified `login.yaml` should look like this ```yml apiVersion: v1 resourceType: data-structure meta: hidden: false schemaType: event customData: {} data: $schema: http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0# self: vendor: com.example name: login format: jsonschema version: 1-0-0 type: object properties: result: enum: [success, failure] additionalProperties: false ``` ## Validate We should validate our changes before we attempt to publish them. Let's do that ```bash $ snowplow-cli ds validate data-structures/com.example/login.yaml ``` > **Tip:** You can supply snowplow-cli with a directory and it will look for anything that looks like a data structure. Also given the default data structure directory is being used the previous command is equivalent to `snowplow-cli ds validate`. You should see output similar to this: ```text 3:00PM INFO validating from paths=[data-structures/com.example/login.yaml] 3:00PM INFO will create file=data-structures/com.example/login.yaml vendor=com.example name=login version=1-0-0 3:00PM WARN validation file=data-structures/com.example/login.yaml messages= │ The schema is missing the "description" property (/properties/result) │ The schema is missing the "description" property (/) ``` ## Publish to development Apart from the missing descriptions everything looks good. We can fill them in later. Let's go ahead and publish our data structure to our [development](/docs/testing/) environment. ```bash $ snowplow-cli ds publish dev ``` > **Tip:** We omit the directory here but as with other commands the default directory will get used and it will attempt to publish any data structures it can find. The command should output something close to the following: ```text 3:00PM INFO publishing to dev from paths=[data-structures] 3:00PM INFO will create file=data-structures/com.example/login.yaml vendor=com.example name=login version=1-0-0 3:00PM WARN validation file=data-structures/com.example/login.yaml messages= │ The schema is missing the "description" property (/properties/result) │ The schema is missing the "description" property (/) 3:00PM INFO all done! ``` > **Note:** Publishing to `dev` will also run validation. It will only fail on ERROR notifications. You should now be able to see your published data structure in [Console UI](https://console.snowplowanalytics.com/data-structures). If you click through from the data structure listing to view the `login` data structure you should see the following banner. ![](/assets/images/locked-57bc4e2e7e71494600acaa1b219d5b2c.png) Any data structures published using snowplow-cli will automatically get this banner and have UI based editing disabled. It is a good idea to settle on one source of truth for each data structure to avoid potential conflicts. ## Publish to production With our data structure deployed to develop and working as we expect we can safely publish it to production. ```bash $ snowplow-cli ds publish prod ``` ```text 3:00PM INFO publishing to prod from paths=[data-structures] 3:00PM INFO will update file=data-structures/com.example/login.yaml local=1-0-0 remote="" 3:00PM INFO all done! ``` > **Note:** Data structures must be published to `dev` before they can be published to `prod` We have now seen how to create, validate and then publish a new data structure from the command line. Next we'll look at how to configure github actions to run validation and publishing automatically for us. --- # Verify data structures GitHub Actions automation workflow > Test your GitHub Actions workflow by creating a pull request with data structure changes and verifying automated validation and deployment. > Source: https://docs.snowplow.io/tutorials/data-structures-in-git/verify-github-setup/ Now we have our workflows in place let's work through an example. Our login data structure needs some attention. Our requirements have changed and rather than 'success' and 'failure' the login result will now need to report numbers and not strings. So instead of `[success, failure]` it'll be `[200, 403]`. Having created a [new branch](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging) called `login-results-error-codes` and making the changes locally we should end up here: ```bash --- a/data-structures/com.example/login.yaml +++ b/data-structures/com.example/login.yaml @@ -10,9 +10,9 @@ data: vendor: com.example name: login format: jsonschema - version: 1-0-0 + version: 1-0-1 type: object properties: result: - enum: [success, failure] + enum: [200, 403] additionalProperties: false ``` That all looks good so we'll go ahead and push to github and [create a pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request). We wait patiently for our validate on pull request workflow to run. ![](/assets/images/worked-pr-checks-617ef11e0920c15371b3f53399ddb864.png) Validation has failed. To identify the problem we open the 'file' tab on the pull request and see.. ![](/assets/images/worked-diff-annotated-b2288d79b1056cbcf9160efc0d829925.png) > **Note:** Validation only takes your configured [destinations](https://console.snowplowanalytics.com/destinations) into account. Together with the description warnings we forgot to fix earlier we have some errors. Changing the values of the enum would change the type of the `result` property which will cause problems further down the line for our data. The error suggests we need to make a major version bump to avert disaster. We'll do that (and add descriptions). Our next attempt: ```bash --- a/data-structures/com.example/login.yaml +++ b/data-structures/com.example/login.yaml @@ -10,9 +10,11 @@ data: vendor: com.example name: login format: jsonschema - version: 1-0-0 + version: 2-0-0 type: object + description: Login outcome event properties: result: - enum: [success, failure] + description: The resulting http error code of a login request + enum: [200, 403] additionalProperties: false ``` And the workflow result.. ![](/assets/images/worked-pr-checks-ok-8029034775bea79ec28a93f51e200fa2.png) Validation has passed. Now our colleagues can feedback on our changes and if everyone is happy we can merge to `develop` which will trigger our `publish-develop.yml` workflow. ![](/assets/images/worked-pub-dev-8f0d2efaa5f2dbead6a73d788976404e.png) Finally, once we are convinced everything works we can open another pull request from `develop` to `main`, merge that and trigger our `publish-production.yml` workflow. --- # Optional: Add a new calculation to the Flink live shopper features project > Extend the Flink pipeline by adding a custom feature calculation to track the most viewed product brand using event data and window aggregations. > Source: https://docs.snowplow.io/tutorials/flink-live-shopper-features/add-calculation/ This page describes how to add an additional feature to the product window. The process would be similar for other parts of the system, such as cart or purchase features. The package `com.evoura.snowplow.operator.product` contains all classes related to the product features. The features used by the product windows are: 1. `product_view_count` 2. `avg_viewed_price` 3. `return_view_count` 4. `price_range_viewed` These are generated as metrics for Flink by the `ProductMetricsGeneratorRichFunction` class. To generate an additional feature, you will need to update several other classes first. In this example, you will track a new feature called `most_viewed_brand`, based on the user's most viewed product brand. ## 1. Add a field to `ProductViewEvent` The web application tracks data on product brand—it's a field within the Snowplow [ecommerce `product_view` event](/docs/sources/web-trackers/tracking-events/ecommerce/#product-view). A full example payload is available at `.example/product_view.json`, showing how the payload would look after processing through the Snowplow pipeline. The `brand` field isn't currently present in the `ProductViewEvent` class. Add `brand` as a `String` instance variable, after `timestamp`. The updated class should look like this: ```java package com.evoura.snowplow.operator.product; import com.evoura.operator.TimestampedEvent; public class ProductViewEvent implements TimestampedEvent { public String eventId; public String userId; public String productId; public double productPrice; public long timestamp; public String brand; public ProductViewEvent() {} public ProductViewEvent( String eventId, String userId, String productId, double productPrice, long timestamp, String brand) { this.eventId = eventId; this.userId = userId; this.productId = productId; this.productPrice = productPrice; this.timestamp = timestamp; this.brand = brand; } @Override public long getTimestamp() { return timestamp; } @Override public String toString() { return "ProductViewEvent{" + "eventId='" + eventId + '\'' + ", userId='" + userId + '\'' + ", productId='" + productId + '\'' + ", productPrice=" + productPrice + ", brand='" + brand + '\'' + '}'; } } ``` ## 2. Deserialize the information in `ProductViewEventMap` Because you updated the `ProductViewEvent` constructor, you need to update the `ProductViewEventMap` class as well. The class `com.evoura.snowplow.operator.product.ProductViewEventMap` is responsible for mapping a `SnowplowEvent` into a `ProductViewEvent`. On line 32, provide the `brand` argument as `productNode.path("brand").asText()`. The updated method should look like this: ```java @Override public ProductViewEvent map(SnowplowEvent snowplowEvent) throws Exception { String eventId = snowplowEvent.getEventId(); String userId = snowplowEvent.getUserId(); JsonNode payloadNode = MAPPER.readTree(snowplowEvent.payload); JsonNode productNode = payloadNode.path("product"); String productId = productNode.path("id").asText(""); JsonNode priceNode = productNode.path("price"); double productPrice = priceNode.isNumber() ? priceNode.asDouble() : 0.0; return new ProductViewEvent( eventId, userId, productId, productPrice, snowplowEvent.getTimestamp(), productNode.path("brand").asText()); } ``` ## 3. Calculate the most viewed brand You can now calculate which brand is the most viewed by the user. All the current features are already segmented by `user_id`, so the segmentation won't be part of this. Look at the class `com.evoura.snowplow.operator.product.ProductFeatureRollingWindow`. The method `ProductFeatureRollingWindow.processWindow` receives `Iterable<ProductViewEvent> events` as the first argument. This variable will contain all the `ProductViewEvent`s in the defined window. For products you can choose a 5 minute or a 1 hour window. Iterate through these items to calculate the most viewed brand by the user in this window. It will look like something like this: ```java String mostViewedBrand = uniqueEvents.values().stream() .collect( Collectors.groupingBy( productViewEvent -> productViewEvent.brand, Collectors.counting())) .entrySet() .stream() .max(Map.Entry.comparingByValue()) .map(Map.Entry::getKey) .orElse(""); ``` ## 4. Forward the `mostViewedBrand` downstream The value of the feature needs to be collected so that it's available for Flink. The class `com.evoura.snowplow.operator.product.ProductFeature` is responsible for forwarding all calculations that happened in `ProductFeatureRollingWindow.processWindow`. Start by updating the class to store the new `mostViewedBrand` feature as a `String` variable. ```java public class ProductFeature implements Serializable { public String userId; public int uniqueProductCount; public double averagePrice; public double minPrice; public double maxPrice; public String windowSize; public Map<String, Long> views; public String mostViewedBrand; public ProductFeature() {} public ProductFeature( String userId, int uniqueProductCount, double averagePrice, double minPrice, double maxPrice, String windowSize, Map<String, Long> views, String mostViewedBrand) { this.userId = userId; this.uniqueProductCount = uniqueProductCount; this.averagePrice = averagePrice; this.minPrice = minPrice; this.maxPrice = maxPrice; this.windowSize = windowSize; this.views = views; this.mostViewedBrand = mostViewedBrand; } @Override public String toString() { return String.format( "Window: %s, User: %s, Unique Products: %d, Avg Price: %.2f, Min Price: %.2f, Max Price: %.2f, Views: %s, Most Viewed Brand: %s", windowSize, userId, uniqueProductCount, averagePrice, minPrice, maxPrice, views, mostViewedBrand); } ... } ``` This class contains a `TypeInfoFactory` class. A `TypeInfoFactory` is a way to tell Flink what's being forwarded to the next operator. This is required because we decided to turn off Kryo deserialization on Flink. This trades some convenience for the long-term ability to perform rolling upgrades, rescale jobs, and keep historical state readable as your data model inevitably changes. Kryo's "black-box" binary format doesn't carry any structural information, so even a small change to a field name, order, or data type breaks compatibility and can corrupt state when jobs are upgraded. Flink's own type serializers (POJO, Avro, Protobuf, etc.) embed explicit schema metadata and versioning hooks, allowing the runtime to evolve the schema safely and transparently migrate state across savepoints. We chose POJO for this project to avoid requiring any external schema registry. Update the `ProductFeature` `TypeInfoFactory` to forward the `mostViewedBrand` feature. The desired output will be: ```java public class ProductFeature implements Serializable { ... public static class InfoFactory extends TypeInfoFactory<ProductFeature> { public static TypeInformation<ProductFeature> typeInfo() { InfoFactory factory = new InfoFactory(); return factory.createTypeInfo(null, null); } @Override public TypeInformation<ProductFeature> createTypeInfo( Type t, Map<String, TypeInformation<?>> genericParameters) { Map<String, TypeInformation<?>> fields = new HashMap<>(); fields.put("userId", Types.STRING); fields.put("uniqueProductCount", Types.INT); fields.put("averagePrice", Types.DOUBLE); fields.put("minPrice", Types.DOUBLE); fields.put("maxPrice", Types.DOUBLE); fields.put("windowSize", Types.STRING); fields.put("views", Types.MAP(Types.STRING, Types.LONG)); fields.put("mostViewedBrand", Types.STRING); return Types.POJO(ProductFeature.class, fields); } } } ``` Next, go back to `ProductFeatureRollingWindow.processWindow` and pass `mostViewedBrand` on at line 75. ```java out.collect( new ProductFeature( ctx.getCurrentKey(), uniqueViewsCount, avgUniquePrice, uniqueViewsCount == 0 ? 0.0 : priceStats.getMin(), uniqueViewsCount == 0 ? 0.0 : priceStats.getMax(), windowIdentifier, productViewCounts, mostViewedBrand)); ``` ## 5. Get the feature in Redis The class `com.evoura.snowplow.operator.product.ProductMetricsGeneratorRichFunction` is responsible for converting the generated metrics into something the Redis operator understands. The method `ProductMetricsGeneratorRichFunction.processElement` has multiple feature collections. Add your new `mostViewedBrand` feature after the existing blocks: ```java out.collect( new MetricValue( String.format( "user:%s:most_viewed_brand_%s", productFeature.userId, productFeature.windowSize), String.valueOf(productFeature.mostViewedBrand))); ``` ## 6. Re-run the pipeline After these changes, re-run the script `./up.sh`. It will rebuild the docker image and restart the containers. You should be able to see in Redis the most viewed brand by the user. ![image.png](/assets/images/live-shopper-add-calculation-8c81055955e4a38fd50764ba080102ce.png) --- # Understand the live shopper feature calculations > Learn how rolling and session windows process Snowplow events in Flink to calculate product views, cart behavior, and session analytics metrics. > Source: https://docs.snowplow.io/tutorials/flink-live-shopper-features/calculations/ This section explains what is calculated, how each metric is updated, and where the values are stored. - Metrics are calculated in-stream, so they’re always fresh - Rolling windows offer an up-to-date view over the last N minutes or hours - Session windows account for user inactivity, producing clean session data - Redis keys are predictable, making metrics easy to use across services ## How features are calculated Everything starts with the entry point class `com.evoura.snowplow.SnowplowAnalyticsPipeline`, which is responsible for: - Creating a Kafka source (a Flink operator to read the data) - Parsing the data into a known object (`com.evoura.snowplow.model.SnowplowEvent`) - Branching the data for multiple windowed calculations - Defining how each window processes the data - Sinking the features into Redis At a high level, the workflow is: 1. The Snowplow tracker sends raw events to the Snowplow pipeline, which processes them and forwards them to Flink 2. Flink parses events, assigns more timestamps, and sends each event down a dedicated branch 3. Each branch runs a windowed function that maintains just enough state to update the metric 4. The function emits a `MetricValue` object at a fixed cadence (30 seconds or 1 minute) 5. Flink writes the metric to Redis using a predictable key (`user:{id}:feature:{name}_{window}`) ## Window types | Window type | Purpose | Size | Emit every | Ends when | Used in | | ----------- | ------------------------ | ----------------------- | ------------- | -------------------- | --------------------------------- | | Rolling | Continuous, sliding view | 5 m, 1 h, 24 h | 30 s or 1 min | Never; always shifts | Product, Category, Cart, Purchase | | Session | Group events per visit | Gap-based (30 min idle) | 30 s | No events for 30 min | Session metrics | This architecture diagram shows more details about the windowing logic: ![live-shopper-calculations-architecture.png](/assets/images/live-shopper-calculations-architecture-66d8b09fcc96a1e2544ae37edf3276d6.png) - All rolling windows use a custom `RollingWindowProcessFunction` - The session view uses `SnowplowSessionWindow` For example, aggregation on a rolling window of 5 seconds and a session window with a 3-second gap would look like: ![live-shopper-calculations-window.png](/assets/images/live-shopper-calculations-window-c2782b8c083ec80a69814b3d19049c7e.png) ![live-shopper-calculations-window2.png](/assets/images/live-shopper-calculations-window2-34d42ff53db3e07982b21ceb6f6a4ce2.png) More detail is available in [The case for a custom window in Flink: Expanding your streaming use-cases](https://pedromazala.substack.com/p/the-case-for-a-custom-window-in-flink?utm_source=snowplow\&utm_medium=accelerator\&utm_campaign=live-shopper) blog post. ## Per-feature logic ### Product views - Event filter: `product_view` - Key: `userId` - Windows: 5 min, 1 hour - Metrics: view count, average price, min-max price range _Emitted as_: `product_view_count_5m`, `avg_viewed_price_1h`, etc. ### Category engagement - Event filter: `list_view`, `product_view` - Key: `userId` - Windows: 5 min, 1 hour, 24 hour - Metrics: category view count, repeat views, top category in window ### Cart behavior - Event filter: `add_to_cart`, `remove_from_cart` - Key: `userId` - Windows: 5 min, 1 hour - Metrics: adds, removes, net cart value, cart change frequency ### Purchase history - Event filter: checkout events (`add_to_cart`, `remove_from_cart`, `checkout_step`) - Key: `userId` - Window: 24 hour - Metrics: order count, total spend, average order value ### Session analytics - Event filter: all high-level engagement events - Key: `sessionId` - Window: session gap (30 min) - Metrics: session duration, pages per session, bounce flag, cart-to-page ratio ## Redis key pattern ```text user:{user_id}:{feature}_{window} ``` Examples: - `user:trent@snowplowanalytics.com:product_view_count_5m` - `user:lucas@snowplowanalytics.com:session_duration` Downstream apps such as dashboards, ML models, or chat bots read these keys directly. --- # Conclusions from the Flink live shopper features accelerator > Build sub-second real-time personalization with Snowplow, Flink, Kafka, and Redis to drive higher conversion and smarter marketing. > Source: https://docs.snowplow.io/tutorials/flink-live-shopper-features/conclusion/ Real-time streams let you see what a shopper is doing while the session is still active. Events leave the browser, flow through Snowplow, and reach Flink within milliseconds. This is important because the system can trigger actions that improve conversion. Rolling and session windows turn raw events into metrics, and Redis makes each metric available to any service—chat bots, ML models, pricing engines, dashboards, etc.—all before the user moves on. ## Stack recap - **Snowplow tracker**: captures clean, strictly defined events - **Kafka**: buffers and scales traffic without spikes - **Flink**: windows and aggregates in memory for sub-second output - **Redis**: serves metrics with microsecond reads All components run in Docker, so you can spin up the full pipeline with a single script—no cloud bill required. ## What you built 1. Started the stack with `./up.sh` 2. Tracked product views, cart events, and purchases in the demo store 3. Watched events flow through **Kafka** (via **AKHQ**) and metrics appear in **Redis** (via **Redis Insights**) 4. Added your own calculation (most viewed brand) and published it live 5. Verified that low-latency metrics fed downstream apps in real time ## Business impact - **Higher conversion**: personalized offers and live chat triggers reach shoppers before they leave - **Smarter marketing**: real-time, segmented audiences feed email, ad, and push platforms with fresh intent data - **Data consistency**: a single source of truth powers both real-time and long-term analytics, reducing duplication and errors With these steps, you now have a repeatable blueprint for turning live behavioral data into immediate, actionable insight. --- # Learn how to calculate real-time shopper features using Apache Flink > Build real-time ecommerce analytics with Apache Flink to calculate shopper features like product views, cart behavior, and session metrics for in-session personalization. > Source: https://docs.snowplow.io/tutorials/flink-live-shopper-features/introduction/ Welcome to the **real-time shopper features using Apache Flink** solution accelerator for ecommerce. This accelerator demonstrates how to leverage Snowplow's behavioral data to monitor and act on shopper behavior while they're still navigating your site. Traditional analytics stacks focus on deriving insights after the fact—business intelligence dashboards that show what happened. Real-time monitoring enables proactive actions such as initiating a live chat, or sending a unique discount code based on surges in shopper activity. In this accelerator, you'll learn how to calculate key metrics, in near real time, that can power other systems. After you set up the project, you can explore the system via three tools: - [**Snowplow ecommerce store**](https://github.com/snowplow-industry-solutions/ecommerce-nextjs-example-store) (left panel in below image) - [**AKHQ**](https://akhq.io/), a Kafka visualization tool (center panel) - [**Redis Insights**](https://redis.io/insight/), a Redis visualization tool (right panel) All interactions on the store flow through [Snowplow Local](https://github.com/snowplow-incubator/snowplow-local) and are streamed into Kafka, where they are processed using Flink. ![Three panel screenshot showing ecommerce store and visualizations](/assets/images/live-shopper-introduction-467353b777bcfccfe7c51b1586822ec5.webp) The computed metrics can also feed longer-term dashboards, enhancing the quality and reusability of your data. ## Prerequisites This accelerator is fully Dockerized. The only prerequisites are Docker 28+ and Git. ## Solution accelerator code The code for this accelerator [is available on GitHub](https://github.com/snowplow-industry-solutions/flink-live-shopper). The project structure is: - `apps/` contains the Flink processing application (`flink/`) and the demo ecommerce store (`ecommerce/`) - `infra/` contains Docker Compose configurations for the infrastructure components: Snowplow, Kafka, Redis, and Localstack - `docker-compose.yaml` is the main Docker Compose file including all services - `up.sh` is a convenience script to initialize and start all services ## Key technologies - Snowplow: event tracking pipeline (Collector, Enrich, Kafka sink) - Apache Flink: stream processing engine for real-time analytics - Apache Kafka: message broker for decoupling event producers and consumers - Redis: in-memory data store for storing computed metrics - Next.js: framework for the demo ecommerce application - Docker and Docker Compose: containerization for easy setup and deployment of all services - AKHQ: web UI for Kafka management and inspection - Redis Insight: web UI for Redis data visualization and management - Grafana: visualization tool for monitoring and analyzing metrics ## Architecture This diagram shows the architectural overview: ![Architecture diagram](/assets/images/live-shopper-setup-architecture-6f92ebf8cbee9bd84ff993246c2c3924.svg) Benefits of this architecture: - Sub-second freshness: metrics are computed in the stream, not via nightly batch jobs, so they're actionable in-session - Single source of truth: the same logic powers dashboards and in-session nudges - Composable and portable: the entire system runs in Docker, and can be adapted to cloud-managed Kafka/Flink/Redis with minimal changes The key subsystems are described below. ### Event capture and ingestion with Snowplow - E-store front-end and Snowplow JavaScript tracker: user activity is captured as Snowplow ecommerce events - Snowplow Local to Kafka: the Snowplow pipeline validates the events, enriches them with device and geolocation data, then forwards them into Kafka ### Real-time stream processing in Flink - Source: a single Flink job reads from the `enriched-good` topic - Branching by event type: the stream is split into four logical lanes (product, category, cart, purchase) - There are two options for keying and windowing: - Rolling windows (5 min, 1 h, 24 h): keyed by `user_id` for always-fresh "last-N-minutes" stats - Session windows: keyed by `session_id`, grouping events into sessions that end after 30 minutes of inactivity - Aggregations: each lane computes its own features e.g. view counts, average price, cart value, session duration - Metric parsers: convert aggregated values into one or more metrics, e.g. a unique product count may feed both product view metrics and average viewed price metrics ### Feature store and action loop - Sink to Redis: Flink writes each metric to Redis using deterministic keys like `user:{id}:{feature}_{window}` or `session:{sid}:{metric}`, making Redis a low-latency feature store - Backend consumers: the ecommerce store back-end (or any downstream app like ML models or dashboards) can retrieve metrics in microseconds to: - Trigger live-chat prompts when high-value carts stall - Send discounts based on price sensitivity - Feed both real-time dashboards and long-term analytics using consistent definitions ## Acknowledgements Thank you to the Data Streaming experts [evoura](https://evoura.com/?utm_source=snowplow\&utm_medium=accelerator\&utm_campaign=live-shopper) for their support with building this accelerator. --- # Run the Flink live shopper features project > Start the Dockerized Flink streaming pipeline to process Snowplow events from Kafka and store computed shopper metrics in Redis. > Source: https://docs.snowplow.io/tutorials/flink-live-shopper-features/run/ 1. [Clone the repository](https://github.com/snowplow-industry-solutions/flink-live-shopper) 2. Run the startup script. This will: - Ensure `.env` files exist, copying from `.env.example` if needed - Update Git submodules - Start all the services defined in the Docker Compose files, in detached mode ```bash up.sh ``` 3. Access the main interfaces: - **Web application ecommerce store**: `http://localhost:3000` - **AKHQ (Kafka UI)**: `http://localhost:8085` - **Redis Insights**: `http://localhost:5540` - **Flink dashboard**: `http://localhost:8081` - **Grafana**: `http://localhost:3001` 4. When finished, stop the containers ```bash docker compose down ``` ## Dataflow steps 1. Client-side tracking: - The Next.js-based ecommerce store emits user interactions (e.g., `product_view`, `add_to_cart`) using the Snowplow JavaScript tracker 2. Ingestion pipeline: - Events are captured by the Collector and processed via Enrich - Snowbridge forwards enriched events into Kafka topics ![Screenshot showing the snowplow-enriched-good topic](/assets/images/live-shopper-setup-kafka-dda0bf0769ae3219845c29189fb0a2d9.png) 3. Stream processing (Apache Flink): - Events are parsed, filtered, and routed into logical branches - Processing includes: - Product features: count views, average viewed price, price range - Cart behavior: add/remove counts, cart value, update frequency - Category engagement: category views, repeat views - Purchase history: aggregate purchases over rolling windows (e.g., 24 hours) to calculate total spend, order count - Session analytics: duration, bounce rate, marketing source ![Screenshot showing the Flink dashboard](/assets/images/live-shopper-setup-flink-70d69fc83f917b8701fe190d5f1a936a.png) 4. Feature store (Redis): - Metrics are written to Redis using deterministic keys (e.g., `user:{user_id}:{feature}_{window}`) - These metrics are available for real-time lookups by downstream systems ![Screenshot showing the Redis dashboard](/assets/images/live-shopper-setup-redis-5c2074aa029d500e04c66c45443031f3.png) ## Testing To test the system, log in to the ecommerce store using one of the [mock users](https://github.com/snowplow-industry-solutions/ecommerce-nextjs-example-store/blob/main/src/mocks/users.ts). Open a product listed on the homepage. You should see data flowing to the `enriched-good` topic via [AKHQ](http://localhost:8085). This is the same data ingested by Flink to derive metrics. All calculated metrics will appear in [Redis Insights](http://localhost:5540). If you don't see any data flowing, check that the ecommerce store is sending events correctly. Ad blockers can interfere with event tracking: turn them off for full functionality. The data in Redis is now available to be consumed by any application, system, or process—for example, an AI agent can use this data to determine the next best action. --- # Conclusions and next steps from the Kafka live viewer profiles accelerator > Complete your real-time event-driven architecture for video streaming analytics with Snowplow, Kafka, and DynamoDB for live viewer insights. > Source: https://docs.snowplow.io/tutorials/kafka-live-viewer-profiles/conclusion/ In this tutorial, you've explored the **live viewer profiles** solution accelerator for video streaming, gaining practical insights into building, deploying, and extending real-time, event-driven architectures using Snowplow and Kafka. You have successfully built a real time system for processing event data including: - **Web tracking application** for collecting media events ![Application Output](/assets/images/video-4b9da5d6d44a98bf239574a7d56ebcf1.png) - **Snowplow Collector and Snowbridge** for event processing and forwarding - **Live Viewer back-end** for managing real-time data with Kafka and DynamoDB - **Live Viewer front-end** for visualizing real-time user activity on the web tracking application ![Live viewer frontend](/assets/images/live-viewer-df27c3494d3a88598b62d266d80769b6.png) This architecture highlights how real-time insights can be achieved using event-driven systems in a streaming context. ## What you achieved You explored how to: 1. Use LocalStack to emulate AWS services for local development and testing 2. Launch and interact with the system components, such as the Kafka UI and LocalStack UI 3. View and verify the real-time event data from the browser using Snowplow's media tracking capabilities 4. Deploy the solution in an AWS environment using Terraform This tutorial can be extended to utilize Snowplow event data for other real-time use cases, such as: - Web Engagement analytics - Personalized recommendations - Ad performance tracking ## Next steps - **Extend tracking:** extend the solution to track more granular user interactions or track on a new platform such as mobile - **Extend dashboard:** extend the Live Viewer to include information on the media being watched and the user - **Replace the state store:** replace Amazon DynamoDB with an alternative to be cloud agnostic, e.g. Google Bigtable or MongoDB By completing this tutorial, you're equipped to harness the power of event-driven systems and Snowplow's analytics framework to build dynamic, real-time solutions tailored to your streaming and analytics needs. --- # Deploy the Kafka live viewer profiles application on AWS using Terraform > Deploy the live viewer profiles solution to AWS infrastructure using Terraform scripts with EC2, DynamoDB, and Kafka components. > Source: https://docs.snowplow.io/tutorials/kafka-live-viewer-profiles/deploy-aws-terraform/ The following steps will deploy the solution accelerator to AWS using Terraform. This is an alternative to the Localstack method. There is no need to manually install [Terraform](https://www.terraform.io/). It is executed via [Docker](https://www.docker.com/) using the `terraform.sh` script. ## Step 0: Prerequisites 1. Open a terminal 2. Install **Docker** and **Docker Compose** 3. [Clone the project](https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles) and navigate to its directory ```bash git clone https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles.git ``` 4. Create a `.env` file based on `./docker/.env.example` and configure AWS variables. ```bash ACCEPT_LICENSE="true" AWS_REGION=eu-west-2 AWS_ACCESS_KEY_ID=xxxxxxxxxxxxxxxxxxxxx AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxx ``` ## Step 1: Initialize the project ```bash $ ./terraform/terraform.sh init ``` ## Step 2: Create the infrastructure ```bash $ ./terraform/terraform.sh apply ``` ## Step 3: Access the EC2 instance that runs the apps in AWS ```bash $ ./terraform/apps/ssh.sh ``` Inside the EC2 instance, you can control the Docker images in a similar way to how you do locally: ```bash $ cd snowplow-demo $ ./stats.sh # <- show the statistics for the Docker containers $ ./down.sh # <- stop the Docker containers $ ./up.sh # <- start the Docker containers ``` ## Step 4: Open access to the applications Review the [LocalStack guide](/tutorials/kafka-live-viewer-profiles/quickstart-localstack/) for the default configuration for each component. Open public access to the two front-end applications and the Snowplow Collector using a HTTP load balancer so that anyone can watch the video, submit events to the pipeline, and see information on concurrent users. The applications listen for HTTP traffic on the following ports - Web tracker front-end - 3000 - Live viewer front-end - 8280 - Snowplow Collector - 9090 ## Next Steps - You can implement Snowplow media tracking on any [HTML5](/docs/sources/web-trackers/tracking-events/media/html5/) or [YouTube](/docs/sources/web-trackers/tracking-events/media/youtube/) media of your choice - Look into the output from Kafka and extend the Live Viewer to include information on the media being watched and the user. - Replace Amazon DynamoDB with an alternative to be cloud agnostic, e.g. Google Bigtable or MongoDB. *** ## Other commands ### Check versions ```bash $ ./terraform/terraform.sh --version ``` Example response ```bash Terraform v1.10.0 on linux_amd64 + provider registry.terraform.io/hashicorp/aws v5.79.0 + provider registry.terraform.io/hashicorp/local v2.5.2 + provider registry.terraform.io/hashicorp/tls v4.0.6 ``` ### Check the Terraform plan ```bash $ ./terraform/terraform.sh plan ``` ### Generate a PNG image for the Terraform modules in this project ```bash $ ./terraform/terraform.sh png ``` Current PNG image of the available modules: ![Terraform Modules](/assets/images/terraform-10e5f3377b0d5e1909fe44c79924f7a8.png) ### Destroy the infrastructure ```bash $ ./terraform/terraform.sh destroy ``` --- # Learn how to create live viewer profiles using Kafka > Build real-time viewer profiles for video streaming platforms using Snowplow, Kafka, Java, and DynamoDB to track live user interactions and engagement. > Source: https://docs.snowplow.io/tutorials/kafka-live-viewer-profiles/introduction/ Welcome to the **live viewer profiles** solution accelerator for video streaming. This accelerator demonstrates how to build a real-time use case leveraging **Snowplow event data** to create live viewer profiles for a video streaming site. By combining Snowplow's streaming pipeline with **Apache Kafka**, a **Java application** and **AWS DynamoDB**, the solution processes live streaming events to visualize user interactions with video content and advertisements. On the left side of the image below, someone is watching a video. Their events are sent through a Snowplow pipeline to Kafka where they are consumed and processed by an application. The result of this processing is displayed in the right window. This shows the number of active users and their current state. ![Application Output](/assets/images/one-viewer-2db2b92ead2e4e0cdce602d8d0c74d0f.png) Through this hands-on guide, you'll learn how to build, deploy, and extend real-time, event-driven architectures using Snowplow and Kafka, enabling personalized recommendations, real-time insights, and dynamic analytics for streaming platforms. The framework is inspired by common challenges in video streaming, including tracking user behavior, ad engagement, and session activities, with the goal of maintaining up-to-date viewer profiles in DynamoDB. This accelerator is open source and can serve as the foundation to build practical applications like real-time viewer insights, engagement analytics, ad performance tracking, and personalized recommendations. Whether you're optimizing ad placements or enhancing viewer satisfaction, this guide equips you to unlock the full potential of Snowplow event data. Please start by reviewing how the application works in the next page on Localstack, even if you're planning to deploy with Terraform. ## Solution Accelerator code The code for this infrastructure is available [here on GitHub](https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles). ## Architecture The solution comprises several interconnected components: - **Web tracking application**: - A React application with a video to watch - Snowplow's media tracking has been configured to send events (e.g., play, pause, ad skipped) to the [Snowplow Collector](/docs/fundamentals/) - Code available in [tracker-frontend](https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles/tree/main/tracker-frontend) folder in GitHub - **Snowplow Collector**: - Collects and forwards events via [Stream Enrich](/docs/fundamentals/) and Kinesis to [Snowbridge](/docs/api-reference/snowbridge/) - **Snowplow Snowbridge**: - Publishes events to Kafka for the Live Viewer back-end to consume - **Live Viewer back-end**: - A Java application which processes events from Kafka, stores the data in DynamoDB, and generates JSON state data for the Live Viewer front-end - Code available in [live-viewer-backend](https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles/tree/main/live-viewer-backend) folder in GitHub - **Live Viewer front-end**: - A HTML website which displays the state of users currently watching the video - Code available in [live-viewer-frontend](https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles/tree/main/live-viewer-frontend) folder in GitHub The following diagram maps out where each component sits in the end-to-end communication flow. ![Architecture Diagram](/assets/images/architecture-5be2ba8f8c4e631e43172a4a2d1e493a.png) ### Components and configuration The following files in the [GitHub repository](https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles) can be used to configure the project's components: - **Snowplow components**: `docker/compose.snowplow.yaml` - **Kafka infrastructure**: `docker/compose.kafka.yaml` - **Application components**: `docker/compose.apps.yaml` - **LocalStack setup**: `docker/compose.localstack.yaml` - **AWS setup**: Terraform scripts (located in the `docs/terraform` folder) ## Acknowledgements Thank you to the Kafka experts [OSO](https://oso.sh/) for their support with building this accelerator. --- # Run the Kafka live viewer profiles application using LocalStack > Run the complete live viewer profiles stack locally with Docker, LocalStack, and Snowplow to process media tracking events and display real-time user states. > Source: https://docs.snowplow.io/tutorials/kafka-live-viewer-profiles/quickstart-localstack/ The following steps will deploy the solution accelerator using Localstack. ## Step 0: Prerequisites 1. Open a terminal 2. Install **Docker** and **Docker Compose** 3. [Clone the project](https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles) and navigate to its directory ```bash git clone https://github.com/snowplow-industry-solutions/kafka-live-viewer-profiles.git ``` 4. Create a `.env` file based on `./docker/.env.example` 5. You can leave the AWS variables as placeholders when using Localstack ```bash ACCEPT_LICENSE="true" AWS_REGION=eu-west-2 AWS_ACCESS_KEY_ID=xxxxxxxxxxxxxxxxxxxxx AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxx AWS_ENDPOINT_URL=http://localstack:4566 ``` ## Step 1: Start the containers Run the following command to download and run everything in Docker: ```bash ./docker/up.sh ``` Details on everything that's installed can be found in the [architecture](/tutorials/kafka-live-viewer-profiles/introduction/#architecture) section on the previous page. **Tips:** - Use `Ctrl+C` to stop services but keep containers running - Pass service-specific options to `./docker/up.sh` (e.g., `./docker/up.sh kafka-services`) ## Step 2: Open the web tracking front-end Visit <http://localhost:3000> to configure the Stream Collector endpoint and start tracking events. Enter the Collector URL: `localhost:9090` and click `Create tracker`. ![First page of tracking website](/assets/images/tracker-demo-721e93110218fc7e08aa453985b6d823.png) On the next screen, click `Custom media tracking demo`. This will bring up a video and a screen that displays information on what events are sent from the browser to the pipeline. If you want to simulate multiple users watching the video at the same time, you can open this in separate browsers. ![Welcome page on tracking website](/assets/images/welcome-page-aadae418d8f57badba9bb14166e6cf3e.png) You must keep this window open with the video playing because everything here is running in real-time. You can use the [Snowplow Chrome Plugin](https://chromewebstore.google.com/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm) to verify that the events are successfully sent from the web browser. ![Video playing on the website](/assets/images/video-4b9da5d6d44a98bf239574a7d56ebcf1.png) ## Step 3: Open the Live Viewer front-end Open <http://localhost:8280> in a separate window. This will display the active users and their current state (e.g. watching video, watching advertisement, paused). ![Live viewer frontend](/assets/images/live-viewer-df27c3494d3a88598b62d266d80769b6.png) Congratulations! You have successfully run the accelerator to stream web behavior through Snowplow and Kafka to a real-time dashboard. ## Next Steps - You can implement Snowplow media tracking on any [HTML5](/docs/sources/web-trackers/tracking-events/media/html5/) or [YouTube](/docs/sources/web-trackers/tracking-events/media/youtube/) media of your choice - Look into the output from Kafka and extend the Live Viewer to include information on the media being watched and the user. - Use our supplied Terraform in the next section to run this on AWS and make it publicly available. ## Other things you can do ### View Events in Kafka UI Access <http://localhost:8080> to review events within the Kafka UI. ### Manage containers with LazyDocker Run the following command to manage containers visually: ```bash sudo ./docker/lazy.sh ``` ### Inspect infrastructure with LocalStack UI Visit the [LocalStack UI](https://app.localstack.cloud/) to inspect infrastructure components such as Kinesis and DynamoDB. Please note that a LocalStack account is required to view this. ## Cleaning up ### Stop the containers Shut down all running containers: ```bash ./docker/down.sh ``` ### Clean up and delete To remove all containers and images, use: ```bash ./docker/clean.sh ``` This command will delete all generated data. --- # Conclusions and next steps from the Real-time Editorial Analytics accelerator > Complete your real-time event-driven architecture for editorial analytics with Snowplow and ClickHouse for real-time insights. > Source: https://docs.snowplow.io/tutorials/realtime-editorial-analytics-clickhouse/conclusion/ You've completed the real-time editorial analytics solution accelerator. You've gained practical experience building, deploying, and extending real-time, event-driven architectures using Snowplow and ClickHouse. You have successfully built a real-time system for processing event data including: - **Web tracking application** for collecting article interaction and ad performance events. - **Snowplow Micro and Snowbridge** for event processing and forwarding. You explored how to use Snowplow Micro to emulate a full Snowplow pipeline for local development and testing. - **ClickHouse** for processing and storing real-time event-level data. - **Editorial Analytics Dashboard front-end** for visualizing real-time content engagement behavior on the web tracking application. ![The Daily Query Analytics Dashboard showing Trending Articles, Trending Categories, and Ad Performance sections, each with metrics for the last 30 minutes including impressions, views, scroll depth, and engaged time](/assets/images/realtime-dashboard-f03b1b16e3bd3aed3b6f41938004109b.png) This architecture highlights how real-time insights can be achieved using event-driven systems in a streaming context. This accelerator can be extended to use Snowplow event data for other real-time use cases, such as: - Web engagement analytics - Personalized content recommendations - Ad performance tracking - Search and content re-ranking ## Next steps The following extensions are a good starting point for building on this accelerator: - **Extend tracking:** extend the solution to track more granular user interactions or track on a new platform such as mobile - **Personalize content recommendations:** extend the solution to change the Featured Article on the homepage based on the most-viewed article from the last 30 minutes - **Exclude content:** extend the solution to exclude any content already viewed by the current user - **Real-time content scoring:** extend the solution to compute a real-time score for each article, based on the latest engagement By completing this tutorial, you're equipped to build event-driven systems that deliver real-time analytics tailored to your streaming needs. --- # Learn how to perform real-time editorial analytics with Snowplow and ClickHouse > Perform real-time editorial analytics for media publishers using Snowplow and ClickHouse to track real-time article engagement and performance. > Source: https://docs.snowplow.io/tutorials/realtime-editorial-analytics-clickhouse/introduction/ Welcome to the **Real-time Editorial Analytics** solution accelerator for media publishers. This accelerator demonstrates how to leverage real-time **Snowplow event data** with **ClickHouse Cloud** to understand article engagement and user behavior on a media publisher site. The image below shows the accelerator in action. On the left side, a user is engaging with a media website, reading articles and clicking different advertisements. The website sends events through a local Snowplow Micro pipeline. The left side of the image shows the Snowplow events in the Snowplow Micro dashboard. ![Side-by-side view of The Daily Query demo website on the left and the Snowplow Micro dashboard on the right, showing captured article engagement events in a table](/assets/images/snowplow-tracking-website-3cc7cb851a7e762272b3a1221fbbdfbf.png) From here, events are forwarded to a ClickHouse table in near real-time latency. Each event is stored as an individual row in a single table, as seen on the left side of the image below. On the right is an example dashboard, hosted as part of the accelerator website, which queries ClickHouse for real-time article engagement and ad performance metrics from the previous 30 minutes. ![Side-by-side view of the ClickHouse SQL console showing the snowplow\_article\_interactions table on the left, and The Daily Query Analytics Dashboard showing trending articles and ad performance metrics on the right](/assets/images/editorial-analytics-dashboard-71fa110b1be2c2aebc6c99e5ba3ce720.png) Through this hands-on guide, you'll learn how to build, deploy, and extend real-time, event-driven architectures using Snowplow and ClickHouse. The framework is inspired by real customer use cases in media. This accelerator is open source (Apache 2.0). Feel free to use it as the foundation for practical applications such as real-time viewer insights, engagement analytics, ad performance tracking, or personalized content recommendations. ## Solution accelerator code The code for this infrastructure is available [here on GitHub](https://github.com/snowplow-industry-solutions/clickhouse-realtime-editorial-analytics/tree/main). ## Architecture The solution comprises several interconnected components: - **Web tracking application**: - A Next.js application with a number of articles and advertisements - Snowplow tracking for events related to article engagement, e.g., article impressions, article views, ad impressions, ad clicks, or page pings, sent to the [Snowplow Collector](/docs/fundamentals/) - Code available in the [`snowtype.ts`](https://github.com/snowplow-industry-solutions/clickhouse-realtime-editorial-analytics/blob/main/website/snowtype/snowplow.ts) file in GitHub - **Snowplow Micro**: - [Snowplow Micro](/docs/testing/snowplow-micro/) is a lightweight version of the Snowplow pipeline which can be run locally - It passes validated and enriched events to [Snowbridge](/docs/api-reference/snowbridge/) - **Snowplow Snowbridge (also known as [Event Forwarding](/docs/destinations/forwarding-events/))**: - Filters incoming Snowplow events to only forward a subset of events and dimensions to ClickHouse - Publishes events and lands events in a single table in ClickHouse using ClickHouse's [HTTP interface](https://clickhouse.com/docs/interfaces/http) - **ClickHouse Cloud**: - [ClickHouse Cloud](https://clickhouse.com/) receives and stores events from Snowplow - Stored data can be queried using ClickHouse's UI or via API The following diagram maps out where each component sits in the end-to-end communication flow. ![Architecture diagram showing event data flowing from a front-end website through Snowplow Micro (Collector, Schema Validation, Enrichments) and Snowbridge (Event Forwarding, Transformation and Filters) to a ClickHouse table, which is queried by a Personalization Service and the Content Editorial Team](/assets/images/architecture-eb542ecb2ae024c72d67248fcfe5892e.png) ## Prerequisites You'll need a ClickHouse Cloud account to receive the Snowplow events. A [30-day free trial](https://clickhouse.com/cloud) signup is available. ## Acknowledgments Thank you to the [ClickHouse](https://clickhouse.com/) team for their support and collaboration with building this accelerator. --- # Run the application using Snowplow Micro and ClickHouse > Run the complete real-time editorial analytics stack locally with Docker, Snowplow and ClickHouse to process user behavioral events and display real-time content engagement insights and ad performance metrics. > Source: https://docs.snowplow.io/tutorials/realtime-editorial-analytics-clickhouse/quickstart/ The following steps will deploy the solution accelerator using Docker. ## Step 0: Prerequisites 1. Open a terminal 2. Install **Docker** and **Docker Compose**. You can run the following commands to check if it's already installed. If not installed, you can install Docker / Docker Compose by following these [instructions](https://docs.docker.com/compose/install/). ```bash docker --version docker-compose --version ``` 3. [Clone the project](https://github.com/snowplow-industry-solutions/clickhouse-realtime-editorial-analytics) and navigate to its directory ```bash git clone https://github.com/snowplow-industry-solutions/clickhouse-realtime-editorial-analytics.git ``` 4. Run the SQL query in `./clickhouse-queries/create-table-query.sql` within ClickHouse's SQL console. This table will store the Snowplow events. You'll need a `CLICKHOUSE_DATABASE` value in the next step. It's labeled in this image: ![ClickHouse Cloud SQL console home screen with the left navigation showing the Snowplow database and its tables, with the CLICKHOUSE\_DATABASE value highlighted in the breadcrumb](/assets/images/clickhouse-database-name-2a822d0a496e1fea4f1213efcc53f5d2.png) 5. Create a `.env` file by copying `.env.example`. To populate the values, go to your ClickHouse account and select the [HTTPS connection](https://clickhouse.com/docs/getting-started/quick-start/cloud#connect-with-your-app) method. This will display a sample `curl` command containing your connection details. Map each value to the correct environment variable: | Variable | Where to find it | | --------------------- | --------------------------------------------------------------------------------------------------------- | | `CLICKHOUSE_HOST` | Line 3 of the sample `curl` command. Format: `https://<your-clickhouse-host>.aws.clickhouse.cloud:<port>` | | `CLICKHOUSE_USER` | The **Username** field | | `CLICKHOUSE_KEY` | The **Password** field | | `CLICKHOUSE_DATABASE` | The database name from the SQL Console where the SQL query was run. This is typically `default` | | `CLICKHOUSE_TABLE` | Always set this to `snowplow_article_interactions` | ![ClickHouse Connect dialog with Username and Password fields annotated as CLICKHOUSE\_USER and CLICKHOUSE\_KEY, and a curl command with line 3 annotated as CLICKHOUSE\_HOST](/assets/images/clickhouse-credentials-2d4755d109c7b620538c6c1748f3a809.png) ## Step 1: Start the containers Run the following command to download and run everything in Docker: ```bash docker-compose up -d ``` The [architecture](/tutorials/realtime-editorial-analytics-clickhouse/introduction/#architecture) section on the previous page has the details on everything that's installed. ## Step 2: Open the web tracking front-end Wait for about 30 seconds for the website container to start. Once it's ready, visit [`http://localhost:3000`](http://localhost:3000) to view the website application and start tracking events. ![The Daily Query demo website homepage showing a featured article about AI in journalism and a grid of new articles below](/assets/images/homepage-433833d669609a13cef1c2ba91ea3dd9.png) 2.1 Click on any of the articles that are on the homepage. Scroll down on the new page which opens. Wait for about 10 seconds to simulate a user reading. 2.2 Click on the advertisement which appears on the right-hand sidebar. Return to the homepage by clicking the **The Daily Query** logo in the header. ![The Daily Query demo website article page with a Professional Development Courses advertisement in the right sidebar](/assets/images/advertisement-ee4a00ab1bd585c370c43c86ddeecd03.png) 2.3 Select a different article from the homepage. Scroll down on the new page which opens. Wait for about 10 seconds to simulate a user reading. 2.4 Click on the advertisement which appears on the right-hand sidebar as you did in Step 2.2. Return to the homepage by clicking the **The Daily Query** logo in the header. ## Step 3: Open the Snowplow Micro front-end Open Snowplow Micro on [`http://localhost:9090/micro/ui`](http://localhost:9090/micro/ui) in a separate window. Press the **Refresh** button located in the header. This will display the current Snowplow events which are being tracked (e.g. `page_view`, `page_ping`, `article_interaction`, `ad_interaction`). You can use the **Pick Columns** button to select certain dimensions. Try selecting the following: - `event_name` from the "Events" section - `com_demo_ad_interaction_1.type` from the "Events" section - `com_demo_media_article_interaction_1.type` from the "Events" section - `com_demo_media_article_1.title` from the "Entities" section ## Step 4: Query the data in ClickHouse Console Run the following query in ClickHouse's SQL Console. You should see events landing in real-time within the ClickHouse table. ```sql select * from snowplow_article_interactions order by dvce_created_tstamp desc ``` ![ClickHouse SQL console showing query results from the snowplow\_article\_interactions table, with rows of event data ordered by creation timestamp](/assets/images/clickhouse-results-489eb6997e47976b047ec59398d787ae.png) ## Step 5: View the editorial analytics data in a sample real-time dashboard Visit the real-time editorial analytics dashboard at [`localhost:3000/dashboard`](http://localhost:3000/dashboard), which is querying data from ClickHouse. Press the **Load Data** button to see the article engagement and ad performance metrics for the last 30 minutes. ![The Daily Query Analytics Dashboard showing Trending Articles, Trending Categories, and Ad Performance sections, each with metrics for the last 30 minutes including impressions, views, scroll depth, and engaged time](/assets/images/realtime-dashboard-f03b1b16e3bd3aed3b6f41938004109b.png) If you're interested in the queries powering these insights, take a look at the code here: - [Trending Articles report](https://github.com/snowplow-industry-solutions/clickhouse-realtime-editorial-analytics/blob/main/website/app/api/dashboard/route.ts#L52) - [Trending Categories report](https://github.com/snowplow-industry-solutions/clickhouse-realtime-editorial-analytics/blob/main/website/app/api/dashboard/route.ts#L122) - [Ad Performance](https://github.com/snowplow-industry-solutions/clickhouse-realtime-editorial-analytics/blob/main/website/app/api/dashboard/route.ts#L204) ## Step 6: Generate more insights Try selecting different news articles, or clicking on different displayed ads. Repeat Step 4 or Step 5 and the data will refresh in real-time. ## Clean up and delete Shut down and delete all running containers: ```bash docker-compose down ``` > **Tip:** There will still be data in your ClickHouse Cloud account. If you want to delete the generated data, run the following command in ClickHouse's SQL Console: > > ```sql > DROP TABLE snowplow_article_interactions > ``` --- # Conclusions and next steps from the Signals batch engine tutorial > Complete the Snowplow Signals batch engine tutorial for warehouse-based attribute calculation. Learn how to use dbt projects to compute and sync historical attributes to the Profiles Store. > Source: https://docs.snowplow.io/tutorials/signals-batch-engine/conclusion/ In this tutorial you've learned how to calculate attributes from your warehouse data, and apply them to Signals. This is the process workflow: - Define batch attribute group configurations and apply them to Signals - Initialize dbt projects - Generate models - Configure the projects with dbt - Create tables and snapshot by running dbt - Connect the snapshot to Signals by syncing Supported warehouses: - Snowflake - BigQuery ## Next steps Learn more about the Signals possibilities in our other Signals tutorials: - [Signals Quick Start](/tutorials/signals-quickstart/start/) - [Score prospects in real time using Signals and ML](/tutorials/signals-ml-prospect-scoring/intro/) --- # Generate dbt data models using the batch engine CLI > Generate dbt SQL models for Snowplow Signals batch attribute processing including base models, filtered events, daily aggregates, and final attribute definitions. > Source: https://docs.snowplow.io/tutorials/signals-batch-engine/generate-models/ Each project will have its own set of models generated based on its specific schema and requirements. For each project, the generation process will: 1. Create dbt configuration files 2. Generate SQL models based on the batch attribute group's schema 3. Set up necessary macros and functions 4. Update any existing files if needed For each batch attribute group, the generated models are specifically designed for batch processing: - Base models: raw data transformations - Filtered events: event filtering and cleaning - Daily aggregates: time-based aggregations - Attributes: final feature definitions ## Run generate command Depending on how you initialized your projects, you can generate models in two ways. If you created projects for all attribute groups, you can generate models for all of them at once: ```bash # For all attribute groups snowplow-batch-engine generate --verbose ``` To generate models for a specific project: ```bash snowplow-batch-engine generate \ --project-name "user_attributes_1" \ --target-type snowflake \ --verbose ``` Adjust the target-type to `bigquery`, if relevant. Remember that project names follow the format `{attribute_group_name}_{attribute_group_version}`. ## Project structure After generation, each project in your repository will have this expanded structure: ```text my_snowplow_repo/ ├── user_attributes_1/ │ ├── dbt_project.yml # Main dbt configuration │ ├── packages.yml # Dependencies configuration │ ├── models/ # SQL models │ │ ├── base/ # Base models │ │ ├── filtered_events/ # Event filtering │ │ ├── daily_aggregates/ # Aggregated data │ │ └── attributes/ # Feature definitions │ ├── configs/ # Configuration files │ │ ├── base_config.json │ │ └── dbt_config.json │ │ └── batch_source_config.json │ └── macros/ # Reusable SQL functions ├── product_attribute_groups_2/ │ └── ... (same structure) └── user_segments_1/ └── ... (same structure) ``` ## Troubleshooting If you encounter any issues during generation: 1. Check that your projects were properly initialized in the correct path 2. Review the `base_config.json` file in each project for configuration issues 3. Check that your API credentials have the necessary permissions 4. Use the `--verbose` flag for more detailed error messages --- # Create attribute group projects using the batch engine CLI > Initialize separate dbt projects for each Snowplow Signals batch attribute group with automated folder structure and base configuration files. > Source: https://docs.snowplow.io/tutorials/signals-batch-engine/initialize-project/ Having tested the connection, you can now initialize your projects. When you run the initialization command, the CLI will: 1. Create a separate project directory for each relevant attribute group 2. Set up the basic configuration files for each project 3. Initialize the necessary folder structure for each project 4. Prepare each project for model generation ## Run initialize You can generate projects for all the relevant attribute groups in Signals at once, or one at a time. Change your target-type to `bigquery` if relevant. ```bash # For all attribute groups snowplow-batch-engine init \ --target-type snowflake \ --verbose # For a specific attribute group snowplow-batch-engine init \ --attribute-group-name "user_attributes" \ --attribute-group-version 1 \ --target-type snowflake \ --verbose ``` Each attribute group will have its own separate dbt project, with the project name following the format `{attribute_group_name}_{attribute_group_version}`. The files will be generated at the path specified in your `SNOWPLOW_REPO_PATH` environment variable. ## Project structure After initialization, your repository will have a structure like this: ```text my_repo/ ├── my_attribute_group_1/ │ └── configs/ │ └── base_config.json ├── etc. ``` In this example, projects were generated for three attribute groups: `user_attributes` v1, `product_attribute_groups` v2, and `user_segments` v3: ```text my_snowplow_repo/ ├── user_attributes_1/ │ └── configs/ │ └── base_config.json ├── product_attribute_groups_2/ │ └── configs/ │ └── base_config.json └── user_segments_1/ └── configs/ └── base_config.json ``` ## Troubleshooting If you run into any issues during initialization: 1. Check that you have write permissions in the target directory 2. Check that you don't already have a project with the same name as one you're trying to initialize 3. Check that your API credentials have the necessary permissions 4. Use the `--verbose` flag to get more detailed error messages --- # Install the Signals Python SDK with batch engine CLI > Install the Snowplow Signals batch engine CLI tool with Python pip, and configure environment variables for API authentication and warehouse attribute processing. > Source: https://docs.snowplow.io/tutorials/signals-batch-engine/install/ Choose where to generate your Signals dbt models. We recommend creating a new repository. Navigate into your repo, and check you're in the intended Python environment. The batch engine is part of the Signals Python SDK. It's not installed by default, as not all use cases will need it. To install it, run the following command: ```bash pip install 'snowplow-signals[batch-engine]' ``` This will install the CLI tool as `snowplow-batch-engine`, along with the necessary dependencies. ## Available commands The available options are: ```bash init # Initialize dbt project structure and base configuration generate # Generate dbt project assets sync # Registers the attribute table as a data source with Signals test_connection # Test the connection to the authentication and API services ``` A `--verbose` flag is available for every command. ## Set up environment variables To make your workflow smoother, set up your Signals credentials as environment variables. This way, you won't need to type them in every command: ```bash export SNOWPLOW_API_URL="YOUR_API_URL" export SNOWPLOW_API_KEY="YOUR_API_KEY" export SNOWPLOW_API_KEY_ID="YOUR_API_KEY_ID" export SNOWPLOW_ORG_ID="YOUR_ORG_ID" export SNOWPLOW_REPO_PATH="./my_snowplow_repo" # Path to the repo where you want to generate projects ``` The variables must have these exact names. --- # Run and test your new attribute dbt models > Configure dbt connection profiles, run models with full refresh to create attribute tables in Snowflake or BigQuery, and validate data quality. > Source: https://docs.snowplow.io/tutorials/signals-batch-engine/run-dbt/ Now that your models are generated, it's time to run them and verify that everything works as expected. This step allows you to test your models locally before moving them to production. During the run process: - dbt will compile your SQL models - Tables will be created in your data warehouse - You'll see progress updates in the terminal - Any errors will be clearly displayed Best practice to ensure successful model runs: - Always test your models after generation - Review the generated SQL for accuracy - Document any custom modifications you make - Keep track of model versions - Regularly update your models as your data evolves ## Configure dbt Before running your new models, you'll need to configure their dbt connection profile. Read more about this in the [dbt documentation](https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles). The batch engine doesn't generate a `profiles.yml` because it isn't best practice to store credentials in the same place as models. ## Run the models After configuring dbt, you can run your models locally using the `dbt run` command. For your first run, you'll want to do a full refresh to ensure all tables are created properly: ```bash dbt run --full-refresh ``` ![dbt first run](/assets/images/dbt_first_run-22571b35554de4e00d266894a4285914.png) For later runs, you can use the standard command: ```bash dbt run ``` Running the models will create tables of your newly calculated attributes. ## Test and validate It's important to test your models before moving them to production. The local testing phase is your opportunity to: - Verify that the generated models meet your requirements - Make any necessary adjustments to the models - Explore the data transformations - Ensure data quality and accuracy Follow your standard dbt testing process. ## Make sure all your data is processed The first time the model is run, not all your data might get processed. Processing is dictated by the variable `snowplow__backfill_limit_days`. You could increase this to a larger number depending on the data volume and how much data you need to backfill. You might be able to process everything in one run, or after a few runs. ## Run dbt snapshot Once your data is fully backfilled, you are ready to run `dbt snapshot`. The attributes table captures the latest values for each attribute key in a drop and recompute fashion. Once you start running the dbt snapshot linked to this table, the sync engine will only process changes to specific attribute keys. It will only sync attribute keys where at least one attribute value changed since the last time data was sent to the profile store. This optimized syncing saves processing cost. Once you start syncing, you'll need to incorporate dbt snapshot runs after each time you run your dbt models. More on this in the next step. --- # Learn how to set up the Signals batch engine for warehouse attributes > Set up the Snowplow Signals batch engine to calculate historical behavioral data attributes from warehouse data using dbt. Generate modeled datasets and sync attributes to the Profiles Store for real-time access. > Source: https://docs.snowplow.io/tutorials/signals-batch-engine/start/ Welcome to the [Snowplow Signals](/docs/signals/introduction/) batch engine tutorial. Snowplow Signals is a real-time personalization engine for customer intelligence, built on Snowplow's behavioral data pipeline. It allows you to compute, access, and act on in-session stream and historical user data, in real time. The Signals batch engine is a CLI tool to create the attributes in your warehouse to compute over larger historical data that otherwise would not be possible / efficient to do so in real time. It isn't required to use Signals: it's only necessary if you want to: - Calculate attributes from Snowplow events in your warehouse - Sync those attributes to the Profiles Store so they can be served in real time alongside stream attributes The batch engine helps by: - Generating separate dbt projects for each batch attribute group definition - Building efficient modeled datasets at different aggregation levels, instead of querying directly against large atomic event tables - Producing attributes tables optimized for downstream use - Syncing the calculated attributes to Signals, making them available for production use To use tables of pre-existing, already calculated values, read up on external batch sources in the [Signals documentation](/docs/signals/concepts/). This guide will walk you through the steps to set up the batch engine and calculate attributes. ## Prerequisites This tutorial assumes that you have: - Python 3.11+ installed in your environment - Snowflake or BigQuery warehouse with your atomic Snowplow events ready to use as the data source - [dbt](https://www.getdbt.com/) with your warehouse [target](https://docs.getdbt.com/reference/dbt-jinja-functions/target) set up - Basic [dbt](https://www.getdbt.com/) knowledge - Valid API credentials for your Signals account: - Signals API URL - Snowplow API key - Snowplow API key ID - Snowplow organization ID - Batch attribute groups already created for Signals, but not yet published The batch source configuration can't be done before the attributes table has been created. Check out the [Signals configuration](/docs/signals/introduction/) documentation to find out where to find these credentials, and how to apply attribute configurations. --- # Sync the attribute tables with the Signals Profiles Store > Register warehouse attribute tables as batch sources in Signals to enable hourly syncing to the Profiles Store for real-time access in applications. > Source: https://docs.snowplow.io/tutorials/signals-batch-engine/sync-tables/ Syncing is the process of making your calculated attributes available in Signals for production use. After the attributes table is ready for production use, you should run `dbt snapshot`. The sync engine will use this table to understand which records have been changed since the last sync. In order for it to work you need to let the sync engine know about the location of your dbt snapshot. There are two steps to enable syncing: 1. Fill out the `batch_source_config.json` file for each dbt project (one project per attribute group) 2. Run the `sync` command ## Update configuration file During data model generation, a config file is generated in `config/batch_source_config.json`. It will have a similar structure to this: ```yml { "database": "", # Add your database name "wh_schema": "", # Add your schema "table": "user_attributes_1_attributes_snapshot", "name": "user_attributes_1_attributes", "timestamp_field": "dbt_valid_from", "description": "Table containing attributes for user_attributes_1 attribute group", "tags": {}, "owner": "" } ``` Fill out the `database` (for BigQuery this should be the project) and `wh_schema` values as per your [dbt target](https://docs.getdbt.com/reference/dbt-jinja-functions/target) setup. The warehouse schema should be the `schema` defined in your dbt target, suffixed with `_derived` (`{target_schema}_derived`). This is where the generated attributes table and snapshot are located by default. ## Run the sync command Adjust the following CLI command according your use case (e.g. Change the target type to `bigquery`) and run it: ```bash snowplow-batch-engine sync \ --attribute-group-name "user_attributes" \ --attribute-group-version 1 \ --target-type snowflake \ --verbose ``` The batch engine will first register the batch source for the attribute group. It will also publish the attribute group so that syncing can begin. Signals will check for updates to the table every hour. Changes will be captured in the next sync if both the dbt run and the dbt snapshot update have finished. Your attributes will then become available to retrieve in your applications. ```bash # Progress messages ✅ Successfully added Batch Source information to attribute group user_attributes_1 ✅ Successfully registered table user_attributes_1_attributes ``` --- # Test the Signals batch engine connections > Verify connectivity to Snowplow Signals API and check database, cache, and storage service health before initializing dbt projects. > Source: https://docs.snowplow.io/tutorials/signals-batch-engine/test-connection/ The first step is to confirm that you can connect to all the necessary services. The connection test checks several important components: - Verifies your API credentials - Ensures the main Signals API service is accessible - Checks the status of: - Database connections - Cache service - Storage systems Test your connection using the following command: ```bash snowplow-batch-engine test-connection --verbose ``` If you didn't set up environment variables, you can also provide the credentials as command-line flags: ```bash snowplow-batch-engine test-connection \ --api-url "YOUR_API_URL" \ --api-key "YOUR_API_KEY" \ --api-key-id "YOUR_API_KEY_ID" \ --org-id "YOUR_ORG_ID" \ --verbose ``` When everything is working correctly, you'll see a clear success message: ```bash 🔐 Testing authentication service... ✅ Authentication service is healthy 🌐 Testing API service... ✅ API service is healthy 📊 Dependencies status: ✅ database: ok ✅ cache: ok ✅ storage: ok ✨ All services are operational! ``` You can continue to the next step. ## Troubleshooting If you encounter any problems: 1. Double-check your API credentials 2. Verify your network connection 3. Ensure your API key has the required permissions 4. Use the `--verbose` flag for detailed error messages 5. Check if your organization's services are up and running --- # Conclusions and next steps for the Signals interventions tutorial > Complete the Signals Sandbox tutorial and explore next steps for real-time personalization. > Source: https://docs.snowplow.io/tutorials/signals-interventions/conclusion/ Congratulations! You've successfully completed the Signals Sandbox tutorial and experienced real-time personalization in action. In this tutorial, you: - Deployed a Signals Sandbox instance - Used the Signals Python SDK to programmatically define attributes - Created a service to expose calculated attributes - Defined rule-based interventions for common ecommerce scenarios - Tested your configuration with an interactive demo application - Saw real-time personalization triggered by actual user behavior You've worked with the core Signals concepts: - **[Attributes](/docs/signals/attributes/attributes/)**: real-time calculations of user behavior patterns - **[Attribute groups](/docs/signals/attributes/attribute-groups/)**: organized collections of related attributes - **[Services](/docs/signals/attributes/services/)**: interfaces for applications to retrieve attributes - **[Interventions](/docs/signals/interventions/)**: rules that trigger personalized experiences Thank you for trying Signals. We hope this tutorial has inspired ideas for how you can use real-time behavioral data to create personalized experiences for your users. ## Next steps Now that you understand how Signals works, here are some ways to continue your journey: ### Explore more Signals capabilities - Try defining attributes with different [aggregations](/docs/signals/attributes/attributes/#aggregation-options) (min, max, average, etc.) - Experiment with [time windows](/docs/signals/attributes/attributes/#time-period) for attributes - Create more complex intervention criteria using multiple attributes - Define attributes based on different [event types](/docs/fundamentals/events/) ### Try other tutorials These tutorials require a [Snowplow account](/docs/get-started/private-managed-cloud/) with Signals configured: - [Set up Signals for real-time calculation](/tutorials/signals-quickstart/start/) using the Snowplow Console UI - [Score prospects in real time using Signals and ML](/tutorials/signals-ml-prospect-scoring/intro/) for machine learning integration - [Set up the Signals batch engine](/tutorials/signals-batch-engine/start/) for warehouse-based attribute calculation ### Move to production Ready to use Signals in production? You'll need: - A [Snowplow](/docs/get-started/private-managed-cloud/) account - A [Signals connection](/docs/signals/connection/) configured - Integration of the [browser tracker plugin](/docs/signals/interventions/subscribe/#using-the-browser-tracker-plugin) or API calls in your application ### Learn more - Read the full [Signals documentation](/docs/signals/introduction/) - Explore the [Signals Python SDK reference](https://pypi.org/project/snowplow-signals/) - Join the [Snowplow community](https://community.snowplow.io/) to discuss use cases and best practices ## Sandbox limitations Remember that the Signals Sandbox is designed for exploration and learning: - Sandbox instances are temporary and will be deleted after some period - Data is not persisted long-term - Performance and rate limits apply - For production use cases, use Snowplow CDI with Signals --- # Define attributes and service configuration using the Signals Python SDK > Use the Signals Python SDK to programmatically define real-time ecommerce attributes. > Source: https://docs.snowplow.io/tutorials/signals-interventions/define-attributes-python/ In this section, you'll define [attributes](/docs/signals/concepts/#attribute-groups) that calculate real-time user behavior metrics from ecommerce events. These attributes will track product views, cart additions, and cart value. You'll create three attributes to track user shopping behavior: - **`count_product_views`** counts the number of product view events - **`count_add_to_cart`** counts the number of add-to-cart events - **`total_cart_value`** sums the prices of items added to the cart These attributes use the [Snowplow ecommerce plugin](/docs/sources/web-trackers/tracking-events/ecommerce/), which tracks standardized ecommerce events. ## Import required classes Start by importing the classes you'll need from the [Signals Python SDK](https://pypi.org/project/snowplow-signals/): ```python from snowplow_signals import ( Attribute, Event, Criterion, Criteria, EventProperty, EntityProperty, StreamAttributeGroup, Service, domain_userid, ) ``` ## Define the attributes Each [attribute](/docs/signals/attributes/using-python-sdk/attribute-groups/attributes/) defines which event it will be calculated from, and what kind of aggregation will be performed. ### Product views counter This attribute counts product view events: ```python count_product_views = Attribute( name="count_product_views", type="int32", events=[Event(name="snowplow_ecommerce_action")], criteria=Criteria( all=[ Criterion.eq( property=EventProperty( vendor="com.snowplowanalytics.snowplow.ecommerce", name="snowplow_ecommerce_action", major_version=1, path="type", ), value="product_view", ) ] ), aggregation="counter", ) ``` This attribute: - Uses the `snowplow_ecommerce_action` [event](/docs/fundamentals/events/) - Filters for events where the action type is `product_view` - Counts matching events using the `counter` aggregation ### Add to cart counter This attribute counts when users add items to their cart: ```python count_add_to_cart = Attribute( name="count_add_to_cart", type="int32", events=[Event(name="snowplow_ecommerce_action")], criteria=Criteria( all=[ Criterion.eq( property=EventProperty( vendor="com.snowplowanalytics.snowplow.ecommerce", name="snowplow_ecommerce_action", major_version=1, path="type", ), value="add_to_cart", ) ] ), aggregation="counter", ) ``` ### Total cart value This attribute sums the prices of products added to the cart: ```python total_cart_value = Attribute( name="total_cart_value", type="double", events=[Event(name="snowplow_ecommerce_action")], criteria=Criteria( all=[ Criterion.eq( property=EventProperty( vendor="com.snowplowanalytics.snowplow.ecommerce", name="snowplow_ecommerce_action", major_version=1, path="type", ), value="add_to_cart", ) ] ), property=EntityProperty( vendor="com.snowplowanalytics.snowplow.ecommerce", name="product", major_version=1, path="price", ), aggregation="sum", ) ``` This attribute: - Filters for `add_to_cart` events - Extracts the `price` field from the `product` [entity](/docs/fundamentals/entities/) - Sums the prices using the `sum` aggregation ## Create an attribute group [Attribute groups](/docs/signals/concepts/#attribute-groups) organize related attributes together. They can be considered as "tables" of attributes. Create a group to hold your ecommerce attributes: ```python attribute_group = StreamAttributeGroup( name="ecom_attributes", version=1, attribute_key=domain_userid, attributes=[ count_product_views, count_add_to_cart, total_cart_value, ], owner="user@company.com", ) ``` The `attribute_key` parameter specifies the [user identifier](/docs/signals/concepts/#attribute-keys) that the attributes are grouped by. In this case, `domain_userid` means the attributes track behavior for each anonymous user. This is a `StreamAttributeGroup`, because Signals will process events from the real-time event stream. > **Note:** Attribute groups are immutable and versioned. If you need to modify attributes, create a new version of the group. ## Create a service [Services](/docs/signals/concepts/#services) provide an interface for applications to retrieve attributes. Create a service that includes your attribute group: ```python stream_service = Service( name="ecom_attributes", attribute_groups=[attribute_group], owner="user@company.com", ) ``` ## Publish to Signals Now publish both the attribute group and service to your Signals instance: ```python sp_signals.publish([attribute_group, stream_service]) ``` Once published, Signals begins calculating these attributes in real time as events arrive at the Collector. > **Tip:** You can retrieve attribute values for a specific user by calling: > > ```python > stream_service.get_attributes( > signals=sp_signals, > attribute_key="domain_userid", > identifier="your-domain-userid-here", # the value will be a UUID > ) > ``` Your attributes are now live and ready to power personalization experiences. Next, you'll define interventions that trigger based on these attribute values. --- # Define real-time ecommerce intervention triggers > Create rule-based interventions that trigger personalized experiences based on user behavior. > Source: https://docs.snowplow.io/tutorials/signals-interventions/define-interventions/ [Interventions](/docs/signals/concepts/#interventions) are rules that trigger when user attributes meet specific conditions. They enable real-time personalization by activating experiences based on current behavior. When an intervention triggers: 1. Signals evaluates attribute values in real time 2. If the criteria are met, the intervention activates 3. Your Web application can receive intervention notifications via the [browser tracker plugin](/docs/signals/interventions/subscribe/#using-the-browser-tracker-plugin) directly on the frontend - This uses a Server-Sent Events (SSE) connection for real-time streaming In this section, you'll create three interventions for common ecommerce scenarios: cart abandonment, discount offers for engaged browsers, and free shipping promotions. ## Import intervention classes Add the intervention-related imports to your notebook: ```python from snowplow_signals import ( RuleIntervention, InterventionCriterion, LinkAttributeKey, ) ``` ## Define intervention rules These [rule-based interventions](/docs/signals/interventions/using-python-sdk/) will be triggered automatically when their criteria are met. Each `InterventionCriterion` specifies: - `attribute`: the attribute to evaluate, in the format `group_name:attribute_name` - `operator`: the comparison operator (`>`, `<`, `>=`, `<=`, `==`, `!=`) - `value`: the threshold value for comparison ### Cart abandonment intervention This intervention triggers when a user has added items to their cart but hasn't completed checkout: ```python cart_abandonment_intervention = RuleIntervention( name="cart_abandonment", description="Show banner to users who have added at least one item to cart.", criteria=InterventionCriterion( attribute="ecom_attributes:count_add_to_cart", operator=">", value=0, ), target_attribute_keys=[ LinkAttributeKey(name="user_id"), LinkAttributeKey(name="domain_userid"), ], owner="user@company.com", ) ``` Key components: - **`criteria`**: triggers when `count_add_to_cart > 0`. - **`target_attribute_keys`**: specifies which user identifiers to use when matching interventions. This intervention targets both the authenticated `user_id` and anonymous `domain_userid` identifiers, to reach as many users as possible. ### Discount intervention This intervention offers a discount to users who are actively browsing products: ```python discount_intervention = RuleIntervention( name="discount", description="Show banner to users who have viewed more than 3 products.", criteria=InterventionCriterion( attribute="ecom_attributes:count_product_views", operator=">", value=3, ), target_attribute_keys=[ LinkAttributeKey(name="user_id"), LinkAttributeKey(name="domain_userid"), ], owner="user@company.com", ) ``` This intervention activates after a user views more than three products, indicating high purchase intent. ### Free shipping intervention This intervention encourages larger purchases by offering free shipping: ```python free_shipping_intervention = RuleIntervention( name="free_shipping", description="Show banner to users who plan to spend at least $100.", criteria=InterventionCriterion( attribute="ecom_attributes:total_cart_value", operator=">", value=100, ), target_attribute_keys=[ LinkAttributeKey(name="user_id"), LinkAttributeKey(name="domain_userid"), ], owner="user@company.com", ) ``` This triggers when the cart value exceeds $100, incentivizing users to complete their purchase. ## Publish the interventions Publish all three interventions to your Signals Sandbox: ```python sp_signals.publish( [ cart_abandonment_intervention, discount_intervention, free_shipping_intervention, ] ) ``` Once published, these interventions are active and will trigger in real time as users interact with your application. In the next section, you'll use the demo e-shop application to see these interventions in action. The app uses the [browser tracker plugin](/docs/signals/interventions/subscribe/#using-the-browser-tracker-plugin) to automatically receive and display intervention banners when they trigger. --- # Connect to Signals within a notebook using the Signals Python SDK > Enable Signals through Snowplow Console or deploy a Signals Sandbox instance and obtain your access credentials. > Source: https://docs.snowplow.io/tutorials/signals-interventions/setup/ The first step is to set up your Signals connection. Follow the instructions in the [Signals connection documentation](/docs/signals/connection/) for your chosen deployment method: - **Snowplow Console**: [enable Signals through Snowplow Console](/docs/signals/connection/#snowplow-console) if you have a Snowplow account - **Signals Sandbox**: [deploy a Sandbox instance](/docs/signals/connection/#signals-sandbox) to experiment without a Snowplow account Once your connection is set up, gather the required credentials as described in the [connection credentials section](/docs/signals/connection/#connection-credentials). ## Set up your Jupyter notebook environment You can use either Google Colab or a local Jupyter notebook environment for this tutorial. If you want to skip ahead, you can make use of the following notebook that contains all the Python code you'll need for this tutorial: 1. [Python notebook on Google Colab](https://colab.research.google.com/github/snowplow-incubator/signals-interventions-demo/blob/main/attributes_and_interventions.ipynb) 2. [Jupyter notebook on GitHub](https://github.com/snowplow-incubator/signals-interventions-demo/blob/main/attributes_and_interventions.ipynb) ### Using Google Colab You can use the provided notebook, or create your own. To create a new notebook, open a new notebook at [Google Colab](https://colab.research.google.com/). You'll need to add credentials as Colab secrets. Click the key icon in the left sidebar, and add the required secrets: **Snowplow Console:** - `SP_API_URL`: your Signals API URL - `SP_API_KEY`: your API key - `SP_API_KEY_ID`: your API key ID - `SP_ORG_ID`: your Snowplow Console organization ID **Signals Sandbox:** - `SP_API_URL`: your Profiles API URL - `SP_SANDBOX_TOKEN`: your Sandbox Token *** When you run the notebook, it will ask for access to the secrets. Choose to grant access. If you're using your own notebook, follow these steps: 1. Install the Signals Python SDK: ```python %pip install snowplow-signals ``` 2. Load your credentials in the notebook: **Snowplow Console:** ```python from google.colab import userdata import os os.environ["SP_API_URL"] = userdata.get('SP_API_URL') os.environ["SP_API_KEY"] = userdata.get('SP_API_KEY') os.environ["SP_API_KEY_ID"] = userdata.get('SP_API_KEY_ID') os.environ["SP_ORG_ID"] = userdata.get('SP_ORG_ID') ``` **Signals Sandbox:** ```python from google.colab import userdata import os os.environ["SP_API_URL"] = userdata.get('SP_API_URL') os.environ["SP_SANDBOX_TOKEN"] = userdata.get('SP_SANDBOX_TOKEN') ``` *** ### Using local Jupyter Navigate into your working directory and environment, then follow these steps: 1. Install Jupyter and the Signals SDK: ```bash pip install jupyter snowplow-signals python-dotenv ``` 2. Create a `.env` file in your working directory with your credentials: **Snowplow Console:** ```text SP_API_URL=your_signals_api_url SP_API_KEY=your_api_key SP_API_KEY_ID=your_api_key_id SP_ORG_ID=your_organization_id ``` **Signals Sandbox:** ```text SP_API_URL=your_profiles_api_url SP_SANDBOX_TOKEN=your_sandbox_token ``` *** 3. Start Jupyter notebook: ```bash jupyter notebook ``` 4. In your notebook, load the environment variables: ```python from dotenv import load_dotenv load_dotenv() ``` ## Connect to Signals Now you're ready to connect to your Signals instance using the Python SDK. **Snowplow Console:** ```python from snowplow_signals import Signals import os sp_signals = Signals( api_url=os.environ["SP_API_URL"], api_key=os.environ["SP_API_KEY"], api_key_id=os.environ["SP_API_KEY_ID"], org_id=os.environ["SP_ORG_ID"], ) ``` **Signals Sandbox:** ```python from snowplow_signals import SignalsSandbox import os sp_signals = SignalsSandbox( api_url=os.environ["SP_API_URL"], sandbox_token=os.environ["SP_SANDBOX_TOKEN"], ) ``` > **Tip:** The `SignalsSandbox` class is specifically designed for Sandbox environments. For production Snowplow deployments, you would use the `Signals` class instead, which requires API keys. *** You're now ready to start defining attributes and interventions in Signals. --- # Learn how to implement real-time interventions in an ecommerce app using Signals > Get hands-on with Snowplow Signals, using Snowplow Console or the Signals Sandbox trial environment, to create real-time personalization. > Source: https://docs.snowplow.io/tutorials/signals-interventions/start/ Welcome to the **Snowplow Signals** tutorial. [Snowplow Signals](/docs/signals/introduction/) is a real-time personalization engine for customer intelligence, built on Snowplow's behavioral data pipeline. It allows you to compute, access, and act on in-session stream and historical user data, in real time. This tutorial provides a hands-on introduction to Signals. You'll use Python to programmatically define attributes, services, and interventions, then test them with a demo ecommerce application. You can follow this tutorial using either: - **Snowplow Console**: if you have a Snowplow account with Signals enabled - **Signals Sandbox**: a trial environment where you can experiment without needing a Snowplow account This tutorial should take approximately 20-30 minutes to complete. ## What you'll learn In this tutorial, you will: - Set up your Signals connection (via Console or Sandbox) - Use the Signals Python SDK to define attributes, services, and interventions - Calculate real-time user behavior metrics from ecommerce events - Create intervention rules that trigger personalized experiences - Test your Signals configuration with an interactive demo application ## Deployment options ### Snowplow Console If you have a Snowplow account, you can [enable Signals through Snowplow Console](/docs/signals/connection/#snowplow-console). This provides: - Integration with your existing Snowplow data pipeline - Access to your production behavioral data - Full production capabilities and support ### Signals Sandbox [Signals Sandbox](https://try-signals.snowplow.io/) provides a temporary Signals deployment that you can use to explore the platform without needing a Snowplow account. The Sandbox includes: - A dedicated Profiles API endpoint for your attributes and interventions - A Snowplow Collector endpoint for tracking events - Access credentials (Sandbox Token) for authentication - Limited-time access to experiment with Signals capabilities > **Note:** The Signals Sandbox is designed for experimentation and learning. For production use cases, you'll need a [Snowplow](/docs/get-started/private-managed-cloud/) account with [Signals enabled](/docs/signals/connection/). ## Prerequisites This tutorial requires: - Python 3.7 or higher - A Jupyter notebook environment (local or Google Colab) - Basic familiarity with Python - A modern web browser for testing the demo application - Either a Snowplow account with Signals enabled, or access to Signals Sandbox --- # Test your Signals interventions in the demo application > Use the interactive demo e-shop to see your Signals configuration in action. > Source: https://docs.snowplow.io/tutorials/signals-interventions/test-with-demo-app/ Now that you've defined attributes and interventions, it's time to see them work in a real application. We've prepared a demo React application with Snowplow tracking that can integrate with your Signals instance to demonstrate real-time personalization. The demo application features: - **Product catalog**: products fetched from an external API - **Snowplow tracking**: the [JavaScript tracker](/docs/sources/web-trackers/) with the [ecommerce plugin](/docs/sources/web-trackers/tracking-events/ecommerce/) - **Intervention display**: the [browser tracker plugin](/docs/signals/interventions/subscribe/#using-the-browser-tracker-plugin) to receive and show intervention banners - **Logging**: the app prints to the console when it tracks an event, or receives an intervention - **Reset functionality**: ability to clear your user data and start fresh ## Access the demo application The demo application is deployed at: <https://snowplow-incubator.github.io/signals-interventions-demo/> When you first open the app, you'll see a configuration screen. ![Screenshot showing the configuration page of the demo app.](/assets/images/demo-configure-0759b8ea89e72ae691dbe3b1dcbb98ba.png) Enter your credentials from the setup step. **Snowplow Console:** - **Collector URL**: your Snowplow Collector endpoint for the pipeline you enabled Signals on (see the "Pipelines" section in Snowplow Console) - **Profiles API URL**: your Signals API URL (from Snowplow Console > **Signals** > **Overview**) **Signals Sandbox:** - **Collector URL**: your Snowplow Collector endpoint provided by Signals Sandbox - **Profiles API URL**: your Signals Profiles API endpoint provided by Signals Sandbox *** The app will store these values in your browser's local storage, so you won't need to enter them again during this session. Click **Start Shopping** to proceed to the e-shop. ## Test your interventions Follow these steps to trigger each intervention. ### Trigger the discount intervention 1. Browse the product catalog 2. Click on different products to view their details 3. After viewing four products, you should see a banner appear at the top: - **Message**: "10% off your next purchase!" - **Code**: `SAVE10` ### Trigger the cart abandonment intervention 1. Click **Add to Cart** on any product 2. After adding at least one item, you should see: - **Message**: "Don't forget your items in cart!" ### Trigger the free shipping intervention 1. Add multiple items to your cart until the total value exceeds $100 2. You should see: - **Message**: "Free shipping on orders over $100!" - **Code**: `FREE` ![Screenshot showing a triggered intervention in the demo app.](/assets/images/demo-intervention-e9c05f07f0877ce583bd66b4085131c1.png) ## Inspecting events, attributes and interventions using Snowplow Inspector We highly recommend installing the [Snowplow Inspector](/docs/testing/snowplow-inspector/) Chrome extension that lets you validate which Snowplow events are triggering on the website and also check the values of Signals attributes and interventions. You can install the extension from the [Chrome Web Store](https://chromewebstore.google.com/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm). Once you add the extension to Chrome, you can view it by [opening Developer Tools](https://developer.chrome.com/docs/devtools/open/) (usually `Ctrl`+`Shift`+`I` or on Mac `Cmd`+`Option`+`I`), where it has its own tab named 'Snowplow'. Right away, you will see the tracked events as you open a product or add a product to cart. ![Screenshot showing the Snowplow Inspector extension with tracked events from the demo app.](/assets/images/inspector-events-cc8f7737b748821cc26528aa7c0f0be0.png) **Snowplow Console:** To configure the extension to show Signals attributes and interventions: 1. In Chrome, navigate to [the Extensions settings page](chrome://extensions/) 2. Click on "Details" for the Snowplow Inspector extension 3. Scroll down and click on "Extension options" 4. Click on "Add new organization" and enter your Organization ID and API key details. **Signals Sandbox:** To configure the extension to show Signals attributes and interventions: 1. In Chrome, navigate to [the Extensions settings page](chrome://extensions/). 2. Click on "Details" for the Snowplow Inspector extension. 3. Scroll down and click on "Extension options". 4. Open "Advanced options" and enter your Signals Sandbox URL and token. ![Screenshot showing the Snowplow Inspector extension options page with fields to enter Signals API URL and Token.](/assets/images/inspector-advanced-options-372d76b109ed9b6eafc501325134ee88.png) *** Now, in addition to seeing the Snowplow events tracked, you will be able to see the attributes and interventions calculated by Signals after switching the section on the left side of the extension. ![Screenshot showing the Snowplow Inspector extension with Signals attributes displayed.](/assets/images/inspector-attributes-067b8df903f7e2a74c7c8deb6dc37b3a.png) ![Screenshot showing the Snowplow Inspector extension with Signals interventions displayed.](/assets/images/inspector-interventions-6c70a032c7a0aec14f6f79c4521336bf.png) ## Behind the scenes Here's what happens when you interact with the demo app: 1. **Event tracking**: when you view a product or add to cart, the JavaScript tracker sends an ecommerce event to your Sandbox Collector 2. **Attribute calculation**: Signals processes the event and updates your user attributes in real time 3. **Intervention evaluation**: Signals checks if any intervention criteria are now met 4. **Delivery**: if an intervention triggers, Signals sends it to the browser via Server-Sent Events (SSE) 5. **Display**: the browser tracker plugin receives the intervention and your application displays the appropriate banner ## Reset and experiment You can click **Reset User Data** to clear your current user session and start testing again with a fresh profile. This generates a new `domain_userid` so you can re-trigger interventions. The demo app is open source. You can explore the implementation in the [GitHub repository](https://github.com/snowplow-incubator/signals-interventions-demo): - **Tracker setup**: see how the Snowplow tracker and Signals plugin are initialized in `app/src/snowplow.ts` - **Intervention handling**: see how interventions are received and displayed in `app/src/App.tsx` - **Attribute definitions**: review the Python notebook `attributes_and_interventions.ipynb` that you followed in this tutorial You've successfully configured Signals to calculate real-time attributes and trigger personalized interventions based on user behavior. --- # Call the API to see prospect scores in your browser > Deploy the prospect scoring API, and display real-time conversion predictions in the browser console. > Source: https://docs.snowplow.io/tutorials/signals-ml-prospect-scoring/access-api-browser/ The final requirement is to see the prospect scores and predictions in the browser. In this tutorial, we'll call the API every 10 seconds. You need an API endpoint that you can access from your local machine, or from JavaScript in the browser. In the previous step we used TryCloudflare tunnels to expose Colab notebook behind a public HTTPS endpoint. In the server output you will find your public URL similar to this one: ```text ... * Running on https://aaa-bbb-ccc-ddd.trycloudflare.com ``` ## Test with cURL Test the endpoint using `cURL`, passing in your `domain_userid` that you got earlier using the Snowplow Inspector. Update the URL with your TryCloudflare tunnel URL. ```bash curl -X POST \ "https://aaa-bbb-ccc-ddd.trycloudflare.com/predict" \ -H "Content-Type: application/json" \ -d '{"domain_userid": "8e554b10-4fcf-49e9-a0d8-48b6b6458df3"}' ``` You should see an output like this: ```json { "score": 0.35080923483101656, "scoring_attributes": { "num_customers_views": 0, "num_page_views": 0, "num_pricing_views": 0 }, "signals": { "domain_userid": "00000000-1111-2222-3333-444455556666", "num_customers_views": null, "num_page_views": null, "num_pricing_views": null } } ``` ## See scores in the browser Finally, run the code below in your browser console on your website to see your live prospect score. The code retrieves your `domain_userid` directly from the tracker, and calls the intermediary API to get the scores. You may need to update this if your tracker name is different. Check the tracker name in outbound events using the Snowplow Inspector. Run this in your browser console to see your predictions: ```js let api_url = "https://aaa-bbb-ccc-ddd.trycloudflare.com/predict"; // UPDATE THIS let tracker_name = "sp"; // MAYBE UPDATE THIS // Calls the API every 10s from the front-end setInterval(function () { // assuming the Snowplow tracker is available at 'window.snowplow(...)' window.snowplow(function () { // Gets domain_userid from the tracker instance var sp = this[tracker_name]; var domainUserId = sp.getDomainUserId(); // Calls the API fetch(api_url, { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ domain_userid: domainUserId }) }) .then(response => response.json()) .then(result => { console.log("Prediction: ", domainUserId, " - ", result.score); // Acts on prediction if (result.score >= 0.9) console.log('Prospect is likely to convert!'); }) .catch(console.error); }); }, 10 * 1000); ``` Every 10 seconds it will print out the prediction score that you'll convert, as well as the full API response. Adjust the timing interval to call the endpoint APIs as often as needed in your use case. ![](/assets/images/console_output-fea3b67475be2c3cdbc93d115d89dcb5.png) In a real use case, you'd be able to take actions based on these scores and predictions. --- # Conclusions and next steps from the Signals ML prospect scoring accelerator > Complete the Snowplow Signals machine learning prospect scoring tutorial, and explore advanced real-time personalization use cases. > Source: https://docs.snowplow.io/tutorials/signals-ml-prospect-scoring/conclusion/ In this accelerator you've learned how to build a prospect scoring system using Snowplow Signals together with a machine learning model, your own website, and your own Snowplow data. What you achieved: - Used Signals to calculate attributes - Scored prospects using an ML model - Served live prospect scores in the browser This accelerator is a starting point for exploring and using Signals APIs for your own needs and use cases. ![](/assets/images/console_output-fea3b67475be2c3cdbc93d115d89dcb5.png) ## Next steps Here are some ideas for further exploration: - Define more Signals attributes that are specific for your use cases and tracking - Act on the prospect scores and predictions in your application - Use attributes to trigger actions and decisions without the ML step --- # Create an API endpoint to serve the prospect scores in real time > Build an API endpoint to serve Snowplow Signals attributes and ML predictions for real-time prospect scoring. > Source: https://docs.snowplow.io/tutorials/signals-ml-prospect-scoring/create-endpoint/ You now have attribute values in your Profiles Store, and a trained ML model. The next step is to create and deploy an endpoint to serve attributes and predictions. It needs to do the following: - Get calculated attribute values from Signals - Score the values using the model - Return the results This tutorial uses Flask together with TryCloudflare tunnels to expose the API endpoint outside of the Colab notebook environment. ## Install dependencies Run `%pip install flask-cloudflared` to install the required libraries. ## Create ML scoring endpoint Start by loading your model, and defining the methods needed to retrieve and process your Signals data. ```python from flask import Flask, request from flask_cloudflared import run_with_cloudflared # Define and launch Colab API Proxy app = Flask(__name__) run_with_cloudflared(app) # Open Cloudflared demo tunnel # Load model model = joblib.load("model.joblib") # Process the Signals data for individual domain_userids def get_duid_values(duid: str): # Retrieve attributes from Signals response = sp_signals.get_service_attributes( name="prospect_scoring_tutorial_service", attribute_key="domain_userid", identifier=duid, ) signals_df = pd.DataFrame([response]) # Prepare ML dataframe ml_df = signals_df.fillna(0).reindex(columns=x_columns, fill_value=0) return (signals_df, ml_df) # Score the data against the model def get_predictions(df): return float(model.predict_proba(df)[:, 1][0]) @app.route("/predict", methods=['POST']) def predict(): input_dict = request.get_json() # Parse JSON input # Get Signals data and prepare dataframe for scoring signals_df, ml_df = get_duid_values(input_dict['domain_userid']) # Score dataframe using the trained model prediction = get_predictions(ml_df) # Return the result print(f"P: {round(prediction, 4)} - {input_dict}") return { "signals": signals_df.to_dict(orient='records')[0], "scoring_attributes": ml_df.to_dict(orient='records')[0], "score": prediction } app.run() ``` This `/predict` endpoint does four things: 1. Receives a `domain_userid` 2. Calls the Signals API to get the current attribute values, using the `prospect_scoring_tutorial_service` service 3. Scores the attribute values using the ML model 4. Returns Signals attributes, and ML prediction score --- # Define page view attributes for real-time prospect scoring > Define behavioral data attributes in Snowplow Signals for prospect scoring, including page views, sessions, and conversion events. > Source: https://docs.snowplow.io/tutorials/signals-ml-prospect-scoring/define-prospect-attributes/ To use Signals, you need to define which attributes to calculate, and then apply the configuration. Signals will calculate the attributes from your real-time event stream. Let's imagine that for this use case our data science team came up with the following set of attributes. They'll be calculated against the `domain_userid` device attribute key. Choosing this attribute key allows Signals to calculate attributes from events across multiple sessions for each prospect. | Feature name | Description | Type | Aggregation | | --------------------- | ------------------------------------------------------- | ---- | ----------- | | `num_page_views` | Number of `page_view` events | int | `counter` | | `num_pricing_views` | Number of `page_view` events of the `/pricing` page | int | `counter` | | `num_customers_views` | Number of `page_view` events of the `*customers*` pages | int | `counter` | ## Install Signals Python SDK Run `%pip install snowplow-signals`. ## Define attributes Next, define the attributes to calculate. ```python # Imports from snowplow_signals import Attribute, Criteria, Criterion, AtomicProperty, EntityProperty # Define an event sp_page_view = Event( vendor="com.snowplowanalytics.snowplow", name="page_view", version="1-0-0" ) # Define attributes num_page_views = Attribute( name="num_page_views", type="int32", events=[sp_page_view], aggregation="counter" ) num_pricing_views = Attribute( name="num_pricing_views", type="int32", events=[sp_page_view], aggregation="counter", criteria=Criteria( all=[ Criterion.like( AtomicProperty(name="page_url"), "%pricing%" ) ] ) ) num_customers_views = Attribute( name="num_customers_views", type="int32", events=[sp_page_view], aggregation="counter", criteria=Criteria( all=[ Criterion.like( AtomicProperty(name="page_url"), "%customers%" ) ] ) ) ``` Group the attributes into an attribute group with the `domain_userid` device attribute key. ```python from snowplow_signals import StreamAttributeGroup, domain_userid user_attributes_group = StreamAttributeGroup( name='prospect_scoring_tutorial', owner='your_email@example.com', version=1, attribute_key=domain_userid, attributes=[ num_page_views, num_pricing_views, num_customers_views, ], ) ``` Next, define a service for retrieving the calculated attributes. Again, provide your own email address for the `owner` field. ```python from snowplow_signals import Service prospect_scoring_tutorial_service = Service( name='prospect_scoring_tutorial_service', owner='your_email@example.com', attribute_groups=[user_attributes_group], ) ``` **Details** For Signals Customers: test attribute group on your warehouse data If you're using Signals in the Console, you can test the attribute outputs on a subset of recent event data. The `test` command uses the last hour of data from your atomic events table. Here we're restricting the results to events with the application ID `website`. This filtering is optional. ```python sp_signals_test = sp_signals.test( attribute_group=user_attribute_group, app_ids=["website"] ) sp_signals_test ``` The result should look similar to this: ![](/assets/images/signals_test_output-0fd3caaf6492e3f5ea9ef9ed0b940c74.png) ## Deploy configuration to Signals Apply the attribute group and service configurations to Signals. ```python applied = sp_signals.publish([user_attribute_group, prospect_scoring_tutorial_service]) # This should print "2 objects applied" print(f"{len(applied)} objects applied") ``` Signals will start populating your Profiles Store with attributes calculated from your real-time event stream. ## Look at your attributes > **Note:** This section expects that you [integrated](/docs/sources/web-trackers/quick-start-guide/) `sp.js` into a website and have events flowing to the collector. If you're using Signals Sandbox your collector URL for `sp.js` is next to the Signals token. Go to your website, and use the [Snowplow Inspector](/docs/testing/snowplow-inspector/) browser plugin to find your own `domain_userid` in outbound web events. ![](/assets/images/get_domain_userid-8cbe50832395dfe0e5e83cbdb6581064.png) Use your `domain_userid` to retrieve the attributes that Signals has calculated just now from your real-time event stream. ```python sp_signals_result = sp_signals.get_service_attributes( name="prospect_scoring_tutorial_service", attribute_key="domain_userid", identifier="00000000-1111-2222-3333-444455556666", # UPDATE THIS ) sp_signals_result ``` The result should look something like this: ```yaml { 'domain_userid': '00000000-1111-2222-3333-444455556666', 'num_page_views': None, 'num_pricing_views': None, 'num_customers_views': None } ``` --- # Learn how to score prospects in real time using Signals and machine learning > Build a real-time prospect scoring system using Snowplow Signals and machine learning to predict conversion likelihood. > Source: https://docs.snowplow.io/tutorials/signals-ml-prospect-scoring/intro/ Welcome to the [Snowplow Signals](/docs/signals/introduction/) real-time prospect scoring tutorial. Snowplow Signals is a real-time personalization engine for customer intelligence, built on Snowplow's behavioral data pipeline. It allows you to compute, access, and act on in-session stream and historical user data, in real time. This guide will through the process of building a real-time prospect scoring system using Signals together with a machine learning model. You'll learn how to leverage Snowplow event data to predict a prospect's likelihood to convert on your website, and how to trigger personalized engagements. Use this tutorial as a starting point for how to integrate Signals data with any ML use cases or other back-end services. The tutorial uses the Snowplow marketing website as an example. Follow along using your own website and Snowplow data. At the end of the tutorial, you will: - See live prospect attribute updates in the browser console - Score prospects using an ML model - Be ready to use the outputs to drive decisions on your website ## Business case overview Prospects on the Snowplow marketing website visit product and pricing pages, watch videos, etc. The key call to action is a form submission to request a demo. We want to predict if a specific prospect is likely to submit a form in the next hour, given their behavior before the prediction moment. ![](/assets/images/prediction-structure-6110e68ef2574e58b5b22a2ad73c2390.png) For simplicity in this tutorial, we'll score prospects every 10 seconds as they browse the website. Depending on your requirements, you could score prospects at the start of each `page_view`, after certain events, or at some other time interval. We're calculating aggregated attributes based off real-time stream event data, so won't have much user history to work with initially. Over time, as prospects visit and revisit the website, the attributes will become more meaningful. ## Prerequisites 1. Please follow [Connect to Signals](/docs/signals/connection/) page to setup the Signals connection (Console or Sandbox) 1. Signals Sandbox: follow the Sandbox setup flow 2. Signals Customers: follow the Console setup flow 2. Open this [Google Colab](https://colab.research.google.com/github/snowplow-incubator/signals-notebooks/blob/main/web/web_prospect_scoring_end_to_end.ipynb) notebook to follow along 3. \[Optional] Integrate Snowplow into your website. If you use Signals Sandbox - find your collector URL in the UI ## Architecture The system consists of three main blocks: 1. **Your marketing website**: as users browse the website, Snowplow events are sent to the Snowplow Collector 2. **Snowplow Infrastructure**: the Collector captures the events, and Signals calculates aggregated user attributes in real time e.g., `num_page_views`, `num_customers_views` 3. **Intermediary `/predict` API**: an API that calls the Signals API to get the latest attributes, runs an ML model on the Signals output, and sends the response back ![](/assets/images/solution_overview-268d0581d22caed9db29407d4260f603.png) > **Note:** You can replace the ML model with any other back-end system that you'd use to act on live prospects' attributes. --- # Add credentials to the Signals prospect scoring ML notebook > Configure Google Colab notebook with Snowplow Signals credentials for the prospect scoring machine learning tutorial. > Source: https://docs.snowplow.io/tutorials/signals-ml-prospect-scoring/setup/ This tutorial uses this Google [Colab notebook](https://colab.research.google.com/github/snowplow-incubator/signals-notebooks/blob/main/web/web_prospect_scoring_end_to_end.ipynb). For ML training data we will use an artificial set of Snowplow events. Feel free to replace with your warehouse connection to connect to your own dataset. Start by configuring your Signals credentials in the notebook. ![](/assets/images/colab_credentials-75693a10119a3fbb3b31d8567f3c1a7c.jpeg) Here's a list of all the credentials you may need to configure, saved as variables: ```python from google.colab import userdata # Snowplow Signals Sandbox credentils SP_SANDBOX_URL = userdata.get('SP_SANDBOX_URL') # https://{{123abc}}.svc.snplow.net SP_SANDBOX_TOKEN = userdata.get('SP_SANDBOX_TOKEN') # 12345678-0000-1111-2222-123456789012 # Snowplow Signals Console credentials SP_API_URL = userdata.get('SP_API_URL') # Signals API URL SP_API_KEY = userdata.get('SP_API_KEY') # Signals API key SP_API_KEY_ID = userdata.get('SP_API_KEY_ID') # Signals API key ID SP_ORG_ID = userdata.get('SP_ORG_ID') # Snowplow org ID ``` Once you've added the secrets, start working through the tutorial. If you prefer to run the cells in one go with Run all, update your details in the required places first - they're marked with `UPDATE THIS`. --- # Prepare a training dataset and train the ML prospect scoring model > Create and train a machine learning model using historical Snowplow behavioral data to predict prospect conversion. > Source: https://docs.snowplow.io/tutorials/signals-ml-prospect-scoring/train-ml-prospect-scoring/ As prospects browse your website, Signals will calculate the aggregated attributes in real time. We want to score the combination of these attributes using an ML model to better understand if a specific prospect is likely to submit a form. Here's the prediction structure and timeline: ![](/assets/images/prediction-structure-6110e68ef2574e58b5b22a2ad73c2390.png) The next task is to prepare historical data resembling the same Signals features, and train a LogisticRegression model on top. ## Prepare training dataset For this tutorial we will use an artificial set of Snowplow events in a CSV that resemble web traffic. We will connect to it using duckdb and model data using SQL. You can replace this section with your warehouse data, just update the connection block. SQL syntax should translate to major warehouses easily. Here's the SQL: ```sql with -- Change to your events table events as ( select * from read_csv_auto('{csv_filename}') where app_id = 'website' -- filter for your app_ids and domain_userid is not null -- filter to traffic with domain_userids and event_name in ('page_view', 'submit_form') -- filter the events you need -- and derived_tstamp >= dateadd('day', -90, current_date) -- filter by time -- filter out bots, new/returning users, etc ), -- Filter out post-conversion events post_conv as ( select domain_userid, min(derived_tstamp) as first_conv from events where event_name = 'submit_form' group by 1 ), -- Prepare the target column targets_as_of_event as ( -- Target: will this person 'submit_form' in the next 1 hour? select -- identifiers er.event_id, -- target count_if(ef.event_name = 'submit_form') > 0 as target_had_submit_form_next1h, from events er left join events ef on er.domain_userid = ef.domain_userid and er.derived_tstamp < ef.derived_tstamp -- only future events and datediff('second', er.derived_tstamp, ef.derived_tstamp) <= 60 * 60 -- only the next 1h of events group by er.event_id ), -- Prepare training features and dataset final_training as ( select er.domain_userid, er.derived_tstamp, er.event_name, coalesce(count_if(eh.event_name = 'page_view'), 0) as num_page_views, coalesce(count_if(eh.event_name = 'page_view' and eh.page_url like '%pricing%'), 0) as num_pricing_views, coalesce(count_if(eh.event_name = 'page_view' and eh.page_url like '%customers%'), 0) as num_customers_views, coalesce(t.target_had_submit_form_next1h, false) as target_had_submit_form_next1h, from events er left join events eh on er.domain_userid = eh.domain_userid and eh.derived_tstamp < er.derived_tstamp left join targets_as_of_event t on er.event_id = t.event_id left join post_conv p on er.domain_userid = p.domain_userid and er.derived_tstamp >= p.first_conv group by all having not p.domain_userid is not null -- remove post-conversion events ) select * from final_training ``` Next we will connect to the CSV to run the query and retrieve data into a pandas DataFrame: ```python csv_filename = 'https://github.com/snowplow-incubator/signals-notebooks/raw/refs/heads/main/web/sample_events.csv.gz' query = f"""...""" import duckdb import pandas as pd conn = duckdb.connect(database=':memory:') db_df = conn.sql(query).df() conn.close() db_df ``` > **Info:** Adjust the query/connection here if you want to try the query on your own warehouse. duckdb SQL syntax translates easily to all the major warehouses. ### Best practices for atomic events training datasets 1. Check that key behaviors you want to use for predictions are captured in your events, and exclude events not needed for your use case like `application_error`. 2. Choose a subset of `domain_userid`s (or your other entities, depending on what you're using) to include in training. For example, sometimes you may want to exclude bots, new/returning users, specific segments, etc. 3. Choose historical training time periods that allow your `domain_userid`s (or other entities) to reach the target. - Let's say you expect your `domain_userid` to convert in 7 days. Then choose `domain_userid`s who first appeared between 90 and 30 days ago, and set a cutoff of 7 days for them to convert. Don't just choose the last 30 days, as a `domain_userid` who appeared yesterday wouldn't have enough time to reach the cutoff date. - If you're trying to predict churn, give your `domain_userid`s enough time to become "churned". For example, you could make an assumption that anyone who doesn't have events for 30 days is churned. So you'd need to train only on data that happened up until `today - 30d`. 4. Make sure there's no target leakage in your dataset. The most common pitfalls are: - Including events in historical training that happened after the prediction point in the journey. - Including information that wouldn't be available at the time of prediction in training, e.g., accidentally enriching the training dataset with some "as of today" data from the warehouse or CRM. ## Train and evaluate the model In this tutorial, we're training a regular LogisticRegression model on top of numerical features. > **Note:** Use this template to adjust to your own ML needs. Training an ML model for your specific use cases goes far beyond this tutorial. In the next code block we prepare data for training, fit the model, and persist the model binary to use it later in the API. ```python import joblib from sklearn.model_selection import train_test_split from sklearn.linear_model import LogisticRegression # preprocessing x_columns = [ 'num_page_views', 'num_pricing_views', 'num_customers_views', ] y_column = 'target_had_submit_form_next1h' seed = 42 model = LogisticRegression(random_state=seed, class_weight='balanced') # Split, Train, and Evaluate X = db_df[x_columns] y = db_df[y_column] X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=seed, stratify=y) model.fit(X_train, y_train) y_pred = model.predict(X_test) y_prob = model.predict_proba(X_test)[:, 1] # Export joblib.dump(model, "model.joblib") ``` Follow along in the notebook to evaluate your model's performance. ![](/assets/images/model_evaluate-9b22ced5cfe609c9feeb05ac047c0eb2.png) > **Note:** This model is built on an artificial subset of events. In real-life use-cases you'll most likely use a lot more Signals attributes that capture key user's behaviours including geo, useragent, traffic sources, various categorical attributes, and more. --- # Conclusions and next steps from the Signals personalized travel site accelerator > Summary of what you've accomplished and recommended next steps for expanding your Snowplow Signals implementation. > Source: https://docs.snowplow.io/tutorials/signals-personalize-travel/conclusion/ You've successfully built a personalization system using Snowplow Signals that demonstrates the power of real-time behavioral data. Your implementation captures user interactions, processes them into meaningful attributes, and applies these insights to create personalized experiences. You've learned how to transform raw behavioral events into actionable personalization signals. You've seen how micro-segmentation based on user behavior can drive content customization, from changing images and descriptions to reordering information based on user preferences. The integration with an AI chatbot shows how behavioral data can enhance conversational experiences, making them more contextual and relevant. The intervention system you implemented showcases proactive personalization, where the system anticipates user needs and responds accordingly. This represents a shift from reactive to predictive user experiences, where the application adapts to user behavior patterns rather than waiting for explicit requests. Your personalization system now: - Processes behavioral data in real-time - Segments users dynamically based on their interactions - Customizes content presentation based on demonstrated preferences - Provides personalized chatbot responses using behavioral context - Triggers proactive interventions based on user engagement patterns ## Next steps You can expand your personalization capabilities by exploring these opportunities: - **Advanced attribute definitions**: create more sophisticated attributes that combine multiple behavioral signals or include temporal elements - **Multi-session personalization**: extend attributes to persist across sessions, building long-term user preference profiles - **A/B testing integration**: use Signals interventions to deliver different personalization strategies to different user segments - **Cross-channel personalization**: apply the same behavioral insights to personalize email campaigns, push notifications, or other customer touchpoints - **Enhanced chatbot capabilities**: integrate more complex conversational flows that adapt based on user segment - **Performance optimization**: implement caching strategies for frequently accessed attributes and optimize attribute calculations for high-traffic scenarios --- # Define travel preference attributes using the Signals Python SDK > Define behavioral attributes using the Snowplow Signals Python SDK to capture user preferences from website interactions. > Source: https://docs.snowplow.io/tutorials/signals-personalize-travel/defining-attributes/ You'll now define behavioral attributes that capture different types of travel preferences based on how users interact with the website. These attributes will serve as the foundation for both on-site content personalization and AI chatbot customization. The attributes you create will track user engagement with different types of content, destinations, and features. For example, users who frequently view food-related content will have their culinary interest attribute incremented, while those who browse luxury destinations will see their luxury preference attribute increase. ## Open the Jupyter notebook You'll use the Snowplow Signals Python SDK in a Jupyter notebook to define your attributes. You can run the notebook directly using Google Colab [here](https://colab.research.google.com/github/snowplow/documentation/blob/main/static/notebooks/signals-personalize-travel-demo.ipynb), or download it locally. ## Set up your credentials You'll need the same connection credentials you used in the previous step: **CDI:** ```python API_URL = 'example.signals.snowplowanalytics.com' API_KEY = 'YOUR_API_KEY' API_KEY_ID = 'YOUR_API_KEY_ID' ORG_ID = 'YOUR_ORG_ID' ``` **Sandbox:** ```python API_URL = 'you.signals.snowplowanalytics.com' ACCESS_TOKEN = '' ``` *** ## Define attributes Run the notebook cell that defines the attributes. The notebook creates a series of attributes that represent different user preferences and behaviors, including interests in various types of travel experiences like luxury, budget, adventure, and culinary. The attributes are based on these content tags: ```python cultural_explorer_tags = ["culture", "history", "heritage", "ancient", "temples", "art", "traditional"] modern_urbanite_tags = ["urban", "nightlife", "shopping", "modern", "architecture"] tranquil_seeker_tags = ["nature", "peaceful", "wellness", "beaches", "mountains", "river", "wellness"] family_fun_tags = ["family-friendly", "beaches", "nature", "food", "mountains", "culture"] culinary_tourist_tags = ["food", "street food", "multicultural", "traditional", "urban", "shopping"] ``` The notebook defines these counter attributes: - `page_view_count` - `dest_page_view_count` - `family_destination_count` - `cultural_explorer` - `modern_urbanite` - `tranquil_seeker` - `family_fun` - `culinary_tourist` - `budget_conscious` - `luxury_inclined` And these attributes with the `last` aggregation: - `preferred_experience_length` - `latest_schedule` Running this cell defines the attributes locally but doesn't yet publish them to your Signals instance. ## Create an attribute group Run the next notebook cell to define an attribute group for your attributes. This creates a `StreamAttributeGroup` with `domain_sessionid` as the attribute key: ```python session_attributes_group = StreamAttributeGroup( name="travel_view", version=1, attribute_key=domain_sessionid, attributes=[page_view_count, dest_page_view_count, family_destination_count, cultural_explorer, modern_urbanite, tranquil_seeker, family_fun, culinary_tourist, preferred_experience_length, budget_conscious, luxury_inclined, latest_schedule], owner = 'you@email.com', ) ``` ## Create a service Run the notebook cell that defines a service to expose the attribute group via the API: ```python travel_service = Service( name="travel_service", description="A service for our travel demo website.", attribute_groups=[session_attributes_group], owner='you@email.com' ) ``` ## Publish to Signals Finally, run the cell that publishes the attribute group and service to your Signals instance: ```python response = sp_signals.publish([session_attributes_group, travel_service]) print(response) ``` Signals will start processing attributes from your real-time event stream. --- # Define a Signals intervention to automatically start the travel AI agent > Use Snowplow Signals interventions to automatically trigger chatbot appearance based on specific user behaviors. > Source: https://docs.snowplow.io/tutorials/signals-personalize-travel/interventions/ Use Snowplow Signals interventions to control when the chatbot appears, transforming reactive customer support into proactive assistance based on behavioral triggers. In the previous section, you personalized chatbot responses based on user behavior. This section shows how to use interventions to automatically present the chatbot when users demonstrate specific behaviors that suggest they might benefit from help. Instead of waiting for users to click the chat icon, you can automatically trigger the agent to appear when users exhibit certain behaviors, such as browsing extensively without taking action. ## How it works You'll create an intervention in Signals that triggers the agent to appear when a user has viewed more than three destination pages in a single session. This demonstrates how you can use interventions to control when an agent appears based on user behavior. When the user views their fourth destination page, the agent will automatically appear and offer assistance. ## Create the intervention Open your Jupyter notebook and run the final cell. This creates an intervention based on the `dest_page_view_count` attribute. ```python from snowplow_signals import RuleIntervention, InterventionCriterion, LinkAttributeKey destination_assistance_intervention = RuleIntervention( name="destination_help", owner="you@email.com", description="Assist the user if they are looking at 3 or more destinations in a given session.", target_attribute_keys=[LinkAttributeKey(name="domain_sessionid")], version=1, criteria=InterventionCriterion( attribute="travel_view:destination_page_view_count", operator=">=", value=3, ), ) sp_signals.publish([destination_assistance_intervention]) ``` ## Test the intervention The travel site automatically subscribes to interventions for the current user and session using the [Signals browser plugin](/docs/signals/interventions/subscribe/). Once you've created the intervention, test it by: 1. Navigating to the travel site 2. Viewing more than three destination pages 3. When you view the fourth page, the agent should automatically expand, rather than requiring a click, and offer assistance This proactive approach can significantly improve user experience by providing help at the right moment based on behavioral signals. --- # Learn how to build a personalized travel agent with Signals > Build personalized travel experiences using Snowplow Signals to customize content and chatbot responses based on real-time user behavior. > Source: https://docs.snowplow.io/tutorials/signals-personalize-travel/intro/ Welcome to the **Build a personalized travel agent with Signals** tutorial. [![Personalized Travel Experiences](/assets/images/demo-site-2a8ce88acac39ecf2c9d90d0c34ecc9c.jpg)](/assets/files/demo-site-2a8ce88acac39ecf2c9d90d0c34ecc9c.jpg/) This tutorial walks you through building personalized experiences on a travel website using Snowplow Signals. You'll learn how to capture user behavior, define attributes that represent user preferences, and use these insights to customize both website content and chatbot responses in real-time. Personalization has become essential for creating engaging user experiences. Traditional approaches often rely on static user profiles or require manual segmentation. Snowplow Signals enables dynamic personalization by processing behavioral data in real-time and making it immediately available for customization. This tutorial is designed for developers, engineers, and analysts who want to implement real-time personalization. You'll work with a demo travel website focused on Southeast Asian destinations, creating a practical example of how behavioral data can drive personalized experiences. You'll build a complete personalization system that captures user interests through their browsing behavior, processes this data into meaningful attributes using Snowplow Signals, and applies these insights to customize both the content displayed on the website and the responses provided by an AI-powered chatbot. ## Prerequisites - Signals enabled for your CDI pipeline or [Snowplow Local](https://github.com/snowplow-incubator/snowplow-local) set up with [Signals Sandbox](https://try-signals.snowplow.io/dashboard) - The [Snowplow Inspector](https://chromewebstore.google.com/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm) browser extension installed - Familiarity with running Jupyter notebooks, either locally or in Google Colab - [Docker](https://www.docker.com/) installed and configured - Access to the Jupyter [notebook](https://colab.research.google.com/github/snowplow/documentation/blob/main/static/notebooks/signals-personalize-travel-demo.ipynb) - Optional: an [OpenAI API key](https://platform.openai.com/api-keys) to customize agent responses --- # Provide Signals personalization context to the travel AI agent > Integrate Snowplow Signals with an AI chatbot to provide personalized travel recommendations based on user behavioral data. > Source: https://docs.snowplow.io/tutorials/signals-personalize-travel/personalizing-agent/ You'll now personalize a chatbot experience by integrating Signals with an AI agent that uses OpenAI as the backend. The chatbot fetches user attributes via a tool call, allowing it to provide contextual, personalized responses based on the user's browsing behavior. The demo site includes all necessary tools and SDKs, so you'll only need an OpenAI API key to implement this feature. ## How it works When a user asks the agent a question, the system fetches the attribute values for the user using the Signals API and uses these to modify the prompt sent to OpenAI. This allows personalized responses based on the user's preferences and behavior. A toggle button on the chat widget allows you to enable or disable the tool call so you can see the difference in responses with and without Signals personalization. ## Data flow ```mermaid sequenceDiagram Agent->>+OpenAI: Message Agent->>+Tool call: domain_sessionid Tool call->>+Typescript SDK: domain_sessionid Typescript SDK->>+Signals API: domain_sessionid, service_name Signals API->>+Typescript SDK: attributes Typescript SDK->>+Tool call: attributes Tool call->>+OpenAI: attributes OpenAI->>+Agent: personalized response ``` ## Configure the agent Add these variables to your `.env` file in the `snowplow-local` directory: ```text AI_MODEL_PROVIDER=openai AI_MODEL_NAME=gpt-4o-mini OPENAI_API_KEY=your_openai_api_key ``` ## Generate personalization data You'll need some attribute data to personalize the agent responses. Browse different filters, destinations, and blog pages to generate additional attribute values. Check your attribute values using the Snowplow Inspector **Attributes** tab. You should see values under the `travel_view` label. ## Test the agent Start chatting with the agent by selecting the chat icon in the lower right portion of the screen. First, test without personalization: 1. Turn the toggle switch off (gray means off, green means on) 2. Ask a question like "What are some good destinations for me in Southeast Asia?" 3. Note the generic response 4. ![Chat widget without personalization](/assets/images/chat-nops-7668c9509e3c08326232cd9b2970e845.jpg) Then test with personalization: 1. Turn the toggle switch on (green) 2. Ask the same question 3. Compare the response - it should now be tailored to your browsing behavior ![Chat widget with personalization](/assets/images/chat-ps1-144ee2bbe76757306a80409195e261ac.jpg) ![Chat widget with personalization continued](/assets/images/chat-ps2-a7995e7ad8284f4492092bc4218672ec.jpg) The personalized responses should provide destination suggestions and justifications that align with the preferences you've demonstrated through your behavior on the site. --- # Add Signals personalization options to the travel site contents > Use Snowplow Signals attributes to customize website content including images, descriptions, and layout based on user behavior. > Source: https://docs.snowplow.io/tutorials/signals-personalize-travel/personalizing-site/ You'll now use your behavioral attributes to personalize the website experience in real-time. The travel website uses micro-segmentation based on behavioral data to customize various elements of the user experience. The personalization system uses five different attributes that represent micro-segments based on user behavior patterns: - `culinary_tourist`: users interested in food and culinary experiences - `cultural_explorer`: users interested in cultural experiences - `family_fun`: users interested in family friendly experiences and destinations - `modern_urbanite`: users interested in urban experiences like architecture, nightlife, and shopping - `tranquil_seeker`: users interested in relaxing experiences like beaches, spas, and nature To determine the most appropriate micro-segment for a user, look at the value for each of these attributes (they are counters) and use the highest value. ## Data flow Here's how the personalization works on the **Destinations** page: ```mermaid sequenceDiagram React Frontend->>+Signals SDK (Node): domain_sessionid Signals SDK (Node)->>Signals API: domain_sessionid, service_name Signals API->>+Signals SDK (Node): Attribute values Signals SDK (Node)->>+React Frontend: Save to LocalStorage ``` To achieve this personalization you'll use the [Typescript Signals SDK](https://github.com/snowplow-incubator/snowplow-signals-typescript-sdk) to fetch the attribute values from the Signals API (requires authentication), and store these in local storage in the browser. You can then use these values to personalize the content on the destinations page. ## Customizing the page The travel site has a **Destinations** home page, which you'll customise by: 1. Selecting images relevant to the segment 2. Modifying the description of the location based on the segment 3. Re-ordering the star ratings of different destination attributes, if relevant Provide a JSON structure that defines which properties to use for each micro-segment. For example, here are the options for Bangkok, Thailand: ```json [ { "name": "Bangkok", "country": "Thailand", "image": "https://images.unsplash.com/photo-1508009603885-50cf7c579365?q=80&w=1950&ixlib=rb-4.0.3&auto=format&fit=crop&w=800&q=80", "images_category": { "cultural_explorer": "https://images.unsplash.com/photo-1690299490301-2eb3865bee58?q=80&w=2069&auto=format&fit=crop&ixlib=rb-4.1.0&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D", "family_fun": "https://images.unsplash.com/photo-1733150632166-8d8752da4ff6?q=80&w=2232&auto=format&fit=crop&ixlib=rb-4.1.0&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D", "modern_urbanite": "https://images.unsplash.com/photo-1593103499244-6c882f0163cf?q=80&w=2070&auto=format&fit=crop&ixlib=rb-4.1.0&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D", "tranquil_seeker": "https://images.unsplash.com/photo-1591233244269-d8c4bcbbf1dd?q=80&w=987&auto=format&fit=crop&ixlib=rb-4.1.0&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D", "culinary_tourist": "https://images.unsplash.com/photo-1506781961370-37a89d6b3095?q=80&w=1674&auto=format&fit=crop&ixlib=rb-4.1.0&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" }, "description": "Vibrant capital with street food, temples, and bustling markets", "descriptions": { "cultural_explorer": "Grand palaces, ancient temples, and rich history await. Discover the soul of Thailand's capital.", "family_fun": "Kids love the tuk-tuks, canal tours, and vibrant markets. A city of endless family adventures.", "culinary_tourist": "A street food paradise with an explosion of flavours. Authentic Thai cuisine at every turn.", "tranquil_seeker": "Find peace in serene temples, tranquil gardens, and riverside long-tail boat journeys.", "modern_urbanite": "A vibrant, non-stop metropolis with world-class nightlife, shopping, and cutting-edge art." } } ] ``` For simplicity, these values are hardcoded but you could also generate them using generative AI. This results in more unique, but more randomized, results. ## Test the personalization You can test the personalization by performing different actions across the website to increase the counter for a given micro-segment. Go back to your Jupyter notebook and examine the attribute definitions to see what actions trigger each segment. Once you've performed 4 or more actions that align with a specific segment, refresh the destinations page. It should now be customized to your segment. Here's what the page looks like before: ![Before personalization](/assets/images/no-pers-d34fc246d7d40c47e762bec8a54d4b5b.jpg) After personalization for a `culinary_tourist` user, the images and descriptions have been updated to focus on food, and the food star rating has been put first: ![After personalization (culinary tourist segment)](/assets/images/with-pers-499344d057f9f424d4fd912e10e360dc.jpg) The same destination is presented differently based on the user's demonstrated interests and behaviors. --- # Run the demo Signals React travel application > Set up and run the demo travel website using Docker and Snowplow Local. > Source: https://docs.snowplow.io/tutorials/signals-personalize-travel/setting-up/ You'll now install and run the example travel website that you'll use to test Snowplow Signals personalization. This website represents a typical ecommerce travel platform where users browse destinations, read content, and interact with various features. The demo website is part of the [Snowplow Local](https://github.com/snowplow-incubator/snowplow-local) repository. It's a React application. ## Clone Snowplow Local Clone the Snowplow Local repository to your machine: ```text git clone git@github.com:snowplow-incubator/snowplow-local.git ``` Change directory into the `snowplow-local` folder: ```text cd snowplow-local ``` ## Configure environment variables Create an `.env` file based on the example file: ```text cp .env.example .env ``` Edit the `.env` file with your Signals connection credentials: **CDI:** ```bash NEXT_PUBLIC_SNOWPLOW_SIGNALS_API_URL=signals.snowplow.com SNOWPLOW_SIGNALS_API_KEY= SNOWPLOW_SIGNALS_API_KEY_ID= SNOWPLOW_SIGNALS_ORGANIZATION_ID= NEXT_PUBLIC_SNOWPLOW_COLLECTOR_URL= ``` **Sandbox:** ```bash NEXT_PUBLIC_SNOWPLOW_SIGNALS_API_URL=sandbox.signals.snowplow.com SNOWPLOW_SIGNALS_INGEST_URL= SNOWPLOW_SIGNALS_TRIAL_TOKEN= ``` *** If you plan to use the AI agent add an OpenAI API key (`OPENAI_API_KEY`) to your `.env` file. ## Run the travel website Run the following Docker command: **CDI:** ```bash docker compose --profile travel-site up ``` **Sandbox:** ```bash docker compose --profile travel-site --profile signals up ``` *** ## Test the website Open the travel website in your browser at <http://localhost:8086>. You should see the homepage of the travel site. Open your browser's developer console (Ctrl+Shift+I or equivalent) and go to the Snowplow Inspector tab. Generate some events by clicking on filters on the [destinations](http://localhost:8086/destinations) page. You should see self-describing events firing into your Snowplow pipeline. Explore the site to get familiar with its features before you define the attributes in the next step. --- # Test your Signals attribute definitions using the Snowplow Inspector > Verify that your Snowplow Signals attributes are working correctly by generating events and checking attribute values. > Source: https://docs.snowplow.io/tutorials/signals-personalize-travel/testing-attributes/ You'll now test your attributes to verify they're working correctly by generating behavioral events and checking that the attribute values update as expected. You'll use the Snowplow Inspector browser extension to monitor both the events being sent to your collector and the resulting attribute values. ## Generate events Start by generating events on your travel website: 1. Go to your [demo site](http://localhost:8086) 2. Click on **Destinations** in the top navigation bar 3. Click on any destination page 4. Refresh the page once you're on the destination page ## Check attribute values Open your browser's developer console (Ctrl+Shift+I or equivalent) and navigate to the [Snowplow Inspector](https://chromewebstore.google.com/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm) tab: ![Snowplow Inspector](/assets/images/inspector-efdb4aa7960e09c637a516b8b89f1db8.jpg) 1. In the **Events** tab, verify you can see page view events being sent to your collector endpoint 2. In the **Attributes** tab, check that the `page_view_count` and `dest_page_view_count` attributes have non-zero values If you're not using the Chrome extension, you can check the developer console logs instead. Note that the logs are output when the page is refreshed, so they may lag behind the values shown in the Inspector. ## Troubleshoot If your attribute values appear as null or zero: - Check that your events are being successfully sent to your collector - If you're using event forwarding, verify your forwarding address is correct - Ensure your Signals credentials are configured properly in the `.env` file You can now move on to personalizing the website content using these attributes. --- # Conclusions and next steps from the Signals quick start tutorial > Complete the Snowplow Signals quick start tutorial, and explore next steps for real-time customer intelligence. > Source: https://docs.snowplow.io/tutorials/signals-quickstart/conclusion/ In this tutorial you've learned how to use Signals to calculate and serve information about user behavior in real time. This is the process workflow: - Create attribute group - Publish the attribute group to Signals - Create service - Retrieve calculated attribute values from the Profiles Store ## Next steps Here are some ideas for further exploration: - Try out other entities, e.g. `domain_userid`, to calculate the attributes against - Define attributes based off other event types - Retrieve calculated attributes in your real applications - Explore our other Signals tutorials: - [Implement Real-Time Interventions in an Ecommerce App Using Signals](/tutorials/signals-interventions/start/). - [Score prospects in real time using Signals and ML](/tutorials/signals-ml-prospect-scoring/intro/) - [Set up the Signals batch engine](/tutorials/signals-batch-engine/start/) for calculating attributes from your warehouse --- # Define a Signals attribute group for session metrics > Create an attribute group in Snowplow Signals to calculate session metrics like page views, browser, and referrer data in real time. > Source: https://docs.snowplow.io/tutorials/signals-quickstart/define-attribute-group/ [Attribute groups](/docs/signals/concepts/#attribute-groups) are where you define the data you want to calculate. To create an attribute group, go to **Signals** > **Attribute groups** in Snowplow Console and click **Create attribute group**. ![Create attribute group form for quickstart tutorial](/assets/images/attribute-group-create-05c9671a86bc1133c6625497bb434064.png) Follow these instructions to configure Signals to calculate three different session metrics from page views in your real-time event stream: - How many page views in the last 15 minutes for each session - The last seen browser name for each session - The first seen page referrer for each session To learn how to define attribute groups using the Python SDK, check out the [Quick Start notebook](https://colab.research.google.com/github/snowplow-incubator/signals-notebooks/blob/main/quickstart.ipynb), hosted on Google Colab. ## Configure group information Specify the basic configuration for your attribute group: - **Name**: `quickstart_group` - **Description**: Quick Start tutorial: page view session metrics - **Source**: stream - **Primary owner**: your email address The name will be the group's unique identifier. The description and owner are optional. Under the Configuration section, select the following: - **Attribute key**: `domain_sessionid` - **TTL**: leave as default ## Define attributes Click **Add attribute** to create each one. ![Add attribute button in attribute group configuration](/assets/images/attribute-group-add-attribute-9c768a8601d840dcb5dfc584e5cbbbd3.png) ### Page view counter The first attribute is a count of the number of page view events within the last 15 minutes. Enter `page_view_count` in the attribute name field. To set the event to calculate this attribute from: 1. Click on the event filter field to bring up the event selection options 2. Page views are a built-in Snowplow event, so they'll be listed within the default **Snowplow events** tab 3. Click in the search box to find `page_view` 4. Click **Confirm** to add the event to the attribute ![Event selection showing page\_view event in Snowplow events list](/assets/images/attribute1-specify-page-view-cfbc1204a7d596b7c79114f00b078d3e.png) Leave the aggregation as `Counter`. No property is used for this aggregation, so leave the property field blank. To set the time period: 1. Click on the **More** button 2. Update the time period to 15 minutes 3. Click **Done** to save ![Time period configuration set to 15 minutes for page view counter](/assets/images/attribute1-set-period-1245a531daa2acb8b7019decfda6c451.png) The purple dot next to **More** indicates that you have extended settings. ![Completed page view counter attribute with time period indicator](/assets/images/attribute1-complete-1eddcadd29839f811a771f978316bf1b.png) ### Most recent browser The second attribute is the last seen browser name. The calculation makes use of the [YAUAA enrichment](/docs/pipeline/enrichments/available-enrichments/yauaa-enrichment/): the browser name is a field in the `yauaa_context` entity. Create an attribute named `most_recent_browser`, and select `page_view` as before. Choose the `Last` aggregation. To set the property: 1. Click on the property selection field 2. Choose the **Entities** tab to search through all schemas that have been tracked as entities with your events 3. Use the search bar to search for `yauaa_context` 4. Select the entity `yauaa_context (nl.basjes)` 5. Select the `agentName` property 6. Click **Confirm** to save ![Property selection showing YAUAA context agentName field for browser attribute](/assets/images/attribute2-property-selector-d2cb6ca2f9acc7207cf1a5eaa5c35f9c.png) ### First referrer The third attribute stores the first seen referrer path, based on the `refr_urlhost` [atomic event property](/docs/fundamentals/canonical-event/#web-specific-fields). Create an attribute named `first_referrer`, and select `page_view` as before. Choose the `First` aggregation. To set the property: 1. Click on the property selection field 2. Stay on the default **Atomic** tab to search through all atomic properties 3. Use the search bar to search for and select `refr_urlhost` 4. Click **Confirm** to save ![Property selection showing refr\_urlhost atomic field for referrer attribute](/assets/images/attribute3-property-selector-cae8bd66bb4c82c99f0c4eb19a856ee8.png) For a trivial example of using criteria filters, add a filter to only consider events where the referrer is not an empty string. To set the criteria filter: 1. Click on the **More** button 2. Click **Add criteria** 3. Choose the `page_referrer` atomic property and click **Confirm** 4. Change the operator to `not equals` 5. Leave the value blank 6. Click **Done** to return to the group details page ![Criteria configuration filtering for non-empty referrer values](/assets/images/attribute3-criteria-c7f348fa7a4d62046e99bb08d56ded51.png) ## Test the attribute definitions Once you've added attributes, click **Run preview** to test your attribute group configuration. ![Attribute group with three attributes ready for preview testing](/assets/images/attribute-group-ready-to-test-5aa563c932d646a7e65d6d9cd17c665e.png) This will calculate the attributes from your atomic events table using 10 random events from the last hour. You should see something like this: ![Test results showing calculated attributes for sample session IDs](/assets/images/attribute-group-test-results-6d563dd6fffe08fc3f9d74502b2e51e6.png) The first column shows the unique attribute key values, in this case for the session attribute key `domain_sessionid`. ## Save the attribute group Once you're satisfied with the preview results, click **Create attribute group** to save it as a draft. ![Draft attribute group page after saving configuration](/assets/images/attribute-group-draft-c1d87d4475afae4157d9f077bd4b0d77.png) Click **Publish** to push this configuration to Signals and start calculating attributes. ![Published attribute group showing active calculation status](/assets/images/attribute-group-published-2cf1af60f705a14cbe0ba0d61c5390f6.png) --- # Define a Signals tutorial service > Configure a Snowplow Signals service to group attribute groups for easy consumption in applications. > Source: https://docs.snowplow.io/tutorials/signals-quickstart/define-service/ [Services](/docs/signals/concepts/#services) group attribute groups together for serving to your applications. To create a service, go to **Signals** > **Services** in Snowplow Console and click **Create service**. ![Create service form for quickstart tutorial service](/assets/images/service-create-85585d669c129ebb7cb30d74209dfc12.png) ## Configure basic service information Specify the configuration for your service: - **Name**: `quickstart_service` - **Description**: Quick Start tutorial service - **Owner**: your email address The name will be the service's unique identifier. The description and owner are optional. ## Select attribute groups When choosing which attribute groups to include, you'll select a specific version of each attribute group. To add an attribute group: 1. Click the attribute group selector 2. Search for and select the group you just created, `quickstart_group (v1)` If `quickstart_group (v1)` attribute group isn't showing up, check that it's published. ![Attribute group selector showing quickstart\_group v1 selection](/assets/images/service-choose-groups-3eb324c39c910489d866001bf3cb9cc5.png) ## Create the service Click **Create service** to save your service configuration. Services are automatically published as soon as they're created. ![Published quickstart service ready for attribute retrieval](/assets/images/service-published-5f5116562383bfe95d332ac7fd6b070c.png) Your service is ready to use for retrieving attributes from the Profiles Store. --- # Retrieve calculated attributes using the Signals Python SDK > Access and consume calculated behavioral data attributes from the Snowplow Signals Profiles Store using the Python SDK. > Source: https://docs.snowplow.io/tutorials/signals-quickstart/retrieve-attributes/ For a real use case, you'll want to consume calculated attributes in your applications. Read more about this [in the Signals documentation](/docs/signals/attributes/). For this tutorial, we've provided a [Jupyter notebook](https://colab.research.google.com/github/snowplow-incubator/signals-notebooks/blob/main/quickstart.ipynb) so you can quickly explore attribute retrieval using the Signals Python SDK. ## Finding your current session ID In your real application code, you can access the current session ID and use it to retrieve the relevant attribute values. The attributes are being calculated in real time, in session. Read about how to access IDs such as `domain_sessionid` in your web application in [the JavaScript tracker](/docs/sources/web-trackers/cookies-and-local-storage/getting-cookie-values/) documentation. To test this out, use the [Snowplow Inspector](/docs/testing/snowplow-inspector/) browser extension to find out your current session ID on your web application. Click around and generate some page view events. Then find your `Domain Session ID` in the Inspector. ![Screenshot showing the session ID in the Snowplow Inspector](/assets/images/inspector-session-8a359813742a08d72749eacb5c4727b5.png) ## Connecting to Signals Install the [Signals Python SDK](https://pypi.org/project/snowplow-signals/) into the notebook, and connect to Signals. 1. Go to **Signals** > **Overview** in Snowplow Console to find your Signals credentials 2. Add them to the notebook secrets: ![Screenshot showing how to add secrets](/assets/images/notebook-secrets-0e5fd8ee477440010bd1671d5b6e2908.png) 3. Install the SDK: ```python %pip install snowplow-signals ``` 4. Connect to Signals: ```python from snowplow_signals import Signals from google.colab import userdata sp_signals = Signals( api_url=userdata.get('SP_API_URL'), api_key=userdata.get('SP_API_KEY'), api_key_id=userdata.get('SP_API_KEY_ID'), org_id=userdata.get('SP_ORG_ID'), ) ``` ## Retrieving your session attributes Use your current session ID to retrieve the attributes that Signals has just calculated about your session. ```python response = sp_signals.get_service_attributes( name="quickstart_service", attribute_key="domain_sessionid", identifier="472f97c1-eec1-45fe-b081-3ff695c30415", # UPDATE THIS ) df=response.to_dataframe() df ``` The result should look something like this: | | `domain_sessionid` | `page_view_count` | `most_recent_browser` | `first_referrer` | | - | -------------------------------------- | ----------------- | --------------------- | ---------------- | | 0 | `472f97c1-eec1-45fe-b081-3ff695c30415` | 2.0 | `Firefox` | `snowplow.io` | ### Retrieving single attributes To retrieve individual attributes rather than using a service, use the `get_group_attributes()` method. ```python response = sp_signals.get_group_attributes( name="quickstart_group", version=1, attributes=["page_view_count"], attribute_key="domain_sessionid", identifiers=["472f97c1-eec1-45fe-b081-3ff695c30415"] ) df=response.to_dataframe() df ``` --- # Learn how to set up Signals for real-time calculation > Get started with Snowplow Signals to calculate user behavior attributes in real time, to build personalization use cases. > Source: https://docs.snowplow.io/tutorials/signals-quickstart/start/ Welcome to the [Snowplow Signals](/docs/signals/introduction/) Quick Start tutorial. Snowplow Signals is a real-time personalization engine for customer intelligence, built on Snowplow's behavioral data pipeline. It allows you to compute, access, and act on in-session stream and historical user data, in real time. This guide will walk you through the steps to calculate user behavior attributes from your Snowplow event stream, and to retrieve them for use in your application. This will unlock real-time personalization use cases for your business. It should take less than 10 minutes from starting to retrieving calculated attributes. This tutorial shows how to define attributes using the Snowplow Console UI, as well as programmatically using the [Signals Python SDK](https://pypi.org/project/snowplow-signals/). ## Prerequisites This tutorial assumes that you have: - Snowplow page view tracking on a web application - Snowflake warehouse - A Signals connection ## Connecting to Signals Log in to [Console](https://console.snowplowanalytics.com) and navigate to the **Signals** section. You'll need to [set up a Signals connection](/docs/signals/connection/) if you don't have one yet. --- # Create and publish a data structure using the Snowplow CLI MCP tool > Step-by-step walkthrough of creating, validating, and publishing a data structure and tracking plan using the Snowplow CLI MCP tool. > Source: https://docs.snowplow.io/tutorials/snowplow-cli-mcp/basic-workflow/ Here's a typical interaction pattern for creating a data structure. ## 1. Get context > **Note:** Always ensure `get_context` is called at the start of your conversation. If you don't see it happen, then ask for it. ```text Please call get_context before we start working. ``` The assistant will retrieve the built-in schema and rules that define how Snowplow components should be structured. This provides the structural templates and requirements for your tracking implementation. ## 2. Create a data structure ```text Create a data structure for tracking when users view a product page. Include properties for product ID, product name, category, and price. ``` The assistant will: - Generate a proper UUID for the data structure - Create a valid event schema following Snowplow conventions - Save the file locally to the appropriate location **Note**: Files are created locally only. Use `snowplow-cli data-structures publish` to sync to Console when ready. ## 3. Validate (automatic) The assistant should automatically call `validate_data_structures` on the created file and report any validation issues. ## 4. Iterate if needed ```text The price should be optional, not required. Also add a description field. ``` The assistant will modify the structure and re-validate. ## 5. Tracking plan creation ```text Create a tracking plan for ecommerce product interactions. Include: - The existing product page views - Add to cart events - A source application for our website - Proper validation of all components ``` The assistant will: 1. Create the necessary data structures for events (locally) 2. Create a source application definition (locally) 3. Create a tracking plan linking everything together (locally) 4. Validate all components together (including cross-references) ## 6. Publish to Console Use `snowplow-cli data-structures publish` and `snowplow-cli data-products publish` to push changes to Console. --- # Conclusion > Summary of what you learned in the Snowplow CLI MCP tool tutorial. > Source: https://docs.snowplow.io/tutorials/snowplow-cli-mcp/conclusion/ You've successfully set up the Snowplow CLI MCP tool and learned how to use AI assistants to manage your Snowplow data structures and tracking plans through natural conversation. In this tutorial, you: - Installed and configured the Snowplow CLI with an MCP-compatible AI assistant - Learned how to create and validate data structures using natural language prompts - Built complete tracking plans including events, entities, and source applications - Published your locally created structures to Snowplow Console ## Next steps - Read the [Snowplow CLI documentation](/docs/event-studio/programmatic-management/snowplow-cli/) for further workflows and commands - Follow the [Snowplow CLI Git tutorial](/tutorials/data-structures-in-git/introduction/) to manage your tracking plans with Git version control - Check out our [tracking design best practise guide](/docs/fundamentals/tracking-design-best-practice/) --- # Learn how to set up the Snowplow CLI MCP tool > Introduction to the Snowplow CLI MCP tool for managing data structures and tracking plans through AI assistants like Claude, Cursor, or Copilot. > Source: https://docs.snowplow.io/tutorials/snowplow-cli-mcp/introduction/ The Snowplow CLI MCP (Model Context Protocol) tool integrates Snowplow's data structure management capabilities directly into AI assistants like Claude. This enables natural language interaction for creating, validating, and managing your Snowplow tracking plans **locally**. **Important**: the MCP tool creates and validates files on your local filesystem only. To sync changes to Console, you'll use the regular CLI commands like `snowplow-cli ds publish` afterward. ## What you'll learn - How to set up the Snowplow CLI MCP tool with AI assistants like Claude Desktop, Cursor, or Copilot - Available MCP tools and their functions - Creating and validating data structures through conversation - AI-powered analysis for strategic tracking plan development ## Demo: Using the Snowplow CLI MCP with Claude Desktop ## Prerequisites - Snowplow CLI installed ([installation guide](/docs/event-studio/programmatic-management/snowplow-cli/#install)) - Snowplow CLI configured with your Console credentials ([configuration guide](/docs/event-studio/programmatic-management/snowplow-cli/#configure)) - Claude Desktop or another MCP-compatible client (Cursor or Copilot) - **Filesystem access**: if using Claude Desktop, you must run alongside an MCP filesystem server (e.g., `@modelcontextprotocol/server-filesystem`) to enable file operations. Other MCP clients (Cursor, Copilot, etc.) have filesystem access by default. ## Available MCP tools The Snowplow CLI MCP server provides these tools: ### Core tools - **`get_context`** - Retrieves the built-in schema and rules that define how Snowplow data structures, tracking plans, and source applications should be structured. - **`get_uuid`** - Generates valid v4 UUIDs required by many Snowplow components. ### Validation tools - **`validate_data_structures`** - Validates data structure files (events/entities). Must be called after creating or modifying any data structure. - **`validate_data_products`** - Validates tracking plans and source applications. Must include both tracking plan files AND their referenced source application files. --- # Next steps from the Snowplow CLI MCP tool tutorial > Advanced use cases for the Snowplow CLI MCP tool, including business process analysis, schema evolution planning, and cross-platform tracking. > Source: https://docs.snowplow.io/tutorials/snowplow-cli-mcp/next-steps/ Beyond creating individual files, AI assistants can help analyze tracking requirements and suggest comprehensive solutions: ### Business process analysis ```text We want to track user engagement on our blog ``` The assistant will suggest: - Page view events with article entity - Custom events for shares and subscriptions - Reading progression metrics - Time-based engagement tracking ### Schema evolution planning ```text Our product_viewed event is missing context about where users found the product ``` The assistant will provide: - Recommendations for adding entities - Versioning strategy - Guidelines for maintaining consistency with related events ### Journey mapping ```text Create a tracking plan for our checkout funnel ``` The assistant will create: - Complete tracking plan covering cart management through payment - Error state tracking - Abandonment scenario tracking ### Cross-platform consistency ```text We're adding mobile app tracking to our existing web tracking ``` The assistant will analyze: - Which entities should be consistent across platforms - Platform-specific contexts needed - Tracking plan structure recommendations --- # Install the Snowplow CLI and configure an MCP client > How to install the Snowplow CLI and configure MCP clients such as Claude Desktop, VS Code, and Cursor. > Source: https://docs.snowplow.io/tutorials/snowplow-cli-mcp/setup/ To use the MCP tools with your AI assistant, you'll need to install the Snowplow CLI and configure your chosen MCP client. This guide walks you through both steps. ## 1. Install Snowplow CLI ```bash # Using Homebrew brew install snowplow/taps/snowplow-cli # Or download binary directly curl -L -o snowplow-cli https://github.com/snowplow/snowplow-cli/releases/latest/download/snowplow-cli_linux_x86_64 chmod u+x snowplow-cli ``` If you have `node.js` set up then no need to install, you can run via `npx`. ## 2. Configure your MCP client The Snowplow CLI MCP server can be used with any MCP-compatible client. Below are configuration examples for popular clients. For a complete list of supported clients and their configurations, see the [MCP reference](/docs/event-studio/programmatic-management/snowplow-cli/reference/#mcp). ### Claude desktop **Config location**: - **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json` - **Windows**: `%APPDATA%\Claude\claude_desktop_config.json` ```json { "mcpServers": { "snowplow-cli": { "command": "snowplow-cli", "args": ["mcp"] } } } ``` ### VS code Add to `.vscode/mcp.json` in your workspace: ```json { "servers": { "snowplow-cli": { "type": "stdio", "command": "snowplow-cli", "args": ["mcp"] } } } ``` ### Cursor Add to `.cursor/mcp.json` in your workspace: ```json { "mcpServers": { "snowplow-cli": { "command": "snowplow-cli", "args": ["mcp", "--base-directory", "."] } } } ``` ### Using with npx If using via `npx`, use this configuration format instead: ```json { "mcpServers": { "snowplow-cli": { "command": "npx", "args": ["-y", "@snowplow/snowplow-cli", "mcp"] } } } ``` For VS Code, adjust the `type` field accordingly: ```json { "servers": { "snowplow-cli": { "type": "stdio", "command": "npx", "args": ["-y", "@snowplow/snowplow-cli", "mcp"] } } } ``` ## 3. Start using MCP After configuring your chosen client, start a new conversation or session. The Snowplow CLI MCP tools should be available for use. --- # Learn how to set up the Unified Digital dbt package > Learn how to set up the Snowplow Unified Digital dbt package for web and mobile analytics. Covers prerequisites for implementing unified cross-platform tracking with DBT transformations. > Source: https://docs.snowplow.io/tutorials/unified-digital/intro/ This tutorial walks you through the process of setting up our Unified Digital dbt package. ## Prerequisites - [DBT](https://github.com/dbt-labs/dbt) installed - Connection to a warehouse - For web: - web events dataset being available in your database - [Snowplow Javascript tracker](/docs/sources/web-trackers/) version 2 or later implemented. - Web Page context [enabled](/docs/sources/web-trackers/tracker-setup/initialization-options/) (enabled by default in version 3+). - [Page view events](/docs/sources/web-trackers/tracking-events/page-views/) implemented. - For mobile: - Mobile events dataset being available in your database - Snowplow Android, iOS [mobile tracker](/docs/sources/mobile-trackers/) version 1.1.0 (or later) or [React Native tracker](/docs/sources/react-native-tracker/) implemented - Mobile session context enabled - Screen view events enabled and tracked --- # Set up the Unified Digital dbt package locally > Install and configure the Snowplow Unified Digital dbt package locally with detailed variable configuration for web and mobile data modeling. Includes optimization tips for partition pruning, entity enablement, and warehouse-specific setup. > Source: https://docs.snowplow.io/tutorials/unified-digital/setting-up-locally/ Installing the package: 1. Run the following command in a new directory to create a new dbt project ```bash dbt init ``` 2. Set up a profile to connect to the warehouse where you have your Snowplow events. ```yaml profile: 'demo_project' ``` 3. Create a new `packages.yml` file. 4. Paste the latest version of the Snowplow Unified package. You could also use a range here if you want to keep up to date with more recent versions or you can you can hard pin to a specific version. ```bash packages: - package: snowplow/snowplow_unified version: [">=0.4.0"] ``` 5. Type the following to install the package into this project: ```bash dbt deps ``` 6. In your `project.yml` copy across our dispatch piece so that you're using our macros over the dbt core ones. We do this because we have a slightly more optimized upsert which saves you time and cost on your on your cloud warehouse bill. ```bash dispatch: - macro_namespace: dbt search_order: ['snowplow_utils', 'dbt'] ``` 7. Delete some of the other default pieces that are in the default project as they are not needed. ![](/assets/images/Screenshot_2024-07-04_at_17.14.37-9c416c5455c4fe0fe89d583b736a312e.png) ## Setting Variables Now we’ll get to using our variables, which is how you enable the parts of the model that are relevant to your use-case. 1. Define the location of your source data within your `vars` block where your raw events are being loaded into. Make sure to update these with your actual table names! ```yaml vars: snowplow_unified: snowplow__atomic_schema: schema_with_snowplow_events snowplow__database: database_with_snowplow_events ``` > **Info:** Please note that your `target.database` is NULL if using Databricks. In Databricks, schemas and databases are used interchangeably and in the dbt implementation of Databricks therefore we always use the schema value, so adjust your `snowplow__atomic_schema` value if you need to. > > Add the following variable to your dbt project's `dbt_project.yml` file > > ```yml > vars: > snowplow_unified: > snowplow__databricks_catalog: 'hive_metastore' > ``` > > Depending on the use case it should either be the catalog (for Unity Catalog users from databricks connector 1.1.1 onwards, defaulted to 'hive\_metastore') or the same value as your `snowplow__atomic_schema` (unless changed it should be 'atomic'). This is needed to handle the database property within `models/base/src_base.yml`. > > A more detailed explanation for how to set up your Databricks configuration properly can be found in [Unity Catalog support](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/#unity-catalog-support). 2. Unified Digital assumes you are modeling both web and mobile events and expects certain fields to exist based on this. If you are only tracking and modeling e.g. web data, you can disable the other as below: ```yml vars: snowplow_unified: snowplow__enable_mobile: false snowplow__enable_web: true ``` 3. Enable [entities](/docs/fundamentals/entities/) to make sure that they're processed within the package - this means they will be un-nested from the atomic columns and made available in the derived tables. Make sure to only enable the ones you need. ```yaml vars: snowplow_unified: snowplow__enable_iab: true snowplow__enable_ua: true snowplow__enable_yauaa: true snowplow__enable_browser_context: true snowplow__enable_mobile_context: true snowplow__enable_geolocation_context: true snowplow__enable_application_context: true snowplow__enable_screen_context: true snowplow__enable_deep_link_context: true snowplow__enable_consent: true snowplow__enable_cwv: true snowplow__enable_app_errors: true snowplow__enable_screen_summary_context: true ``` 4. Set up some initial conditions for our app. So we need to go in here and we need to just set the start date to be whenever you've started tracking your data from: ```yaml vars: snowplow_unified: snowplow__start_date: 'yyyy-mm-dd' ``` 5. Optimize your data processing There are ways how you can deal with [high volume optimizations](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-custom-models/high-volume-optimizations/) at a later stage, if needed, but you can do a lot upfront by selecting carefully which variable to use for `snowplow__session_timestamp`, which helps identify the timestamp column used for sessionization. This timestamp column should ideally be set to the column your event table is partitioned on. It is defaulted to `collector_tstamp` but depending on your loader it can be the `load_tstamp` as the sensible value to use: ```yml vars: snowplow_unified: snowplow__session_timestamp: 'load_tstamp' ``` > **Info:** Verify which column your events table is partitioned on. It will likely be partitioned on `collector_tstamp` or `derived_tstamp`. If it is partitioned on `collector_tstamp` you should set `snowplow__derived_tstamp_partitioned` to `false`. This will ensure only the `collector_tstamp` column is used for partition pruning when querying the events table: > > ```yml > vars: > snowplow_unified: > snowplow__derived_tstamp_partitioned: false > ``` 6. Configure more vars as necessary but theoretically this is all you need just to get started. If you are unsure whether the default values set are good enough in your case or you would already like to maximize the potential of your models, you can dive deeper into the meaning behind our variables on our [Config](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/unified/) page. It includes a [Config Generator](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/unified/#config-generator) to help you create all your variable configurations, if necessary. 7. Filter your data set You can specify both `start_date` at which to start processing events and the `app_id`'s to filter for. By default the `start_date` is set to `2020-01-01` and all `app_id`'s are selected. To change this please add the following to your `dbt_project.yml` file: ```yml vars: snowplow_unified: snowplow__start_date: 'yyyy-mm-dd' snowplow__app_id: ['my_app_1','my_app_2'] ``` Below we list a few more that might be of interest depending on your setup or modelling needs: - Enable extras The package comes with additional modules and functionality that you can enable, for more information see the [consent tracking](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/consent-module/), [conversions](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/conversions/), and [core web vitals](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/core-web-vitals-module/) documentation. - adjust page ping variables, if needed The Unified Digital Model processes page ping events to calculate web page engagement times. If your [tracker configuration](/docs/sources/web-trackers/tracking-events/activity-page-pings/) for `min_visit_length` (default 5) and `heartbeat` (default 10) differs from the defaults provided in this package, you can override by adding to your `dbt_project.yml`: ```yml vars: snowplow_unified: snowplow__min_visit_length: 5 # Default value snowplow__heartbeat: 10 # Default value ``` ## Adding the `selectors.yml` file Within the packages we have provided a suite of suggested selectors to run and test the models within the package together with the Unified Digital Model. This leverages dbt's [selector flag](https://docs.getdbt.com/reference/node-selection/syntax). You can find out more about each selector in the [Model selection](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/model-selection/) documentation. These are defined in the `selectors.yml` file ([source](https://github.com/snowplow/dbt-snowplow-unified/blob/main/selectors.yml)) within the package, however in order to use these selections you will need to copy this file into your own dbt project directory. This is a top-level file and therefore should sit alongside your `dbt_project.yml` file. If you are using multiple packages in your project you will need to combine the contents of these into a single file. ## Running the package Run dbt\_seed to make sure you see some data, so we have some seeds in our packages, run that in there, and then run the actual model. ```yaml dbt seed --select snowplow_unified --full-refresh dbt run --selector snowplow_unified ``` --- # Run the Unified Digital dbt package using Snowplow Console > Deploy the Unified Digital dbt package through Snowplow Console's fully managed service. Configure web and mobile data processing, enrichments, and scheduling through the Console interface. > Source: https://docs.snowplow.io/tutorials/unified-digital/setting-up-via-console/ Once you are happy with how your local configuration is working, you will want to schedule it to run regularly. Snowplow provides a fully managed service for running data models securely. To get started, follow these steps: 1. Commit the local configuration you built previously to a Git repository. 2. Create a Git connection to your repository. In Console, navigate to **Destinations > Connections > Set up connection > Git connection** 3. Create a warehouse connection. In Console, navigate to **Destinations > Connections > Set up connection > Data modeling connection** 4. Create a model project. In Console, select **Modeling > Model projects > Create model project** 5. Add a run configuration with a schedule to your model project. Click **Add run configuration** and **Add schedule**. Pick your preferred schedule, e.g. daily at 00:00 AM. For a full reference on running data models via Snowplow Console, see [Run data models from Console](/docs/modeling-your-data/running-data-models-via-console/). --- # View the Unified Digital dbt package output data > Query and analyze the Unified Digital dbt package output tables including views, sessions, and users. Explore aggregated metrics, engagement time calculations, and user interaction data. > Source: https://docs.snowplow.io/tutorials/unified-digital/viewing-output-data/ The output data can now be reviewed, focusing on the `views`, `sessions`, and `users` tables. These tables contain detailed records of key metrics and interactions. ```yaml select * from dbt_ryan_derived.snowplow_unified_views; ``` This query retrieves one row per view, detailing when each view started and ended, along with extensive information on where the event occurred. ```yaml select * from dbt_ryan_derived.snowplow_unified_sessions; ``` The sessions table includes the engagement time in seconds, calculated using pings in comparison to absolute time. It also contains additional data, such as counts of page views and events. ```yaml select * from dbt_ryan_derived.snowplow_unified_users; ``` The users table provides one row per user identifier, offering detailed information about each user.

elements expose: { when: "element" }, // Fires when element becomes visible, once per element details: { child_text: { title: "h2" } } // Captures the main section header } }); ``` **Browser (npm):** ```javascript import { SnowplowElementTrackingPlugin, startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: { selector: "section", // Matches all

elements expose: { when: "element" }, // Fires when element becomes visible, once per element details: { child_text: { title: "h2" } } // Captures the main section header } }); ``` *** In this example, the page has several sections. As a user scrolls down the page and each section becomes visible, an `expose_element` event is generated for each one. All events will have `"element_name": "section"`. Example `element` entity for the first section's `expose_element` event. The section title is "Why Data Teams Choose Snowplow": ```json { "schema": "iglu:com.snowplowanalytics.snowplow/element/jsonschema/1-0-0", "data": { "element_name": "section", "width": 1920, "height": 1111.7333984375, "position_x": 0, "position_y": 716.4500122070312, "doc_position_x": 0, "doc_position_y": 716.4500122070312, "element_index": 2, "element_matches": 10, "originating_page_view": "06dbb0a2-9acf-4ae4-9562-1469b6d12c5d", "attributes": [ { "source": "child_text", "attribute": "title", "value": "Why Data Teams Choose Snowplow" } ] } } ``` Example `element` entity for the second section's `expose_element` event. The section title is "How Does Snowplow Work?": ```json { "schema": "iglu:com.snowplowanalytics.snowplow/element/jsonschema/1-0-0", "data": { "element_name": "section", "width": 1920, "height": 2880, "position_x": 0, "position_y": 896.683349609375, "doc_position_x": 0, "doc_position_y": 1828.183349609375, "element_index": 3, "element_matches": 10, "originating_page_view": "06dbb0a2-9acf-4ae4-9562-1469b6d12c5d", "attributes": [ { "source": "child_text", "attribute": "title", "value": "How Does Snowplow Work?" } ] } } ``` ## Stop element tracking To turn off tracking, use `endElementTracking`. You can remove all configured rules, or selectively remove specific rules. **JavaScript (tag):** ```javascript // Remove all configured rules and listeners snowplow('endElementTracking'); // Removes based on `name` matching // Multiple rules may share a name snowplow('endElementTracking', { elements: ['name1', 'name2'] }); // Removes rules based on `id` matching // At most one rule can have the same `id` snowplow('endElementTracking', { elementIds: ['id1'] }); // More complicated matching // Rules where the `filter` function returns true will be removed snowplow('endElementTracking', { filter: (rule) => /recommendations/i.test(rule.name) }); // Passing an empty object removes no rules snowplow('endElementTracking', {}); ``` **Browser (npm):** ```javascript import { SnowplowElementTrackingPlugin, endElementTracking } from '@snowplow/browser-plugin-element-tracking'; // Remove all configured rules and listeners endElementTracking(); // Removes based on `name` matching // Multiple rules may share a name endElementTracking({ elements: ['name1', 'name2'] }); // Removes rules based on `id` matching // At most one rule can have the same `id` endElementTracking({ elementIds: ['id1'] }); // More complicated matching // Rules where the `filter` function returns true will be removed endElementTracking({ filter: (rule) => /recommendations/i.test(rule.name) }); // Passing an empty object removes no rules endElementTracking({}); ``` *** If you specify more than one of the `elementIds`, `elements`, and `filter` options, they get evaluated in that order. ## Configure entities You can configure additional element tracking or custom entities by modifying the `startElementTracking` call. Additional entities can be attached depending on configuration: - `element_statistics`: visibility and scroll depth statistics for the element - `element_content`: information about nested elements within the matched element - `component_parents`: the component hierarchy that the element belongs to - Custom entities Check out the [page element tracking overview](/docs/events/ootb-data/page-elements/#page-element-visibility-and-lifecycle) page to see the schema details. The configuration is per-rule, so different rules can have different settings. ### Element statistics Use the `includeStats` option to attach the `element_statistics` entity to specified events, including those not generated by this plugin. This example will add the `element_statistics` entity to `expose_element` and `page_ping` events: **JavaScript (tag):** ```javascript snowplow('startElementTracking', { elements: { selector: 'main.article', name: 'article_content', includeStats: ['expose_element', 'page_ping'] } }); ``` **Browser (npm):** ```javascript import { SnowplowElementTrackingPlugin, startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: { selector: 'main.article', name: 'article_content', includeStats: ['expose_element', 'page_ping'] } }); ``` *** Adding element statistics to page pings can be useful to understand how a user moves through the content. It'll show scroll depth increasing over time, backtracking behavior, and total engagement duration. For [baked-in events](/docs/fundamentals/events/#baked-in-events), use the following names: - Page view: `page_view` - Page ping: `page_ping` - Structured: `event` Be cautious with the `selector`. If it matches a lot of elements, this can enlarge event payload sizes. ### Element content Add the `element_content` entity by setting `contents`. It captures data about specified nested elements within the matched parent element. In this example, the plugin will track an `expose_element` event when a `.product-grid` element scrolls into view. This event will have an `element` entity for the grid itself, and multiple `element_content` entities for each `.product-card` within the grid. **JavaScript (tag):** ```javascript snowplow('startElementTracking', { elements: { selector: '.product-grid', name: 'product_list', expose: { when: 'element' }, contents: [ { selector: '.product-card', name: 'product_item', details: [ { dataset: ['productId', 'price'] }, { child_text: { name: 'h3', brand: '.brand-name' } } ] } ] } }); ``` **Browser (npm):** ```javascript import { SnowplowElementTrackingPlugin, startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: { selector: '.product-grid', name: 'product_list', expose: { when: 'element' }, contents: [ { selector: '.product-card', name: 'product_item', details: [ { dataset: ['productId', 'price'] }, { child_text: { name: 'h3', brand: '.brand-name' } } ] } ] } }); ``` *** The `details` configuration sets which element `attributes` to capture. **Example entities for this configuration** The `expose_element` event will have `"element_name": "product_list"`. One `element` entity: ```json { "schema": "iglu:com.snowplowanalytics.snowplow/element/jsonschema/1-0-0", "data": { "element_name": "product_list", "element_index": 1, "element_matches": 1, "width": 1200, "height": 400, "attributes": [] } } ``` Multiple `element_content` entities: ```json { "schema": "iglu:com.snowplowanalytics.snowplow/element_content/jsonschema/1-0-0", "data": { "element_name": "product_item", "parent_name": "product_list", // element_name of parent element "parent_position": 1, // element_index of parent element "position": 1, "attributes": [ { "source": "dataset", "attribute": "productId", "value": "SKU-001" }, { "source": "dataset", "attribute": "price", "value": "29.99" }, { "source": "child_text", "attribute": "name", "value": "Wireless Mouse" }, { "source": "child_text", "attribute": "brand", "value": "Logitech" } ] } } ``` ```json { "schema": "iglu:com.snowplowanalytics.snowplow/element_content/jsonschema/1-0-0", "data": { "element_name": "product_item", "parent_name": "product_list", // element_name of parent element "parent_position": 1, // element_index of parent element "position": 2, "attributes": [ { "source": "dataset", "attribute": "productId", "value": "SKU-002" }, { "source": "dataset", "attribute": "price", "value": "49.99" }, { "source": "child_text", "attribute": "name", "value": "Mechanical Keyboard" }, { "source": "child_text", "attribute": "brand", "value": "Keychron" } ] } } ``` ### Component parents You can mark elements as components to track hierarchy, using `component` rules. Events for child elements include a `component_parents` entity listing their ancestor components. This is useful when you have the same appearing in multiple places on your site. Without component tracking, all those events look identical. **JavaScript (tag):** ```javascript snowplow('startElementTracking', { elements: [ // Define components (containers) { selector: 'header', // Mark the header as a component name: 'site_header', component: true, expose: false // Don't track expose events for the component itself }, { selector: 'footer', // Mark the footer as a component name: 'site_footer', component: true, expose: false // Don't track expose events for the component itself }, // Track elements - events will include component_parents { selector: '.newsletter-form', name: 'newsletter_signup', create: true, // Fire create_element events expose: { when: 'element' } // Fire expose_element events } ] }); ``` **Browser (npm):** ```javascript import { SnowplowElementTrackingPlugin, startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: [ // Define components (containers) { selector: 'header', // Mark the header as a component name: 'site_header', component: true, expose: false // Don't track expose events for the component itself }, { selector: 'footer', // Mark the footer as a component name: 'site_footer', component: true, expose: false // Don't track expose events for the component itself }, // Track elements - events will include component_parents { selector: '.newsletter-form', name: 'newsletter_signup', create: true, // Fire create_element events expose: { when: 'element' } // Fire expose_element events } ] }); ``` *** For this example, imagine a page has two `.newsletter-form` elements: one in the page sidebar, and one in the footer. The `component_parents` entity for the sidebar form, which isn't within either of the defined component containers, could look like this: ```json { "schema": "iglu:com.snowplowanalytics.snowplow/component_parents/jsonschema/1-0-0", "data": { "element_name": "newsletter_signup", "component_list": [] } } ``` The `component_parents` entity for the footer form, which is within the `footer` component, could look like this: ```json { "schema": "iglu:com.snowplowanalytics.snowplow/component_parents/jsonschema/1-0-0", "data": { "element_name": "newsletter_signup", "component_list": ["site_footer"] } } ``` #### Generate entities for other events The plugin also exposes a `getComponentListGenerator` utility function for attaching component hierarchy information to custom events, or to events generated by other plugins like the [form](/docs/sources/web-trackers/tracking-events/form-tracking/) or [link](/docs/sources/web-trackers/tracking-events/link-click/) tracking plugins. This function returns two entity generator functions that determine component hierarchy for a given element: - `componentGenerator`: returns a single `component_parents` entity - `componentGeneratorWithDetail`: returns a `component_parents` entity plus an `element` entity **JavaScript (tag):** The JavaScript tracker uses a callback pattern to access the generators asynchronously: ```javascript // This snippet assumes you've already defined component rules in startElementTracking snowplow('getComponentListGenerator', function (componentGenerator, componentGeneratorWithDetail) { // attach the component_parents entity to events from these plugins snowplow('enableLinkClickTracking', { context: [componentGenerator] }); snowplow('enableFormTracking', { context: [componentGenerator] }); }); ``` **Browser (npm):** The Browser tracker returns the generators directly as an array: ```javascript // This snippet assumes you've already defined component rules in startElementTracking import { getComponentListGenerator } from '@snowplow/browser-plugin-element-tracking'; import { enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; import { enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; const [componentGenerator, componentGeneratorWithDetail] = getComponentListGenerator(); // attach the component_parents entity to events from these plugins enableLinkClickTracking({ context: [componentGenerator] }); enableFormTracking({ context: [componentGenerator] }); ``` *** > **Note:** `componentGeneratorWithDetail` returns multiple entities and isn't directly compatible with the `context` arrays used by the link and form tracking plugins. ### Custom entities There are two ways you can add custom entities to element tracking events: - Plugin `context` option, applies to all rules - Per-rule `context` option, applies only to events from that specific rule **JavaScript (tag):** ```javascript // Configure at plugin level to apply to all rules snowplow('startElementTracking', { elements: [/* rules */], context: [/* entities attached to ALL events */] }); // Configure at rule level to apply to a specific rule snowplow('startElementTracking', { elements: { selector: '.promo-banner', name: 'promotion', context: [/* entities attached only to this rule's events */] } }); ``` **Browser (npm):** ```javascript // Configure at plugin level to apply to all rules startElementTracking({ elements: [/* rules */], context: [/* entities attached to ALL events */] }); // Configure at rule level to apply to a specific rule startElementTracking({ elements: { selector: '.promo-banner', name: 'promotion', context: [/* entities attached only to this rule's events */] } }); ``` *** You can configure static or dynamic entities: - Use static entities when the same data should be attached to every event, e.g. A/B test variant ```javascript context: [ { schema: 'iglu:com.example/campaign/jsonschema/1-0-0', data: { campaign_id: 'summer_2025', variant: 'A' } } ] ``` - Use callbacks to generate dynamic entities when the data depends on the specific element that triggered the event ```javascript context: [ (element, rule) => ({ schema: 'iglu:com.example/promotion/jsonschema/1-0-0', data: { promo_id: element.dataset.promoId, position: element.dataset.position, rule_name: rule.name } }) ] ``` ## Configure the plugin As well as configuring the `element_statistics`, `element_content`, and `component_parents` entities, you can customize how element visibility tracking works using the options below. The core options are explained in this table: | Property | Type | Description | Status | | ---------- | -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- | | `selector` | `string` | A CSS selector string that matches one or more elements on the page that should trigger events from this rule. | **Required** | | `name` | `string` | A label to name this rule. Allows you to keep a stable name for events generated by this rule, even if the `selector` changes, so the data produced remains consistent. You can share a single `name` between many rules to have different configurations for different selectors. If not supplied, the `selector` value becomes the `name`. | _Recommended_ | | `id` | `string` | A specific identifier for this rule. Useful if you share a `name` between many rules and need to specifically remove individual rules within that group. | | You'll see `selector` and `name` in the examples on this page. ### Event frequency with `when` The `when` option controls how often events fire. The default is `always`. This example shows the options: **JavaScript (tag):** ```javascript // "always" - fires every time (e.g., every scroll in/out of view) // Boolean shorthand - expose: true snowplow('startElementTracking', { elements: { selector: '.ad-banner', name: 'ad_impression', expose: { when: 'always' } // fires each time banner scrolls into view } }); // "element" - fires once per matched element snowplow('startElementTracking', { elements: { selector: '.product-card', name: 'product_impression', expose: { when: 'element' } // fires once per card, even if user scrolls back } }); // "pageview" - fires once per element, resets on new page view (useful for SPAs) snowplow('startElementTracking', { elements: { selector: '.hero-section', name: 'hero_viewed', expose: { when: 'pageview' } // resets when tracker fires next page_view event } }); // "once" - fires exactly once for the entire rule, regardless of how many elements match snowplow('startElementTracking', { elements: { selector: '.newsletter-form', name: 'newsletter_form_exists', expose: { when: 'once' } // fires once even if multiple forms exist } }); // "never": never track this event for this rule // This is useful for defining components // Boolean shorthand - expose: false snowplow('startElementTracking', { elements: { selector: 'section', expose: { when: 'never' } // never fires } }); ``` **Browser (npm):** ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; // "always" - fires every time (e.g., every scroll in/out of view) // Boolean shorthand - expose: true startElementTracking({ elements: { selector: '.ad-banner', name: 'ad_impression', expose: { when: 'always' } // fires each time banner scrolls into view } }); // "element" - fires once per matched element startElementTracking({ elements: { selector: '.product-card', name: 'product_impression', expose: { when: 'element' } // fires once per card, even if user scrolls back } }); // "pageview" - fires once per element, resets on new page view (useful for SPAs) startElementTracking({ elements: { selector: '.hero-section', name: 'hero_viewed', expose: { when: 'pageview' } // resets when tracker fires next page_view event } }); // "once" - fires exactly once for the entire rule, regardless of how many elements match startElementTracking({ elements: { selector: '.newsletter-form', name: 'newsletter_form_exists', expose: { when: 'once' } // fires once even if multiple forms exist } }); // "never": never track this event for this rule // This is useful for defining components // Boolean shorthand - expose: false startElementTracking({ elements: { selector: 'section', expose: { when: 'never' } // never fires } }); ``` *** If you're using `when: pageview`, ensure that the tracker is firing page view events appropriately for your needs, especially if it's a single page application (SPA). The plugin assumes that you'll call `startElementTracking()` before `trackPageView()`. The first page view doesn't reset the element visibility state, because the plugin sets `ignoreNextPageView: true` by default internally. If your site tracks page views before calling `startElementTracking()`, you can disable this behavior by passing `ignoreNextPageView: false` in the plugin options when adding it to the tracker. **JavaScript (tag):** ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-element-tracking@latest/dist/index.umd.min.js", ["snowplowElementTracking", "SnowplowElementTrackingPlugin"], [{ ignoreNextPageView: false }] ); snowplow('startElementTracking', { elements: [/* configuration */] }); ``` **Browser (npm):** First, add the plugin when initializing the tracker. ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { SnowplowElementTrackingPlugin, startElementTracking } from '@snowplow/browser-plugin-element-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ SnowplowElementTrackingPlugin({ ignoreNextPageView: false }) ], }); startElementTracking({ elements: [/* configuration */] }); ``` *** ### Visibility thresholds for `expose` Control what counts as "visible" for `expose_element` events: **JavaScript (tag):** ```javascript snowplow('startElementTracking', { elements: { selector: '.video-player', name: 'video_impression', expose: { when: 'element', minPercentage: 0.5, // at least 50% of element must be visible minTimeMillis: 2000, // must be visible for 2 seconds cumulative minSize: 10000, // element must be at least 10,000px² (e.g., 100x100) boundaryPixels: 50 // adds 50px padding when calculating visibility } } }); // boundaryPixels accepts different formats: expose: { when: 'element', boundaryPixels: 20 // 20px all sides // or: [10, 20] // 10px vertical, 20px horizontal // or: [10, 20, 30, 40] // top, right, bottom, left } ``` **Browser (npm):** ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: { selector: '.video-player', name: 'video_impression', expose: { when: 'element', minPercentage: 0.5, // at least 50% of element must be visible minTimeMillis: 2000, // must be visible for 2 seconds cumulative minSize: 10000, // element must be at least 10,000px² (e.g., 100x100) boundaryPixels: 50 // adds 50px padding when calculating visibility } } }); // boundaryPixels accepts different formats: expose: { when: 'element', boundaryPixels: 20 // 20px all sides // or: [10, 20] // 10px vertical, 20px horizontal // or: [10, 20, 30, 40] // top, right, bottom, left } ``` *** ### Data selectors using `details` The plugin uses data selectors when deciding if an element should trigger an event using `condition`, or when building the `element` entity's `attributes` property. **JavaScript (tag):** ```javascript snowplow('startElementTracking', { elements: { selector: '.product-card', name: 'product', expose: { when: 'element' }, details: [ // HTML attributes (from getAttribute) { attributes: ['id', 'data-category'] }, // Element properties (may differ from HTML attributes) { properties: ['className', 'tagName'] }, // Dataset values (data-* attributes, camelCase) //

{ dataset: ['productId', 'price'] }, // Text content from child elements { child_text: { name: 'h3', // text from first

brand: '.brand-name' // text from first .brand-name }}, // Regex extraction from element's textContent { content: { sku: /SKU-(\d+)/ // captures first group }}, // Include the selector that matched { selector: true }, // Validate collected attributes - discards results if no match // Useful for filtering in `condition` { match: { category: 'electronics', // exact value match price: (val) => parseFloat(val) > 0 // function match }}, // Custom callback function (element) => ({ isOnSale: element.classList.contains('on-sale') ? 'true' : 'false', position: element.dataset.position }) ] } }); ``` Browser (npm): ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: { selector: '.product-card', name: 'product', expose: { when: 'element' }, details: [ // HTML attributes (from getAttribute) { attributes: ['id', 'data-category'] }, // Element properties (may differ from HTML attributes) { properties: ['className', 'tagName'] }, // Dataset values (data-* attributes, camelCase) //
{ dataset: ['productId', 'price'] }, // Text content from child elements { child_text: { name: 'h3', // text from first
brand: '.brand-name' // text from first .brand-name }}, // Regex extraction from element's textContent { content: { sku: /SKU-(\d+)/ // captures first group }}, // Include the selector that matched { selector: true }, // Validate collected attributes - discards results if no match // Useful for filtering in `condition` { match: { category: 'electronics', // exact value match price: (val) => parseFloat(val) > 0 // function match }}, // Custom callback function (element) => ({ isOnSale: element.classList.contains('on-sale') ? 'true' : 'false', position: element.dataset.position }) ] } }); ``` * ### Conditional event firing with `condition` Only fire events when elements match certain criteria. Use data selectors to define the conditions: JavaScript (tag): ```javascript // Example: only track visible notifications snowplow('startElementTracking', { elements: { selector: '.notification', name: 'notification_shown', create: { when: 'element', condition: [ // Only fire if notification has data-visible="true" { dataset: ['visible'] }, { match: { visible: 'true' } } ] } } }); // Example: only track products that are in stock snowplow('startElementTracking', { elements: { selector: '.product-card', name: 'in_stock_product', expose: { when: 'element', condition: [ { dataset: ['stockStatus'] }, { match: { stockStatus: (val) => val !== 'out-of-stock' } } ] } } }); ``` Browser (npm): ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; // Example: only track visible notifications startElementTracking({ elements: { selector: '.notification', name: 'notification_shown', create: { when: 'element', condition: [ // Only fire if notification has data-visible="true" { dataset: ['visible'] }, { match: { visible: 'true' } } ] } } }); // Example: only track products that are in stock startElementTracking({ elements: { selector: '.product-card', name: 'in_stock_product', expose: { when: 'element', condition: [ { dataset: ['stockStatus'] }, { match: { stockStatus: (val) => val !== 'out-of-stock' } } ] } } }); ``` * ### Shadow DOM tracking If the elements you want to track exist within [shadow DOM](https://developer.mozilla.org/en-US/docs/Web/API/Web_components/Using_shadow_DOM) trees, the plugin might not identify them automatically. Use these settings to notify the plugin that it should descend into shadow hosts to identify elements to match the rule against. By default, the plugin matches specified elements both outside and inside `shadowSelector` shadow hosts. Set `shadowOnly` to `true` to only match elements within those shadow hosts. JavaScript (tag): ```javascript snowplow('startElementTracking', { elements: { selector: 'button.submit', name: 'submit_button', shadowSelector: 'my-custom-form', // CSS selector for elements that are shadow hosts containing the targeted elements shadowOnly: true, // only match elements inside shadow DOM, not elsewhere expose: { when: 'element' } } }); ``` Browser (npm): ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: { selector: 'button.submit', name: 'submit_button', shadowSelector: 'my-custom-form', // CSS selector for elements that are shadow hosts containing the targeted elements shadowOnly: true, // only match elements inside shadow DOM, not elsewhere expose: { when: 'element' } } }); ``` * ### Send to specific trackers If you have multiple trackers loaded on the same page, you can specify which trackers should receive events using the `tracker` option. Provide a list of tracker namespaces. If omitted, events go to all trackers the plugin has been activated for. JavaScript (tag): ```javascript snowplow('startElementTracking', { elements: { selector: '.promo-banner' } }, ['tracker1', 'tracker2']); ``` Browser (npm): ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: { selector: '.promo-banner' } }, ['tracker1', 'tracker2']); ``` * ## Further examples These examples are based on [a snapshot](https://web.archive.org/web/20250422013533/https://snowplow.io/) of the [Snowplow website](https://snowplow.io/). ### Content depth The blog posts have longer-form content. Snowplow's page ping events track scroll depth by pixels, but those measurements become inconsistent between devices and page. To see how much content gets consumed, you can generate stats based on the paragraphs in the content. You can also get periodic stats based on the entire article in page pings. JavaScript (tag): ```javascript snowplow('startElementTracking', { elements: [ { selector: ".blogs_blog-post-body_content", name: "blog content", expose: false, includeStats: ["page_ping"] }, { selector: ".blogs_blog-post-body_content p", name: "blog paragraphs" } ] }); ``` Browser (npm): ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: [ { selector: ".blogs_blog-post-body_content", name: "blog content", expose: false, includeStats: ["page_ping"] }, { selector: ".blogs_blog-post-body_content p", name: "blog paragraphs" } ] }); ``` * Because the expose event contains the `element_index` and `element_matches`, you can easily query the largest `element_index` by page view ID. The result tells you consumption statistics for individual views of each article. You can then summarize that metric to the content or category level, or converted to a percentage by comparing with `element_matches`. ```json { "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0", "data": [ { "schema": "iglu:com.snowplowanalytics.snowplow/element/jsonschema/1-0-0", "data": { "element_name": "blog paragraphs", "width": 800, "height": 48, "position_x": 320, "position_y": 533.25, "doc_position_x": 320, "doc_position_y": 1373, "element_index": 6, "element_matches": 24, "originating_page_view": "f390bec5-f63c-48af-b3ad-a03f0511af7f", "attributes": [] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0", "data": { "id": "f390bec5-f63c-48af-b3ad-a03f0511af7f" } } ] } ``` The periodic page ping events also give you a summary of the total progress in the `max_y_depth_ratio`/`max_y_depth` values. With `y_depth_ratio` you can also see when users backtrack up the page. ```json { "schema": "iglu:com.snowplowanalytics.snowplow/element_statistics/jsonschema/1-0-0", "data": { "element_name": "blog content", "element_index": 1, "element_matches": 1, "current_state": "unknown", "min_size": "800x3928", "current_size": "800x3928", "max_size": "800x3928", "y_depth_ratio": 0.20302953156822812, "max_y_depth_ratio": 0.4931262729124236, "max_y_depth": "1937/3928", "element_age_ms": 298379, "times_in_view": 0, "total_time_visible_ms": 0 } } ``` ### Simple funnels A newsletter sign-up form exists at the bottom of the page. Performance measurement becomes difficult because many visitors don't even see it. To test this you first need to know: - When the form exists on a page - When the form is actually seen - When people actually interact with the form - When the form is finally submitted The form tracking plugin can only do the last parts, but the element tracker gives you the earlier steps. If you end up adding more forms in the future, you'll want to know which is which, so you can mark the footer as a component so you can split it out later. JavaScript (tag): ```javascript snowplow('startElementTracking', { elements: [ { selector: ".hbspt-form", name: "newsletter signup", create: true, }, { selector: "footer", component: true, expose: false } ] }); ``` Browser (npm): ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: [ { selector: ".hbspt-form", name: "newsletter signup", create: true, }, { selector: "footer", component: true, expose: false } ] }); ``` * If you try this on a blog page, you actually get two `create_element` events. Blog posts have a second newsletter sign-up in a sidebar next to the content. Because only the second form is a member of the `footer` component, you can easily see which one you are trying to measure when you query the data later. ```json { "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0", "data": [ { "schema": "iglu:com.snowplowanalytics.snowplow/element/jsonschema/1-0-0", "data": { "element_name": "newsletter signup", "width": 336, "height": 161, "position_x": 1232, "position_y": 238.88333129882812, "doc_position_x": 1232, "doc_position_y": 3677.883331298828, "element_index": 1, "element_matches": 2, "originating_page_view": "02e30714-a84a-42f8-8b07-df106d669db0", "attributes": [] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0", "data": { "id": "02e30714-a84a-42f8-8b07-df106d669db0" } } ] } ``` ```json { "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0", "data": [ { "schema": "iglu:com.snowplowanalytics.snowplow/element/jsonschema/1-0-0", "data": { "element_name": "newsletter signup", "width": 560, "height": 137, "position_x": 320, "position_y": 1953.5, "doc_position_x": 320, "doc_position_y": 5392.5, "element_index": 2, "element_matches": 2, "originating_page_view": "02e30714-a84a-42f8-8b07-df106d669db0", "attributes": [] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/component_parents/jsonschema/1-0-0", "data": { "element_name": "newsletter signup", "component_list": [ "footer" ] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/element/jsonschema/1-0-0", "data": { "element_name": "footer", "width": 1920, "height": 1071.5, "position_x": 0, "position_y": 1212, "doc_position_x": 0, "doc_position_y": 4651, "originating_page_view": "", "attributes": [] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0", "data": { "id": "02e30714-a84a-42f8-8b07-df106d669db0" } } ] } ``` ### Recommendations performance The homepage contains a section for the "Latest Blogs from Snowplow." This could represent recommendations or some other form of personalization. If it did, one might want to optimize it. Link tracking could tell you when a recommendation worked and a visitor clicked it, but how would identify the recommendation not encouraging clicks? If you track when the widget becomes visible and include the items that got recommended, you could correlate that with the clicks to measure performance. For fairer measurement of visibility, you can configure that visibility only counts if at least 50% is in view, and it has to be on screen for at least 1.5 seconds. You'll also collect the post title and author information. JavaScript (tag): ```javascript snowplow('startElementTracking', { elements: [ { selector: ".blog_list-header_list-wrapper", name: "recommended_posts", create: true, expose: { when: "element", minTimeMillis: 1500, minPercentage: 0.5 }, contents: [ { selector: ".collection-item", name: "recommended_item", details: { child_text: { title: "h3", author: ".blog_list-header_author-text > p" } } } ] } ] }); ``` Browser (npm): ```javascript import { startElementTracking } from '@snowplow/browser-plugin-element-tracking'; startElementTracking({ elements: [ { selector: ".blog_list-header_list-wrapper", name: "recommended_posts", create: true, expose: { when: "element", minTimeMillis: 1500, minPercentage: 0.5 }, contents: [ { selector: ".collection-item", name: "recommended_item", details: { child_text: { title: "h3", author: ".blog_list-header_author-text > p" } } } ] } ] }); ``` * Scrolling down to see the items and you see the items that get served to the visitor: ```json { "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0", "data": [ { "schema": "iglu:com.snowplowanalytics.snowplow/element/jsonschema/1-0-0", "data": { "element_name": "recommended_posts", "width": 1280, "height": 680.7666625976562, "position_x": 320, "position_y": 437.70001220703125, "doc_position_x": 320, "doc_position_y": 6261.066711425781, "element_index": 1, "element_matches": 1, "originating_page_view": "034db1d6-1d60-42ca-8fe1-9aafc0442a22", "attributes": [] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/element_content/jsonschema/1-0-0", "data": { "element_name": "recommended_item", "parent_name": "recommended_posts", "parent_position": 1, "position": 1, "attributes": [ { "source": "child_text", "attribute": "title", "value": "Data Pipeline Architecture Patterns for AI: Choosing the Right Approach" }, { "source": "child_text", "attribute": "author", "value": "Matus Tomlein" } ] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/element_content/jsonschema/1-0-0", "data": { "element_name": "recommended_item", "parent_name": "recommended_posts", "parent_position": 1, "position": 2, "attributes": [ { "source": "child_text", "attribute": "title", "value": "Data Pipeline Architecture For AI: Why Traditional Approaches Fall Short" }, { "source": "child_text", "attribute": "author", "value": "Matus Tomlein" } ] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/element_content/jsonschema/1-0-0", "data": { "element_name": "recommended_item", "parent_name": "recommended_posts", "parent_position": 1, "position": 3, "attributes": [ { "source": "child_text", "attribute": "title", "value": "Agentic AI Applications: How They Will Turn the Web Upside Down" }, { "source": "child_text", "attribute": "author", "value": "Yali\tSassoon" } ] } }, { "schema": "iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0", "data": { "id": "034db1d6-1d60-42ca-8fe1-9aafc0442a22" } } ] } ``` --- # Track errors on web > Track handled and unhandled JavaScript exceptions with manual error tracking and automatic error tracking capabilities. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/errors/ The Errors tracker plugin provides two ways of tracking exceptions: manual tracking of handled exceptions using `trackError` and automatic tracking of unhandled exceptions using `enableErrorTracking`. Error events can be manually tracked and/or automatically tracked. ## Install plugin JavaScript (tag): | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | Download: | | | | ------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-error-tracking@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-error-tracking@latest/dist/index.umd.min.js) (latest) | Note: The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-error-tracking@latest/dist/index.umd.min.js", ["snowplowErrorTracking", "ErrorTrackingPlugin"] ); ``` Browser (npm): - `npm install @snowplow/browser-plugin-error-tracking` - `yarn add @snowplow/browser-plugin-error-tracking` - `pnpm add @snowplow/browser-plugin-error-tracking` ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { ErrorTrackingPlugin, enableErrorTracking } from '@snowplow/browser-plugin-error-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ ErrorTrackingPlugin() ], }); enableErrorTracking(); ``` * ## Manual error tracking Use the `trackError` method to track handled exceptions (application errors) in your JS code. This is its signature: JavaScript (tag): ```javascript snowplow('trackError', { /** The error message */ message: string; /** The filename where the error occurred */ filename?: string; /** The line number which the error occurred on */ lineno?: number; /** The column number which the error occurred on */ colno?: number; /** The error object */ error?: Error; }); ``` Browser (npm): ```javascript trackError({ /** The error message */ message: string; /** The filename where the error occurred */ filename?: string; /** The line number which the error occurred on */ lineno?: number; /** The column number which the error occurred on */ colno?: number; /** The error object / error?: Error; }); ``` | Name | Required? | Description | Type | | ---------- | ------------- | ----------------------------------- | ---------- | | `message` | Yes | Error message | string | | `filename` | No | Filename or URL | string | | `lineno` | No | Line number of problem code chunk | number | | `colno` | No | Column number of problem code chunk | number | | `error` | No | JS `ErrorEvent` | ErrorEvent | Of these arguments, only `message` is required. Signature of this method defined to match `window.onerror` callback in modern browsers. JavaScript (tag): ```javascript try { var user = getUser() } catch(e) { snowplow('trackError', { message: 'Cannot get user object', filename: 'shop.js', error: e }); } ``` Browser (npm): ```javascript try { var user = getUser() } catch(e) { trackError({ message: 'Cannot get user object', filename: 'shop.js', error: e }); } ``` * Using `trackError` it's assumed that developer knows where errors could happen, which is not often the case. Therefor it's recommended to use `enableErrorTracking` as it allows you to discover errors that weren't expected. ## Automatic error tracking Use the `enableErrorTracking` method to track unhandled exceptions (application errors) in your JS code. This is its signature: JavaScript (tag): ```javascript snowplow('enableErrorTracking', { /** A callback which allows on certain errors to be tracked */ filter?: (error: ErrorEvent) => boolean; /** A callback to dynamically add extra context based on the error */ contextAdder?: (error: ErrorEvent) => Array; /** Context to be added to every error */ context?: Array; } ``` Browser (npm): ```javascript enableErrorTracking({ /** A callback which allows on certain errors to be tracked */ filter?: (error: ErrorEvent) => boolean; /** A callback to dynamically add extra context based on the error */ contextAdder?: (error: ErrorEvent) => Array; /** Context to be added to every error / context?: Array; }); ``` | Name | Required? | Description | Type | | -------------- | ------------- | ------------------------------- | ------------------------------------------- | | `filter` | No | Predicate to filter exceptions | `(ErrorEvent) => Boolean` | | `contextAdder` | No | Function to get dynamic context | `(ErrorEvent) => Array` | | context | No | Additional custom context | `Array` | Unlike `trackError` you need enable error tracking only once: JavaScript (tag): ```javascript snowplow('enableErrorTracking') ``` Browser (npm): ```javascript enableErrorTracking(); ``` * Application error events are implemented as Snowplow self-describing events. [Here](https://raw.githubusercontent.com/snowplow/iglu-central/master/schemas/com.snowplowanalytics.snowplow/application_error/jsonschema/1-0-1) is the schema for an `application_error` event. --- # Integrate with your event specifications on web > Use the event specifications plugin to automatically attach event specification entities from your tracking plan to matching browser events. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/event-specifications/ This plugin allows you to integrate with [Media Web](/docs/event-studio/tracking-plans/templates/#media-web) event specifications. The plugin will add an event specification entity to the matching [Snowplow media](/docs/events/ootb-data/media-events/) events. Retrieve the configuration directly from your [tracking plan](https://docs.snowplow.io/docs/fundamentals/tracking-plans/) in [Snowplow Console](https://console.snowplowanalytics.com). > Note: The plugin is available since version 3.23 of the tracker. It's only available for tracking plans created using the [Media Web template](/docs/event-studio/tracking-plans/templates/#media-web). The event specification entity is automatically tracked once configured. ## Install plugin JavaScript (tag): | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ❌ | | `sp.lite.js` | ❌ | Download: | | | | ------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-event-specifications@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-event-specifications@latest/dist/index.umd.min.js) (latest) | ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-event-specifications@latest/dist/index.umd.min.js', ['eventSpecifications', 'EventSpecificationsPlugin'] ); ``` Browser (npm): - `npm install @snowplow/browser-plugin-event-specifications` - `yarn add @snowplow/browser-plugin-event-specifications` - `pnpm add @snowplow/browser-plugin-event-specifications` ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { EventSpecificationsPlugin } from '@snowplow/browser-plugin-event-specifications'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ EventSpecificationsPlugin(/* plugin configuration /) ], }); ``` ## Configuration You can retrieve the configuration for your event specifications directly from your tracking plan after clicking on the `Implement tracking` button. ![implement tracking button](/assets/images/implement_tracking-237d544d543211d13699e36aac03fc1c.png) Configure the plugin by mapping each tracked event to the event specification ID from your tracking plan. For example: JavaScript (tag): ```javascript // Initialize tracker window.snowplow('newTracker', 'sp1', '{{collector_url}}', { appId: 'my-app-id' }); // Add the Media plugin window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-media@latest/dist/index.umd.min.js', ['snowplowMedia', 'SnowplowMediaPlugin'] ); // Add the Event Specifications plugin with configuration window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-event-specifications@latest/dist/index.umd.min.js', ['eventSpecifications', 'EventSpecificationsPlugin'], [ { SnowplowMediaPlugin: { // Map event names to your event specification IDs from Console "ad_break_end_event": "abb057a9-eb05-41b8-8d13-0a020f5f9960", "ad_break_start_event": "896fc117-aad9-4ef1-ab52-13fcf9156a08", "ad_click_event": "df27c1dd-b7e6-4d4a-8ba1-613d859594c4", "ad_complete_event": "a382f36b-39ed-46b4-9e6f-ac9bd1d65360", "ad_pause_event": "6bd62180-37ab-4a9c-9aa4-580aa39d7888", "ad_quartile_event": "7d946906-80eb-4ca0-bf7d-4a0f04ae3598", "ad_resume_event": "d5ae264a-3983-478b-b9d5-bcf46c66cab1", "ad_skip_event": "5f79e53b-9318-4644-b4e8-8bf7804c244b", "ad_start_event": "442a8e75-4884-434e-8e1d-d80bc35c4157", "buffer_end_event": "8951ce07-b497-45b0-81c3-49962d36fa6a", "buffer_start_event": "c53f4ea7-e7c3-44d9-97a8-1edf5a61d898", "error_event": "594bd013-8e6b-4ec4-9828-e5e609b4297c", "fullscreen_change_event": "3780c3e5-17ed-4e39-b22c-c0568c486bf3", "ping_event": "a4870ad5-e028-42ae-bfca-603a3d6837f1", "pause_event": "bf90af15-840d-4a76-a7f0-ccc8865a9c5c", "percent_progress_event": "6684eea3-82e6-4c2e-98db-ab0be61fdf0d", "picture_in_picture_change_event": "2e8be82e-11fb-4aa3-a5a2-7f49efc29abb", "end_event": "aaac78f1-8ee4-42a6-8e3c-46f660c32709", "play_event": "1094455c-4e99-4e1f-8445-f4fb12b4eccc", "quality_change_event": "bdf91319-c5f0-476f-922f-9215b76186af", "ready_event": "c1f9f850-dc68-47d8-9fd1-0db10328858c", "seek_end_event": "d748d09f-a361-4620-8651-f883b1502a23", "seek_start_event": "5fb03fae-3f0f-4538-908b-a55a6f7e69cb", "volume_change_event": "972836c4-73b8-45c3-abcf-22e3bd7eae6c" } } ] ); ``` Browser (npm): ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { SnowplowMediaPlugin, enableMediaTracking } from '@snowplow/browser-plugin-media'; import { EventSpecificationsPlugin } from '@snowplow/browser-plugin-event-specifications'; // Initialize tracker with both plugins newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ SnowplowMediaPlugin(), EventSpecificationsPlugin({ SnowplowMediaPlugin: { // Map event names to your event specification IDs from Console "ad_break_end_event": "abb057a9-eb05-41b8-8d13-0a020f5f9960", "ad_break_start_event": "896fc117-aad9-4ef1-ab52-13fcf9156a08", "ad_click_event": "df27c1dd-b7e6-4d4a-8ba1-613d859594c4", "ad_complete_event": "a382f36b-39ed-46b4-9e6f-ac9bd1d65360", "ad_pause_event": "6bd62180-37ab-4a9c-9aa4-580aa39d7888", "ad_quartile_event": "7d946906-80eb-4ca0-bf7d-4a0f04ae3598", "ad_resume_event": "d5ae264a-3983-478b-b9d5-bcf46c66cab1", "ad_skip_event": "5f79e53b-9318-4644-b4e8-8bf7804c244b", "ad_start_event": "442a8e75-4884-434e-8e1d-d80bc35c4157", "buffer_end_event": "8951ce07-b497-45b0-81c3-49962d36fa6a", "buffer_start_event": "c53f4ea7-e7c3-44d9-97a8-1edf5a61d898", "error_event": "594bd013-8e6b-4ec4-9828-e5e609b4297c", "fullscreen_change_event": "3780c3e5-17ed-4e39-b22c-c0568c486bf3", "ping_event": "a4870ad5-e028-42ae-bfca-603a3d6837f1", "pause_event": "bf90af15-840d-4a76-a7f0-ccc8865a9c5c", "percent_progress_event": "6684eea3-82e6-4c2e-98db-ab0be61fdf0d", "picture_in_picture_change_event": "2e8be82e-11fb-4aa3-a5a2-7f49efc29abb", "end_event": "aaac78f1-8ee4-42a6-8e3c-46f660c32709", "play_event": "1094455c-4e99-4e1f-8445-f4fb12b4eccc", "quality_change_event": "bdf91319-c5f0-476f-922f-9215b76186af", "ready_event": "c1f9f850-dc68-47d8-9fd1-0db10328858c", "seek_end_event": "d748d09f-a361-4620-8651-f883b1502a23", "seek_start_event": "5fb03fae-3f0f-4538-908b-a55a6f7e69cb", "volume_change_event": "972836c4-73b8-45c3-abcf-22e3bd7eae6c" } }) ] }); ``` * ## Event specification entity When an event is tracked that matches one of the configured event names, the plugin will automatically add an event specification entity to it. ### `event_specification` Type: Entity Entity schema for referencing an event specification Schema: `iglu:com.snowplowanalytics.snowplow/event_specification/jsonschema/1-0-0` Example: ```json { "id": "abb057a9-eb05-41b8-8d13-0a020f5f9960" } ``` Properties: | Property | Description | | ------------- | ---------------------------------------------------------------------------- | | `id` _string_ | _Required._ Identifier for the event specification that the event adheres to | --- # Track Kantar Focal Meter events on web > Integrate with Kantar Focal Meter router meters to measure content audience by sending domain user IDs to Focal Meter endpoints. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/focalmeter/ This plugin provides integration with [Focal Meter by Kantar](https://www.virtualmeter.co.uk/focalmeter). Focal Meter is a box that connects directly to the broadband router and collects viewing information for the devices on your network. This integration enables measuring the audience of content through the Focal Meter router meter. The plugin has the ability to send the [domain user ID](/docs/fundamentals/canonical-event/#user-fields) to a [Kantar Focal Meter](https://www.virtualmeter.co.uk/focalmeter) endpoint. A request is made when the first event with a new user ID is tracked. The plugin inspects the domain user ID property in tracked events. Whenever it changes from the previously recorded value, it makes an HTTP GET request to the `kantarEndpoint` URL with the ID as a query parameter. Optionally, the tracker may store the last published domain user ID value in local storage in order to prevent it from making the same request on the next page load. If local storage is not used, the request is made on each page load. > Note: The plugin is available since version 3.16 of the tracker. The Focal Meter integration is automatic once configured. ## Install plugin JavaScript (tag): | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ❌ | | `sp.lite.js` | ❌ | Download: | | | | ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-focalmeter@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-focalmeter@latest/dist/index.umd.min.js) (latest) | Browser (npm): - `npm install @snowplow/browser-plugin-focalmeter` - `yarn add @snowplow/browser-plugin-focalmeter` - `pnpm add @snowplow/browser-plugin-focalmeter` * ## Enable integration JavaScript (tag): To integrate with the Kantar FocalMeter, use the snippet below after [setting up your tracker](/docs/sources/web-trackers/quick-start-guide/): ```javascript window.snowplow( 'addPlugin', 'https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-focalmeter@latest/dist/index.umd.min.js', ['snowplowFocalMeter', 'FocalMeterPlugin'] ); window.snowplow('enableFocalMeterIntegration', { kantarEndpoint: '{{kantar_url}}', useLocalStorage: false // optional, defaults to false }); ``` Browser (npm): ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { FocalMeterPlugin, enableFocalMeterIntegration } from '@snowplow/browser-plugin-focalmeter'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ FocalMeterPlugin() ], }); enableFocalMeterIntegration({ kantarEndpoint: '{{kantar_url}}', useLocalStorage: false // optional, defaults to false }); ``` * The `enableFocalMeterIntegration` function has the following arguments: | Parameter | Type | Default | Description | Required | | ----------------- | ---------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | | `kantarEndpoint` | `string` | - | URL of the Kantar endpoint to send the requests to (including protocol) | Yes | | `processUserId` | `(userId: string) => string` | - | Callback to process user ID before sending it in a request. This may be used to apply hashing to the value. | No | | `useLocalStorage` | `boolean` | `false` | Whether to store information about the last submitted user ID in local storage to prevent sending it again on next load (defaults not to use local storage) | No | If you choose to storage the last submitted user ID in local storage, the plugin will use the key `sp-fclmtr-{trackerId}`. The `trackerId` is your tracker namespace. ### Processing the user ID By default, the plugin sends the domain user ID as a GET parameter in requests to Kantar without modifying it. In case you want to apply some transformation on the value, such as hashing, you can provide the `processUserId` callback in the `enableFocalMeterIntegration` call: JavaScript (tag): ```javascript window.snowplow('enableFocalMeterIntegration', { kantarEndpoint: "https://kantar.example.com", processUserId: (userId) => md5(userId).toString(), // apply the custom hashing here }); ``` Browser (npm): ```javascript import md5 from 'crypto-js/md5'; enableFocalMeterIntegration({ kantarEndpoint: "https://kantar.example.com", processUserId: (userId) => md5(userId).toString(), // apply the custom hashing here }); ``` * ### Configure multiple trackers If you have multiple trackers loaded on the same page, you can enable the Focal Meter integration for each of them by specifying the tracker namespace as the third parameter to the `enableFocalMeterIntegration` function: JavaScript (tag): ```javascript window.snowplow( 'enableFocalMeterIntegration', { kantarEndpoint: 'https://kantar.example.com' }, ['sp1', 'sp2'] // Only these tracker namespaces will send to Kantar ); ``` Browser (npm): ```javascript enableFocalMeterIntegration( { kantarEndpoint: 'https://kantar.example.com' }, ['sp1', 'sp2'] // Only these tracker namespaces will send to Kantar ); ``` * ## Request format The tracker will send requests with this format: ```text GET https://your-kantar-endpoint.com?vendor=snowplow&cs_fpid=d5c4f9a2-3b7e-4d1f-8c6a-9e2b5f0a3c8d&c12=not_set ``` Where: - `vendor` is always `snowplow` - `cs_fpid` is the domain user ID, or the processed version if a `processUserId` callback is provided - `c12` is always `not_set` --- # Track form interactions on web > Automatically track form changes, submissions, and focus events with configurable allowlists, denylists, and transform functions for field values. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/form-tracking/ Snowplow form tracking creates three event types: `change_form`, `submit_form` and `focus_form`. Using the `enableFormTracking` method adds event listeners to the document listening for events from form elements and their interactive fields (that is, all `input`, `textarea`, and `select` elements). > Note: Events on password fields will not be tracked. Form events are automatically tracked once configured. ## Installation JavaScript (tag): | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | Download: | | | | ------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-form-tracking@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-form-tracking@latest/dist/index.umd.min.js) (latest) | Note: The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. Browser (npm): - `npm install @snowplow/browser-plugin-form-tracking` - `yarn add @snowplow/browser-plugin-form-tracking` - `pnpm add @snowplow/browser-plugin-form-tracking` * ## Toggle form tracking Start tracking form events by enabling the plugin: JavaScript (tag): ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-form-tracking@latest/dist/index.umd.min.js", ["snowplowFormTracking", "FormTrackingPlugin"] ); snowplow('enableFormTracking'); ``` Browser (npm): Initialize your tracker with the plugin. ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { FormTrackingPlugin, enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ FormTrackingPlugin() ], }); enableFormTracking(); ``` * To stop form tracking, call `disableFormTracking`: JavaScript (tag): ```javascript snowplow('disableFormTracking'); ``` Browser (npm): ```javascript import { FormTrackingPlugin, disableFormTracking } from '@snowplow/browser-plugin-form-tracking'; disableFormTracking(); ``` * ## Events By default, all three event types are tracked. However, it is possible to subscribe only to specific event types using the `options.events` option when enabling form tracking: JavaScript (tag): ```javascript // subscribing to specific event types snowplow('enableFormTracking', { options: { events: ['submit_form', 'focus_form', 'change_form'] }, }); ``` Browser (npm): ```javascript // subscribing to specific event types enableFormTracking({ options: { events: ['submit_form', 'focus_form', 'change_form'] }, }); ``` * Check out the [form tracking overview](/docs/events/ootb-data/page-elements/#form-interactions) page to see the schema details. ### Change form When a user changes the value of a `textarea`, `input`, or `select` element inside a form, a [`change_form`](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/change_form/jsonschema/1-0-0) event will be fired. It will capture the name, type, and new value of the element, and the id of the parent form. ### Submit form When a user submits a form, a [`submit_form`](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/submit_form/jsonschema/1-0-0) event will be fired. It will capture the id and classes of the form and the name, type, and value of all `textarea`, `input`, and `select` elements inside the form. Note that this will only work if the original form submission event is actually fired. If you prevent it from firing, for example by using a jQuery event handler which returns `false` to handle clicks on the form's submission button, the Snowplow `submit_form` event will not be fired. ### Focus form When a user focuses on a form element, a [`focus_form`](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/focus_form/jsonschema/1-0-0) event will be fired. It will capture the id and classes of the form and the name, type, and value of the `textarea`, `input`, or `select` element inside the form that received focus. ## Configuration It may be that you do not want to track every field in a form, or every form on a page. You can customize form tracking by passing a configuration argument to the `enableFormTracking` method. This argument should be an object with two elements named "forms" and "fields". The "forms" element determines which forms will be tracked; the "fields" element determines which fields inside the tracked forms will be tracked. As with link click tracking, there are three ways to configure each field: a denylist, an allowlist, or a filter function. You do not have to use the same method for both fields. Denylists This is an array of strings used to prevent certain elements from being tracked. Any form with a CSS class in the array will be ignored. Any field whose name property is in the array will be ignored. All other elements will be tracked. Allowlists This is an array of strings used to turn on tracking. Any form with a CSS class in the array will be tracked. Any field in a tracked form whose name property is in the array will be tracked. All other elements will be ignored. Filter functions This is a function used to determine which elements are tracked. The element is passed as the argument to the function and is tracked if and only if the value returned by the function is truthy. Event phase From v4 onwards, this plugin uses [capture-phase](https://developer.mozilla.org/en-US/docs/Web/API/Event/eventPhase#value) event listeners to detect form events. The capture phase is the earliest phase of event handlers, so events might be tracked before other code executes (e.g. form validation). If your filter or transform functions are relying on other event handlers to have executed to function correctly, they may not behave as expected when using capture-phase event handlers. From v4.6.8 onwards, the plugin supports a `useCapture` option, which you can set to `false` (default is `true`) to revert to the v3 behavior of using bubble-phase event handlers. This allows other event handlers time to execute before the event is detected and your filter/transform functions are executed. When using the bubble phase, other event handlers may [cancel the event's propagation](https://developer.mozilla.org/en-US/docs/Web/API/Event/stopPropagation) and the plugin will not receive the event and nothing will be tracked. This may be desirable if you want to wait for the form to validate before tracking a "form\_submit" event, for example. Native HTML form validation automatically prevents the "submit" event firing until the form is valid, so only validation code that doesn't integrate with native APIs should require explicitly using the bubble phase. The `focus` event for form fields [does not bubble](https://developer.mozilla.org/en-US/docs/Web/API/Element/focus_event), so this setting is ignored for `form_focus` tracking, which will always use capture-phase event listeners; only "change" and "submit" handlers will use the bubble phase when setting `useCapture: false`. ### Transform functions This is a function used to transform data in each form field. The value and element are passed as arguments to the function and the tracked value is replaced by the value returned. The transform function receives three arguments: 1. The value of the element. 2. Either the HTML element (for `change_form` and `focus_form` events) or an instance of `ElementData` (for `submit_form` events). 3. The HTML element (in all form tracking events). The function signature is: ```typescript type transformFn = ( elementValue: string | null, elementInfo: ElementData | TrackedHTMLElement, elt: TrackedHTMLElement ) => string | null; ``` This means that you can specify a transform function that applies the exact same logic to all `submit_form`, `change_form` and `focus_form` events independent of the element's attributes the logic may depend on. For example: JavaScript (tag): ```javascript function redactPII(eltValue, _, elt) { if (elt.id === 'pid') { return 'redacted'; } return eltValue; } snowplow('enableFormTracking', { options: { fields: { transform: redactPII }, }, }); ``` Browser (npm): ```javascript import { enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; function redactPII(eltValue, _, elt) { if (elt.id === 'pid') { return 'redacted'; } return eltValue; } enableFormTracking({ options: { fields: { transform: redactPII }, }, }); ``` * ### Examples To track every form element and every field except those fields named "password": JavaScript (tag): ```javascript var opts = { forms: { denylist: [] }, fields: { denylist: ['password'] } }; snowplow('enableFormTracking', { options: opts }); ``` Browser (npm): ```javascript import { enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; var options = { forms: { denylist: [] }, fields: { denylist: ['password'] } }; enableFormTracking({ options }); ``` * To track only the forms with CSS class "tracked", and only those fields whose ID is not "private": JavaScript (tag): ```javascript var opts = { forms: { allowlist: ["tracked"] }, fields: { filter: function (elt) { return elt.id !== "private"; } } }; snowplow('enableFormTracking', { options: opts }); ``` Browser (npm): ```javascript import { enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; var opts = { forms: { allowlist: ["tracked"] }, fields: { filter: function (elt) { return elt.id !== "private"; } } }; enableFormTracking({ options: opts }); ``` * To transform the form fields with an MD5 hashing function: JavaScript (tag): ```javascript function hashMD5(value, _, elt) { // can use elt to make transformation decisions return MD5(value); } var opts = { forms: { allowlist: ["tracked"] }, fields: { filter: function (elt) { return elt.id !== "private"; }, transform: hashMD5 } }; snowplow('enableFormTracking', { options: opts }); ``` Browser (npm): ```javascript import { enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; function hashMD5(value, _, elt) { // can use elt to make transformation decisions return MD5(value); } var options = { forms: { allowlist: ["tracked"] }, fields: { filter: function (elt) { return elt.id !== "private"; }, transform: hashMD5 } }; enableFormTracking({ options }); ``` * To use the bubble-phase event listeners: JavaScript (tag): ```javascript snowplow('enableFormTracking', { options: { useCapture: false } }); ``` Browser (npm): ```javascript import { enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; enableFormTracking({ options: { useCapture: false } }); ``` * ## Tracking forms embedded inside iframes The options for tracking forms inside of iframes are limited – browsers block access to contents of iframes that are from different domains than the parent page. We are not able to provide a solution to track events using trackers initialized on the parent page in such cases. It is possible to track events from forms embedded in iframes loaded from the same domain as the parent page or iframes created using JavaScript on the parent page (e.g. HubSpot forms). In case you are able to access form elements inside an iframe, you can pass them in the `options.forms` argument when calling `enableFormTracking` on the parent page. This will enable form tracking for the specific form elements. The feature may also be used for forms not embedded in iframes, but it's most useful in this particular case. The following example shows how to identify the form elements inside an iframe and pass them to the `enableFormTracking` function: JavaScript (tag): ```javascript let iframe = document.getElementById('form_iframe'); // find the element for the iframe let forms = iframe.contentWindow.document.getElementsByTagName('form'); // find form elements inside the iframe snowplow('enableFormTracking', { options: { forms: forms // pass the embedded forms when enabling form tracking }, }); ``` Browser (npm): ```javascript let iframe = document.getElementById('form_iframe'); // find the element for the iframe let forms = iframe.contentWindow.document.getElementsByTagName('form'); // find form elements inside the iframe enableFormTracking({ options: { forms: forms // pass the embedded forms when enabling form tracking }, }); ``` * Alternatively, you can specify the iframe's `document` as a `target` directly; this will enable form tracking for all forms within the iframe's document: JavaScript (tag): ```javascript let iframe = document.getElementById('form_iframe'); // find the element for the iframe let formDoc = iframe.contentWindow.document; // find iframe document that contains forms snowplow('enableFormTracking', { options: { targets: [document, formDoc] // pass the embedded document when enabling form tracking }, }); ``` Browser (npm): ```javascript let iframe = document.getElementById('form_iframe'); // find the element for the iframe let formDoc = iframe.contentWindow.document; // find iframe document that contains forms enableFormTracking({ options: { targets: [document, formDoc] // pass the embedded document when enabling form tracking }, }); ``` * `targets` can also be used to only track subsets of a document by passing a parent element directly. ## Tracking forms from inside shadow trees Forms created within [shadow trees](https://developer.mozilla.org/en-US/docs/Glossary/Shadow_tree) (e.g. within custom [Web Components](https://developer.mozilla.org/en-US/docs/Web/API/Web_components)) can only be tracked once the user first focuses a field. The plugin relies on composed events to detect the form interactions at the document level. Only `focus` events are considered composed, `change` and `submit` events are not composed and so are not automatically detected by the plugin. When the user focuses a field in a form that is detected as being inside a shadow tree, the event listeners are added directly to the `form` element within the shadow tree in addition to the document-level event listeners in order to track future `change` and `submit` events correctly. If the form has no interactive field elements to first trigger the `focus` event, any `change` or `submit` events that fire will not be tracked. If the shadow root is attached in "closed" mode, no events will be tracked for elements in that shadow tree, only "open" mode is supported. ## Custom context entities Context entities can be sent with all form tracking events by supplying them in an array in the `context` argument. JavaScript (tag): ```javascript snowplow('enableFormTracking', { options: {}, context: [] }); ``` Browser (npm): ```javascript import { enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; enableFormTracking({ options: {}, context: [] }); ``` * These context entities can be dynamic, i.e. they can be traditional self-describing JSON objects, or callbacks that generate valid self-describing JSON objects. For form change events, context generators are passed `(elt, type, value)`, and form submission events are passed `(elt, innerElements)`. A dynamic context could therefore look something like this for form change events: JavaScript (tag): ```javascript let dynamicContext = function (elt, type, value) { // perform operations here to construct the context entity return context; }; snowplow('enableFormTracking', { options: {}, context: [dynamicContext] }); ``` Browser (npm): ```javascript import { enableFormTracking } from '@snowplow/browser-plugin-form-tracking'; var dynamicContext = function (elt, type, value) { // perform operations here to construct the context entity return context; }; enableFormTracking({ options: {}, context: [dynamicContext] }); ``` * --- # Track Google Analytics cookies with the web trackers > Automatically capture Google Analytics cookie values including GA4 and Universal Analytics cookies as context entities on every event. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/ga-cookies/ If this plugin is used, the tracker will look for Google Analytics cookies (GA4/Universal Analytics and "classic" GA; specifically the `_ga` cookie and older `utma`, `utmb`, `utmc`, `utmv`, `__utmz`) and combine their values into event context entities that get sent with every event. GA cookies information is automatically tracked once configured. ## Install plugin JavaScript (tag): | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | Download: | | | | ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-ga-cookies@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-ga-cookies@latest/dist/index.umd.min.js) (latest) | Note: The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. Browser (npm): - `npm install @snowplow/browser-plugin-ga-cookies` - `yarn add @snowplow/browser-plugin-ga-cookies` - `pnpm add @snowplow/browser-plugin-ga-cookies` * ## Initialization JavaScript (tag): ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-ga-cookies@latest/dist/index.umd.min.js", ["snowplowGaCookies", "GaCookiesPlugin"], [pluginOptions] // note: for sp.js, pluginOptions can also be specified when calling `newTracker`; e.g. `contexts: { gaCookies: { ua: true, ga4: false } }` ); ``` Browser (npm): ```javascript import { newTracker, trackPageView } from '@snowplow/browser-tracker'; import { GaCookiesPlugin } from '@snowplow/browser-plugin-ga-cookies'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ GaCookiesPlugin(pluginOptions) ], }); ``` * The `pluginOptions` parameter allows to configure the plugin. Its type is: ```javascript interface GACookiesPluginOptions { ua?: boolean; ga4?: boolean; ga4MeasurementId?: string | string[]; cookiePrefix?: string | string[]; } ``` | Name | Default | Description | | ---------------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | ua | `false` | Send Universal Analytics specific cookie values. | | ga4 | `true` | Send Google Analytics 4 specific cookie values. | | ga4MeasurementId | `""` | Measurement id(s) to search the Google Analytics 4 session cookie. Can be a single measurement id as a string or an array of measurement id strings. The cookie has the form of `_ga_` where `` is the data stream container id and `` is the optional `cookie_prefix` option of the gtag.js tracker. | | cookiePrefix | `[]` | Cookie prefix set on the Google Analytics 4 cookies using the `cookie_prefix` option of the gtag.js tracker. | ## Context entities Adding this plugin will automatically capture the following entities: 1. For GA4 cookies: `iglu:com.google.ga4/cookies/jsonschema/1-0-0` (default) ```json { "_ga": "G-1234", "cookie_prefix": "prefix", "session_cookies": [ { "measurement_id": "G-1234", "session_cookie": "567" } ] } ``` 2. For Universal Analytics cookies: `iglu:com.google.analytics/cookies/jsonschema/1-0-0` (if enabled) ```json { "_ga": "GA1.2.3.4" } ``` --- # Track data out-of-the-box with the web trackers > Track page views, structured events, and self-describing events with automatic context entities and custom timestamps using the web trackers. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/ To track an event, the API is slightly different depending if you're using the JavaScript or Browser version of our web tracker. The main built-in events are [page views](/docs/sources/web-trackers/tracking-events/page-views/) and [page pings](/docs/sources/web-trackers/tracking-events/activity-page-pings/). Here's how to track them: JavaScript (tag): ```javascript ``` Browser (npm): ```javascript import { newTracker, trackPageView, enableActivityTracking } from '@snowplow/browser-tracker'; newTracker('sp', '{{collector_url_here}}', { appId: 'my-app-id', }); enableActivityTracking({ minimumVisitLength: 30, heartbeatDelay: 10 }); trackPageView(); ``` * As well as page views and activity tracking, you can track [custom events](/docs/sources/web-trackers/custom-tracking-using-schemas/), or use [plugins](/docs/sources/web-trackers/plugins/) to track a wide range of other events and entities. ## Add contextual data with entities The tracker can be set up to automatically add [entities](/docs/fundamentals/entities/) to every event sent. Most entity autotracking is specifically configured using plugins, which are imported, enabled, and configured individually. However, you can configure some entities directly when instrumenting the tracker, using the [configuration object](/docs/sources/web-trackers/tracker-setup/initialization-options/). | Entity | Usage | Added by default | JavaScript (tag) tracker | Browser (npm) tracker | | ---------------------------------------------------------------------------------------------------- | -------------------------------- | ---------------- | ------------------------ | --------------------- | | [`webPage`](/docs/sources/web-trackers/tracking-events/page-views/#page-view-id-and-web_page-entity) | UUID for the page view | ✅ | `contexts` config | `contexts` config | | [`session`](/docs/sources/web-trackers/tracking-events/session/) | Data about the current session | ❌ | `contexts` config | `contexts` config | | [`browser`](/docs/sources/web-trackers/tracking-events/browsers/) | Properties of the user's browser | ❌ | `contexts` config | `contexts` config | | [`performanceTiming`](/docs/sources/web-trackers/tracking-events/timings/) | Performance timing metrics | ❌ | `contexts` config | Plugin | | [`gaCookies`](/docs/sources/web-trackers/tracking-events/ga-cookies/) | Extract GA cookie values | ❌ | `contexts` config | Plugin | | [`geolocation`](/docs/sources/web-trackers/tracking-events/timezone-geolocation/) | User's geolocation | ❌ | `contexts` config | Plugin | If you're using the `sp.lite.js` JavaScript tracker distribution, only the `webPage`, `session`, and `browser` entities are available out of the box, as the others require plugins that aren't included in that distribution. You can also attach your own [custom entities](/docs/sources/web-trackers/custom-tracking-using-schemas/) to events. For example, here is a page view with an additional custom entity: JavaScript (tag): ```javascript snowplow('trackPageView', { context: [{ schema: "iglu:com.example_company/page/jsonschema/1-2-1", data: { pageType: 'test', lastUpdated: new Date(2021,04,01) } }] }); ``` Browser (npm): ```javascript trackPageView({ context: [{ schema: 'iglu:com.example_company/page/jsonschema/1-2-1', data: { pageType: 'test', lastUpdated: new Date(2021,04,01) } }] }); ``` * > Note: Tracker methods available through plugins do not necessarily support adding custom entities. For those please refer to the corresponding plugin documentation for details. ## Set event properties Certain event properties, including `domain_userid` or `application_id`, can be set as [atomic properties](/docs/fundamentals/canonical-event/) in the raw event. ### Application ID Set the application ID using the `appId` field of the [tracker configuration object](/docs/sources/web-trackers/tracker-setup/initialization-options/). This will be attached to every event the tracker fires. You can set different application IDs on different parts of your site. You can then distinguish events that occur on different applications by grouping results based on `application_id`. ### Application version > Info: The option to track the application version was introduced in version 4.1 of the JavaScript tracker. Set the application ID using the `appVersion` field of the [tracker configuration object](/docs/sources/web-trackers/tracker-setup/initialization-options/). This will be attached to every event the tracker fires using the [application entity](/docs/events/ootb-data/app-information/#entity-definitions). The version of can be a semver-like structure (e.g 1.1.0) or a Git commit SHA hash. ### Application platform Set the application platform using the `platform` field of the [tracker configuration object](/docs/sources/web-trackers/tracker-setup/initialization-options/). This will be attached to every event the tracker fires. Its default value is `web`. For a list of supported platforms, please see the [Snowplow Tracker Protocol](/docs/fundamentals/canonical-event/#application-fields). ### Business user ID The JavaScript Tracker automatically sets a `domain_userid` based on a first party cookie. Read more about cookies [here](/docs/sources/web-trackers/cookies-and-local-storage/). There are many situations, however, when you will want to identify a specific user using an ID generated by one of your business systems. To do this, you use one of the methods described in this section: `setUserId`, `setUserIdFromLocation`, `setUserIdFromReferrer`, and `setUserIdFromCookie`. Typically, companies do this at points in the customer journey where users identify themselves e.g. if they log in. > Note: This will only set the user ID on further events fired while the user is on this page; if you want events on another page to record this user ID too, you must call `setUserId` on the other page as well. #### `setUserId` `setUserId` is the simplest of the four methods. It sets the business user ID to a string of your choice: JavaScript (tag): ```javascript snowplow('setUserId', 'joe.blogs@email.com'); ``` Browser (npm): ```javascript setUserId('joe.blogs@email.com'); ``` * > Note: `setUserId` can also be called using the alias `identifyUser`. #### `setUserIdFromLocation` `setUserIdFromLocation` lets you set the user ID based on a querystring field of your choice. For example, if the URL is `http://www.mysite.com/home?id=user345`, then the following code would set the user ID to “user345”: JavaScript (tag): ```javascript snowplow('setUserIdFromLocation', 'id'); ``` Browser (npm): ```javascript setUserIdFromLocation('id'); ``` * #### `setUserIdFromReferrer` `setUserIdFromReferrer` functions in the same way as `setUserIdFromLocation`, except that it uses the referrer querystring rather than the querystring of the current page. JavaScript (tag): ```javascript snowplow('setUserIdFromReferrer', 'id'); ``` Browser (npm): ```javascript setUserIdFromReferrer('id'); ``` * #### `setUserIdFromCookie` Use `setUserIdFromCookie` to set the value of a cookie as the user ID. For example, if you have a cookie called “cookieid” whose value is “user123”, the following code would set the user ID to “user123”: JavaScript (tag): ```javascript snowplow('setUserIdFromCookie', 'cookieid'); ``` Browser (npm): ```javascript setUserIdFromCookie('cookieid'); ``` * ### Custom page URL and referrer URL The Snowplow JavaScript Tracker automatically tracks the page URL and referrer URL on any event tracked. However, in certain situations, you may want to override the one or both of these URLs with a custom value. For example, this might be desirable if your CMS spits out particularly ugly URLs that are hard to unpick at analysis time. To set a custom page URL, use the `setCustomUrl` method: JavaScript (tag): ```javascript snowplow('setCustomUrl', 'http://mysite.com/checkout-page'); ``` Browser (npm): ```javascript setCustomUrl('http://mysite.com/checkout-page'); ``` * To set a custom referrer, use the `setReferrerUrl` method: JavaScript (tag): ```javascript snowplow('setReferrerUrl', 'http://custom-referrer.com'); ``` Browser (npm): ```javascript setReferrerUrl('http://custom-referrer.com'); ``` * > Tip: On an SPA, the page URL might change without the page being reloaded. Whenever an event is fired, the Tracker checks whether the page URL has changed since the last event. If it has, the page URL is updated and the URL at the time of the last event is used as the referrer. If you use `setCustomUrl`, the page URL will no longer be updated in this way. Similarly if you use `setReferrerUrl`, the referrer URL will no longer be updated in this way. > > To use `setCustomUrl` within an SPA, call it before all `trackPageView` calls. > > If you want to ensure that the original referrer is preserved even though your page URL can change without the page being reloaded, use `setReferrerUrl` like this before sending any events: > > JavaScript (tag): > > ```javascript > snowplow('setReferrerUrl', document.referrer); > ``` > > Browser (npm): > > ```javascript > setReferrerUrl(document.referrer); > ``` > > * ### Custom timestamp Snowplow events have several [timestamps](/docs/events/timestamps/). Every `trackX...()` method in the tracker allows for a custom timestamp, called `trueTimestamp` to be set. In certain circumstances you might want to set the timestamp yourself e.g. if the JS tracker is being used to process historical event data, rather than tracking the events live. In this case you can set the `true_timestamp` for the event. To set the true timestamp add an extra argument to your track method: `{type: 'ttm', value: unixTimestampInMs}`. This example shows how to set a true timestamp for a page view event: JavaScript (tag): ```javascript snowplow('trackPageView', { timestamp: { type: 'ttm', value: 1361553733371 } }); ``` Browser (npm): ```javascript trackPageView({ timestamp: { type: 'ttm', value: 1361553733371 } }); ``` * E.g. to set a true timestamp for a self-describing event: JavaScript (tag): ```javascript snowplow('trackSelfDescribingEvent', { event: { schema: 'iglu:com.acme_company/viewed_product/jsonschema/2-0-0', data: { productId: 'ASO01043', category: 'Dresses', brand: 'ACME', returning: true, price: 49.95, sizes: ['xs', 's', 'l', 'xl', 'xxl'], availableSince: new Date(2013,3,7) } }, timestamp: { type: 'ttm', value: 1361553733371 } }); ``` Browser (npm): ```javascript trackSelfDescribingEvent({ event: { schema: 'iglu:com.acme_company/viewed_product/jsonschema/2-0-0', data: { productId: 'ASO01043', category: 'Dresses', brand: 'ACME', returning: true, price: 49.95, sizes: ['xs', 's', 'l', 'xl', 'xxl'], availableSince: new Date(2013,3,7) } }, timestamp: { type: 'ttm', value: 1361553733371 } }); ``` * ## Get event properties It's possible to retrieve certain identifiers and properties for use in your code. You'll need to use a callback for the JavaScript tracker. JavaScript (tag): If you call `snowplow` with a function as the argument, the function will be executed when `sp.js` loads: ```javascript snowplow(function () { console.log("sp.js has loaded"); }); ``` Or equivalently: ```javascript snowplow(function (x) { console.log(x); }, "sp.js has loaded"); ``` The callback you provide is executed as a method on the internal `trackerDictionary` object. You can access the `trackerDictionary` using `this`. ```javascript // Configure a tracker instance named "sp" snowplow('newTracker', 'sp', '{{COLLECTOR_URL}', { appId: 'snowplowExampleApp' }); // Access the tracker instance inside a callback snowplow(function () { var sp = this.sp; var domainUserId = sp.getDomainUserId(); console.log(domainUserId); }) ``` The callback function shouldn't be a method: ```javascript // TypeError: Illegal invocation snowplow(console.log, "sp.js has loaded"); ``` This won't work because the value of `this` in the `console.log` function will be the `trackerDictionary`, rather than `console`. You can get around this problem using `Function.prototoype.bind` as follows: ```javascript snowplow(console.log.bind(console), "sp.js has loaded"); ``` For more on execution context in JavaScript, see the [MDN page](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/this). Browser (npm): When initialising a tracker, you can use the returned `tracker` instance to access various properties from this tracker instance. ```javascript // Configure a tracker instance named "sp" const sp = newTracker('sp', '{{COLLECTOR_URL}', { appId: 'snowplowExampleApp' }); // Access the tracker properties const domainUserId = sp.getDomainUserId(); ``` * ### Cookie values You can [retrieve cookie values](/docs/sources/web-trackers/cookies-and-local-storage/getting-cookie-values/) using the `getDomainUserInfo` and other getters, or from the cookies directly. ### Page view ID When the JavaScript Tracker loads on a page, it generates a new [page view UUID](/docs/sources/web-trackers/tracking-events/page-views/). To get this page view ID, use the `getPageViewId` method: JavaScript (tag): ```javascript // Access the tracker instance inside a callback snowplow(function () { var sp = this.sp; var pageViewId = sp.getPageViewId(); console.log(pageViewId); }) ``` Browser (npm): ```javascript const pageViewId = sp.getPageViewId(); console.log(pageViewId); ``` * ### Business user ID The `getUserId` method returns the user ID which you configured using `setUserId()`: JavaScript (tag): ```javascript // Access the tracker instance inside a callback snowplow(function () { var sp = this.sp; var userId = sp.getUserId(); console.log(userId); }) ``` Browser (npm): ```javascript const userId = sp.getUserId(); console.log(userId); ``` * ### Tab ID If you've enabled the [`browser` entity](/docs/sources/web-trackers/tracking-events/browsers/), you can get the tab ID using the `getTabId` method. It's a UUID identifier for the specific browser tab the event is sent from. JavaScript (tag): ```javascript // Access the tracker instance inside a callback snowplow(function () { var sp = this.sp; var tabId = sp.getTabId(); console.log(tabId); }) ``` Browser (npm): ```javascript const tabId = sp.getTabId(); console.log(tabId); ``` * --- # Track link clicks on web > Automatically track clicks on anchor elements with configurable filters, pseudo-click support, and optional content capture for href destinations. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/link-click/ Link click tracking enables click tracking for all anchor/link elements (HTML `` elements). Link clicks are tracked as self describing events with [the link\_click schema](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/link_click/jsonschema/1-0-1). Each link click event captures the link’s href attribute. The event also has fields for the link’s id, classes, and target (where the linked document is opened, such as a new tab or new window). Link click events are automatically tracked once configured. ## Install plugin JavaScript (tag): | Tracker Distribution | Included | | -------------------- | -------- | | `sp.js` | ✅ | | `sp.lite.js` | ❌ | Download: | | | | ------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- | | Download from GitHub Releases (Recommended) | [Github Releases (plugins.umd.zip)](https://github.com/snowplow/snowplow-javascript-tracker/releases) | | Available on jsDelivr | [jsDelivr](https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-link-click-tracking@latest/dist/index.umd.min.js) (latest) | | Available on unpkg | [unpkg](https://unpkg.com/@snowplow/browser-plugin-link-click-tracking@latest/dist/index.umd.min.js) (latest) | Note: The links to the CDNs above point to the current latest version. You should pin to a specific version when integrating this plugin on your website if you are using a third party CDN in production. Browser (npm): - `npm install @snowplow/browser-plugin-link-click-tracking` - `yarn add @snowplow/browser-plugin-link-click-tracking` - `pnpm add @snowplow/browser-plugin-link-click-tracking` * ## Toggle link click tracking Turn on link click tracking like this: JavaScript (tag): ```javascript window.snowplow('addPlugin', "https://cdn.jsdelivr.net/npm/@snowplow/browser-plugin-link-click-tracking@latest/dist/index.umd.min.js", ["snowplowLinkClickTracking", "LinkClickTrackingPlugin"] ); snowplow('enableLinkClickTracking'); ``` Browser (npm): Initialize your tracker with the plugin. ```javascript import { newTracker } from '@snowplow/browser-tracker'; import { LinkClickTrackingPlugin, enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; newTracker('sp1', '{{collector_url}}', { appId: 'my-app-id', plugins: [ LinkClickTrackingPlugin() ], }); enableLinkClickTracking(); ``` * Use this method once and the tracker will add click event listeners to the document to detect clicks on anchor elements. An optional, but recommended, parameter is `pseudoClicks`. If this isn't turned on, Firefox won't recognize middle clicks. However, when configured, there is a small possibility of false positives (click events firing when they shouldn't). JavaScript (tag): ```javascript snowplow('enableLinkClickTracking', { pseudoClicks: true }); ``` Browser (npm): ```javascript import { enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; enableLinkClickTracking({ pseudoClicks: true }); ``` * This is its signature (where `?` is an optional property): JavaScript (tag): ```javascript snowplow('enableLinkClickTracking', { options?: FilterCriterion, pseudoClicks?: boolean, trackContent?: boolean context?: SelfDescribingJson[] }); ``` Browser (npm): ```javascript import { enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; enableLinkClickTracking({ options?: FilterCriterion, pseudoClicks?: boolean, trackContent?: boolean context?: SelfDescribingJson[] }); ``` * To stop tracking link events, call `disableLinkClickTracking`: JavaScript (tag): ```javascript snowplow('disableLinkClickTracking'); ``` Browser (npm): ```javascript import { disableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; disableLinkClickTracking(); ``` * ## Refresh link click tracking In previous versions, the `enableLinkClickTracking` method only tracked clicks on links that existed in the document at the time it was called. If new links were added to the page after that, you had to use `refreshLinkClickTracking` to add Snowplow click listeners to any new links. From v4, this method is deprecated and has no effect; event listeners are now added directly to the document rather than to individual link elements and new links should automatically be tracked with no action required. ## Configuration Control which links to track using the FilterCriterion object. Where FilterCriterion is an object: ```javascript interface FilterCriterion { / A collection of class names to include */ allowlist?: string[]; /** A collector of class names to exclude */ denylist?: string[]; /** A callback which returns a boolean as to whether the element should be included / filter?: (elt: HTMLElement) => boolean; } ``` You can control which links are tracked using the second argument. There are three ways to do this: a denylist, an allowlist, and a filter function. ### Denylist This is an array of CSS classes which should be ignored by link click tracking. For example, the below code will stop link click events firing for links with the class "barred" or "untracked", but will fire link click events for all other links: JavaScript (tag): ```javascript snowplow('enableLinkClickTracking', { options: { denylist: ['barred', 'untracked'] } }); // If there is only one class name you wish to deny, // you should still put it in an array snowplow('enableLinkClickTracking', { options: { 'denylist': ['barred'] } }); ``` Browser (npm): ```javascript import { enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; enableLinkClickTracking({ options: { denylist: ['barred', 'untracked'] } }); // If there is only one class name you wish to deny, // you should still put it in an array enableLinkClickTracking({ options: { 'denylist': ['barred'] } }); ``` ### Allowlist The opposite of a denylist. This is an array of the CSS classes of links which you do want to be tracked. Only clicks on links with a class in the list will be tracked. JavaScript (tag): ```javascript snowplow('enableLinkClickTracking', { options: { 'allowlist': ['unbarred', 'tracked'] } }); // If there is only one class name you wish to allow, // you should still put it in an array snowplow('enableLinkClickTracking', { options: { 'allowlist': ['unbarred'] } }); ``` Browser (npm): ```javascript import { enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; enableLinkClickTracking({ options: { 'allowlist': ['unbarred', 'tracked'] } }); // If there is only one class name you wish to allow, // you should still put it in an array enableLinkClickTracking({ options: { 'allowlist': ['unbarred'] } }); ``` * ### Filter function You can provide a filter function which determines which links should be tracked. The function should take one argument, the link element, and return either 'true' (in which case clicks on the link will be tracked) or 'false' (in which case they won't). The following code will track clicks on those and only those links whose id contains the string "interesting": JavaScript (tag): ```javascript function myFilter (linkElement) { return linkElement.id.indexOf('interesting') > -1; } snowplow('enableLinkClickTracking', { options: { 'filter': myFilter } }); ``` Browser (npm): ```javascript import { enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; function myFilter (linkElement) { return linkElement.id.indexOf('interesting') > -1; } enableLinkClickTracking({ options: { 'filter': myFilter } }); ``` * Another optional parameter is `trackContent`. Set it to `true` if you want link click events to capture the innerHTML of the clicked link: JavaScript (tag): ```javascript snowplow('enableLinkClickTracking', { trackContent: true }); ``` Browser (npm): ```javascript import { enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; enableLinkClickTracking({ trackContent: true }); ``` * The innerHTML of a link is all the text between the `a` tags. Note that if you use a base 64 encoded image as a link, the entire base 64 string will be included in the event. Each link click event will include (if available) the destination URL, id, classes and target of the clicked link. (The target attribute of a link specifies a window or frame where the linked document will be loaded.) Context `enableLinkClickTracking` can also be passed an array of custom context entities to attach to every link click event as an additional final parameter. Link click tracking supports dynamic context entities. Callbacks passed in the context argument will be evaluated with the source element passed as the only argument. The self-describing JSON context object returned by the callback will be sent with the link click event. A dynamic context could therefore look something like this for link click events: JavaScript (tag): ```javascript let dynamicContext = function (element) { // perform operations here to construct the context return context; }; snowplow('enableLinkClickTracking', { context: [dynamicContext] }); ``` Browser (npm): ```javascript import { enableLinkClickTracking } from '@snowplow/browser-plugin-link-click-tracking'; let dynamicContext = function (element) { // perform operations here to construct the context return context; }; enableLinkClickTracking({ context: [ dynamicContext ] }); ``` * See [this page](/docs/sources/web-trackers/custom-tracking-using-schemas/) for more information about tracking context entities. ## Manual link click tracking You can manually track individual link click events with the `trackLinkClick` method. You do not need to call `enableLinkClickTracking` before using this method. This is its signature: JavaScript (tag): ```javascript snowplow('trackLinkClick, { / The target URL of the link */ targetUrl: string; /** The ID of the element clicked if present */ elementId?: string; /** An array of class names from the element clicked */ elementClasses?: Array; /** The target value of the element if present */ elementTarget?: string; /** The content of the element if present and enabled */ elementContent?: string; }); ``` Browser (npm): ```javascript import { trackLinkClick } from '@snowplow/browser-plugin-link-click-tracking'; trackLinkClick({ /** The target URL of the link */ targetUrl: string; /** The ID of the element clicked if present */ elementId?: string; /** An array of class names from the element clicked */ elementClasses?: Array; /** The target value of the element if present */ elementTarget?: string; /** The content of the element if present and enabled / elementContent?: string; }); ``` Of these arguments, only `targetUrl` is required. This is how to use `trackLinkClick`: JavaScript (tag): ```javascript snowplow('trackLinkClick', { targetUrl: 'http://www.example.com', elementId: 'first-link', elementClasses: ['class-1', 'class-2'], elementTarget: '', elementContent: 'this page' }); ``` Browser (npm): ```javascript import { trackLinkClick } from '@snowplow/browser-plugin-link-click-tracking'; trackLinkClick({ targetUrl: 'http://www.example.com', elementId: 'first-link', elementClasses: ['class-1', 'class-2'], elementTarget: '', elementContent: 'this page' }); ``` * Rather than specify the values explicitly, you may also supply the link element directly (and optionally control whether to include the content or not): JavaScript (tag): ```javascript snowplow('trackLinkClick', { element: document.links[0], trackContent: false }); ``` Browser (npm): ```javascript import { trackLinkClick } from '@snowplow/browser-plugin-link-click-tracking'; trackLinkClick({ element: document.links[0], trackContent: false }); ``` *** --- # HTML5 media tracking on web > Automatically track HTML5 video and audio elements with media events including play, pause, seek, buffer, and progress milestones. > Source: https://docs.snowplow.io/docs/sources/web-trackers/tracking-events/media/html5/ This plugin enables the automatic tracking of HTML5 media elements (`