Understanding events
An event is something that occurred at a particular point in time. Examples of events include:
- Load a web page
- Add an item to basket
- Enter a destination
- Check a balance
- Search for an item
- Share a video
Kinds of events
At a high level, there are 3 kinds of Snowplow events:
- Standard events that are very common and therefore “baked in”, for example, page views
- Structured events, which you can supplement with some (limited) custom information
- Self-describing events, which can include arbitrarily complex data, described by a schema
You can create all of these by using various tracking SDKs.
In the data warehouse, all 3 kinds of events share a number of standard columns, such as timestamps. That said, event-specific data will be stored differently, as explained below. See also what Snowplow data looks like.
Out-of-the-box and custom events
Snowplow supports a large number of events out of the box, for example:
- Page views and screen views
- Page pings
- Link clicks
- Form fill-ins (for the web)
- Form submissions
- Transactions
Some of these are baked-in events, while others (link clicks, for example) are self-describing events predefined by the Snowplow team. Tracking SDKs usually provide a dedicated API to create out-of-the-box events, regardless of their kind.
You can also create custom events to match your business requirements. For that purpose, you can either define your own self-describing events (recommended), or use structured events.
Out-of-the-box events | Custom events |
---|---|
Baked-in events | Structured events |
Self-describing events (predefined by Snowplow) | Self-describing events (defined by you) |
Baked-in events
The following events are “baked in”. They get special treatment because they are very common:
- Page views (`page_view`)
- Page pings (`page_ping`)
- E-commerce transactions (`transaction` and `transaction_item`)
The `transaction` and `transaction_item` events are not very convenient to use and exist mostly for legacy reasons. One of their significant downsides is that you have to send a separate event for the transaction itself and then an event for each of the order items in that transaction (as opposed to including all items in a single event).
Over the years, it has become more idiomatic to use entities for order items in e-commerce transactions. For instance, our E-commerce Accelerator uses this approach.
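To make the difference concrete, here is a minimal sketch in plain JavaScript that builds the payloads each approach would produce. It does not call any tracker API. The `order` object, the helper names and the `iglu:com.acme_company/...` schema references are all hypothetical and exist only for illustration.

```javascript
// Hypothetical order used for illustration only.
const order = {
  orderId: 'ORD-1001',
  total: 84.9,
  items: [
    { sku: 'ASO01043', price: 49.95, quantity: 1 },
    { sku: 'ASO02077', price: 34.95, quantity: 1 },
  ],
};

// Legacy style: one transaction event plus one event per order item.
function buildLegacyEvents(order) {
  const transaction = {
    type: 'transaction',
    orderId: order.orderId,
    total: order.total,
  };
  const items = order.items.map((item) => ({
    type: 'transaction_item',
    orderId: order.orderId, // repeated so the item can be linked back to its transaction
    ...item,
  }));
  return [transaction, ...items];
}

// Entity style: a single event with one entity attached per order item.
function buildEventWithEntities(order) {
  return {
    event: {
      schema: 'iglu:com.acme_company/order_placed/jsonschema/1-0-0', // hypothetical
      data: { orderId: order.orderId, total: order.total },
    },
    entities: order.items.map((item) => ({
      schema: 'iglu:com.acme_company/order_item/jsonschema/1-0-0', // hypothetical
      data: item,
    })),
  };
}
```

With the legacy approach, the number of events grows with the number of order items. With entities, the order always arrives as one event, so downstream consumers never see a transaction with some of its items still in flight.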
Tracking and storage format
Snowplow tracking SDKs provide a dedicated API for these events. For example, if you want to track a page view using the JavaScript tracker:
window.snowplow('trackPageView');
In the data warehouse, any event-specific information for these events will be in standard columns (in the Snowplow `events` table). You can find those listed here.
Structured events
We recommend using self-describing events instead of structured events whenever possible. While structured events are simpler to create (as you don’t need to define a schema), they have a number of disadvantages:
| | Structured events | Self-describing events |
---|---|---|
Format | ❌ Data must fit the 5 fields below | ✅ JSON, as complex as you want |
Validation | ❌ No validation (beyond field types) | ✅ Schema includes validation criteria |
Meaning | ❌ Can only infer what each field represents | ✅ Schema includes field descriptions |
Structured events have 5 fields:
- Category: The name for the group of objects you want to track
- Action: A string that is used to define the user action for the category of object
- Label: An optional string which identifies the specific object being actioned
- Property: An optional string describing the object or the action performed on it
- Value: An optional numeric value that quantifies or further describes the user action
Tracking and storage format
To track a structured event, use one of the tracking SDKs. For example, with the JavaScript tracker:
snowplow('trackStructEvent', {
  category: 'Product',
  action: 'View',
  label: 'ASO01043',
  property: 'Dress',
  value: 49.95
});
In the data warehouse, these events still use the standard columns for general information, like timestamps. In addition, the above fields for all structured events are stored in a set of 5 standard columns. See the structure of Snowplow data for more information.
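As a rough illustration of that mapping, the sketch below turns the five structured event fields into a row fragment. It assumes the standard column names `se_category`, `se_action`, `se_label`, `se_property` and `se_value`; check the structure of Snowplow data for your warehouse before relying on them. The helper `toStructColumns` is hypothetical and is not part of any tracker or loader.

```javascript
// Assumed mapping from structured event fields to warehouse columns.
const STRUCT_EVENT_COLUMNS = {
  category: 'se_category',
  action: 'se_action',
  label: 'se_label',
  property: 'se_property',
  value: 'se_value',
};

// Hypothetical helper: converts tracker arguments into a column -> value fragment.
function toStructColumns(fields) {
  const columns = {};
  for (const [field, column] of Object.entries(STRUCT_EVENT_COLUMNS)) {
    // Optional fields that were not provided are represented as null.
    columns[column] = fields[field] ?? null;
  }
  return columns;
}

const exampleRow = toStructColumns({
  category: 'Product',
  action: 'View',
  label: 'ASO01043',
  property: 'Dress',
  value: 49.95,
});
```

Because every structured event shares the same five columns, a value such as `'Dress'` in the property column carries no meaning on its own. Anyone querying the data has to know the convention the tracking code followed.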
Self-describing events
Self-describing events used to be called “unstructured events”, to distinguish them from structured events. However, this was misleading, because in a way these events are actually more structured than structured events 🤯. The old term is now deprecated, but you might still see it in some docs, APIs and database column names.
Self-describing events can include arbitrarily complex data, as defined by the event’s schema. We call them “self-describing” because these events include a reference to their schema.
Because the event references its schema (in a particular version!), it’s always clear to the downstream users and applications what each field in the event means, even if your definition of the event changes over time.
Each self-describing event consists of two parts:
- A reference to a schema that describes the name, version and structure of the event
- A set of key-value properties in JSON format — the data associated with the event
This structure is an example of what we call self-describing JSON: a JSON object with a `schema` and a `data` field.
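The sketch below shows what such an object looks like and how a consumer could read the schema reference. It assumes the reference follows the form `iglu:<vendor>/<name>/<format>/<model>-<revision>-<addition>`. The function `parseSchemaReference` is a hypothetical helper written for this example and is not a Snowplow API.

```javascript
// Matches iglu:<vendor>/<name>/<format>/<model>-<revision>-<addition>
const IGLU_URI = /^iglu:([^\/]+)\/([^\/]+)\/([^\/]+)\/(\d+)-(\d+)-(\d+)$/;

// Hypothetical helper: splits the schema reference of a self-describing JSON into its parts.
function parseSchemaReference(selfDescribingJson) {
  const match = IGLU_URI.exec(selfDescribingJson.schema);
  if (match === null) {
    throw new Error(`Not a valid schema reference: ${selfDescribingJson.schema}`);
  }
  const [, vendor, name, format, model, revision, addition] = match;
  return {
    vendor,
    name,
    format,
    version: {
      model: Number(model),
      revision: Number(revision),
      addition: Number(addition),
    },
  };
}

// A self-describing JSON: a schema reference plus the data it describes.
const viewedProduct = {
  schema: 'iglu:com.acme_company/viewed_product/jsonschema/2-0-0',
  data: { productId: 'ASO01043', brand: 'ACME' },
};

const reference = parseSchemaReference(viewedProduct);
```

Because the version travels with every event, a consumer can branch on `reference.version.model` and handle old and new shapes of the same event side by side.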
Tracking and storage format
Some self-describing events were predefined by Snowplow and are natively supported by tracking SDKs. For example, the mobile trackers automatically send self-describing screen view events. You can find the schemas for these events here.
To track your own custom self-describing event, e.g. `viewed_product`, you will first need to define its schema (see managing data structures). This schema might have fields such as `productId`, `brand`, etc.
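As an illustration only, a first version of such a schema could look roughly like the object below, written here as a JavaScript literal. The field list, descriptions and constraints are hypothetical, and the version is `1-0-0` because this is a first draft. The sketch assumes the usual self-describing JSON Schema layout, with a `self` block that names the vendor, event, format and version. The helper `schemaUri` is also hypothetical.

```javascript
// Hypothetical schema for the viewed_product event; fields and constraints are illustrative.
const viewedProductSchema = {
  $schema: 'http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#',
  description: 'A user viewed a product page',
  self: {
    vendor: 'com.acme_company',
    name: 'viewed_product',
    format: 'jsonschema',
    version: '1-0-0',
  },
  type: 'object',
  properties: {
    productId: { type: 'string', description: 'Unique product identifier' },
    brand: { type: ['string', 'null'], description: 'Brand name, if known' },
  },
  required: ['productId'],
  additionalProperties: false,
};

// Hypothetical helper: derives the reference an event would carry from the "self" block.
function schemaUri({ vendor, name, format, version }) {
  return `iglu:${vendor}/${name}/${format}/${version}`;
}

const viewedProductUri = schemaUri(viewedProductSchema.self);
```

The `description` entries are what give each field its meaning for downstream users, and the `type` and `required` entries are the validation criteria mentioned in the comparison with structured events.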
Then you can use one of our tracking SDKs. For example, with the JavaScript tracker:
window.snowplow('trackSelfDescribingEvent', {
  event: {
    schema: 'iglu:com.acme_company/viewed_product/jsonschema/2-0-0',
    data: {
      productId: 'ASO01043',
      category: 'Dresses',
      brand: 'ACME',
      returning: true,
      price: 49.95,
      sizes: ['xs', 's', 'l', 'xl', 'xxl'],
      availableSince: new Date(2013,3,7)
    }
  }
});
In the data warehouse, these events still use the standard columns for general information, like timestamps. In addition, each type of self-describing event gets its own column (or its own table, in the case of Redshift) for event-specific fields defined in its schema. See the structure of Snowplow data for more information.