Skip to main content

Defining attributes

Attributes are defined as part of attribute groups. To create an attribute, you'll need to set:

  • A name, ideally one that describes the attribute
  • Which event schema to calculate it from
  • What property in the schema to consider for the calculation
  • What kind of aggregation you want to calculate over time, e.g. mean or last
  • What type of value you want the attribute to hold, e.g. double or string

Attribute calculation starts when the definitions are published, and aren't backdated.

Minimal example

This is the minimum configuration needed to create an attribute:

from snowplow_signals import Attribute, Event

my_attribute = Attribute(
name="button_click_counter",
type="int32",
events=[
Event(
vendor="com.snowplowanalytics.snowplow",
name="button_click",
version="1-0-0",
)
],
aggregation="counter"
)

Once applied and active, this attribute definition will trigger every time Signals processes an event with the schema iglu:com.snowplowanalytics.snowplow/button_click/jsonschema/1-0-0. Starting from 0, the stored attribute will be an integer value that increases by 1 with every button_click event.

Options

The table below lists all available arguments for an Attribute:

ArgumentDescriptionTypeRequired?
nameThe name of the attributestring
descriptionThe description of the attributestring
eventsList of Snowplow Events that the attribute is calculated onList of Event type
aggregationThe calculation to be performedOne of: counter, sum, min, max, mean, first, last, unique_list
typeThe type of the aggregation resultOne of: bytes, string, int32, int64, double, float, bool, unix_timestamp, bytes_list, string_list, int32_list, int64_list, double_list, float_list, bool_list, unix_timestamp_list,
criteriaList of Criteria to filter the eventsList of Criteria type
propertyThe property of the event or entity you wish to use in the aggregationstring
periodThe time period window over which the aggregation should be calculatedPython timedelta
default_valueThe default value to use if the aggregation returns no results. If not set, the default value is automatically assigned based on the type.

Specifying events

The events list describes the types of events that the attribute is calculated from. They're references to Snowplow events that exist in your Snowplow account, based on event schemas.

An Event accepts the following parameters:

ArgumentDescriptionType
nameevent_name column in atomic.events tablestring
vendorevent_vendor column in atomic.events tablestring
versionevent_version column in atomic.events tablestring

Use the following details for Snowplow page view, page ping, or structured events:

# Page view event
sp_page_view=Event(
name="page_view",
vendor="com.snowplowanalytics.snowplow",
version="1-0-0"
)

# Page ping event
sp_page_ping=Event(
name="page_ping",
vendor="com.snowplowanalytics.snowplow",
version="1-0-0"
)

# Structured event
sp_structured=Event(
name="event",
vendor="com.google.analytics",
version="1-0-0"
)

All of these parameters are optional, and work like wildcards.

To calculate an attribute for version 2-0-2 only of an event data structure with the schema iglu:com.snowplowanalytics.snowplow.media/destination/jsonschema/2-0-2, the Event would be:

event=Event(
name="destination",
vendor="com.snowplowanalytics.snowplow.media",
version="2-0-2"
)

To calculate an attribute for all versions of that event:

event=Event(
name="destination",
vendor="com.snowplowanalytics.snowplow.media",
)

To calculate an attribute for all events for a specific vendor:

event=Event(
vendor="com.snowplowanalytics.snowplow.media",
)

Filtering events by specific values

The criteria list contains the filters used to calculate an attribute from the specified event.

It allows you to be specific about which subsets of events should trigger attribute updates. For example, instead of counting all page views in a user's session, you may wish to calculate only views for an FAQs page, or a "contact us" page.

The criteria list takes a Criteria type, with possible arguments:

ArgumentDescriptionType
allAll criterion conditions must be metlist of Criterion
anyAt least one of the criterion conditions must be metlist of Criterion

A Criterion specifies a filtering rule on the specified event.

When creating a Criterion, you can use its operator-like methods to define the filtering for a property of the event payload. The available methods are:

  • .eq checks equality similar to using the = operator
  • .neq checks non-equality similar to using the != operator
  • .gt checks if a property is greater than a value
  • .gte checks if a property is greater than or equal to a value
  • .lt checks if a property is less than a value
  • .lte checks if a property is less than or equal to a value
  • .like checks if a property matches a value using a LIKE operator
  • .in_list checks if a property is in the list of values using an IN operator
  • These methods accept the property as a first argument and then the value to filter against.
ArgumentDescriptionType
propertyThe property of the event payload you want the criterion to run againstAtomicProperty, EventProperty, or EntityProperty
valueThe value to compare the property toType-checked based on the operator

The AtomicProperty, EventProperty and EntityProperty are classes to help you target a property of an event to be filtered. Use:

  • AtomicProperty to target atomic properties in the event payload
  • EventProperty to target properties in the event data structure in the event payload
  • EntityProperty to target properties in the entity data structures in the event payload

Some examples:

# Targets the app_id atomic property in the event payload
AtomicProperty(name="app_id")

# Targets the action property of the com.example/test_event/jsonschema/1-*-*
EventProperty(
vendor="com.example",
name="test_event",
major_version=1,
path="action"
)

# Targets the age property of the first com.example/user_context/jsonschema/1-*-* entity
EntityProperty(
vendor="com.example",
name="user_context",
major_version=1,
path="age"
)

For a more complete example, if you want to calculate an attribute for page views of either the FAQs or "contact us" page, the Criteria could be:

criteria=Criteria(
any=[
Criterion.like(
AtomicProperty(name="page_url"),
"%/faq%"
),
Criterion.like(
AtomicProperty(name="page_url"),
"%/contact-us%"
)
]
)

The page_url property is from the built-in atomic event properties in all Snowplow events.

Extended examples

These examples use all the available configuration options.

Example 1

This example extends the previous minimal example. Now the attribute is only calculated for button_click events where the id property is equal to generate_emoji_btn.

from snowplow_signals import Attribute, Event, Criteria, Criterion, EventProperty
from datetime import timedelta

button_click_counter_attribute = Attribute(
name="emoji_button_click_counter",
description="The number of clicks for the 'generate emoji' button",
type="int32",
events=[
Event(
vendor="com.snowplowanalytics.snowplow",
name="button_click",
version="1-0-0",
)
],
aggregation="counter",
criteria=Criteria(
all=[
Criterion.eq(
EventProperty(
vendor="com.snowplowanalytics.snowplow",
name="button_click",
major_version=1,
path="id"
),
"generate_emoji_btn"
)
]
),
period=timedelta(minutes=10),
default_value=0
)

The attribute will be calculated over the last 10 minutes as a rolling window.

Example 2

Here's a new example showing how to use the property option to access one of the atomic event properties, in this case mkt_medium.

In this example the attribute is calculated from either a page view or a custom event:

from snowplow_signals import Attribute, Event

referrer_source_attribute = Attribute(
name="referrer_source",
description="Referrer",
type="string",
events=[
Event(
name="page_view",
vendor="com.snowplowanalytics.snowplow",
version="1-0-0"
),
Event(
name="login_landing",
vendor="com.business.example",
version="1-0-0"
)
],
aggregation="last",
criteria=None,
property=AtomicProperty(name="mkt_medium"),
period=None,
default_value=None
)

This attribute will be updated to the most recent referrer URL every time a page view or login_landing event is processed.

Example 3

This example shows how to use the property option to access values in any part of a tracked event.

Tthe attribute is based on product prices, tracked within the product entity in an ecommerce transaction event:

from snowplow_signals import Attribute, Event, Criteria, Criterion, EventProperty

my_new_attribute = Attribute(
name="products_total_purchase_value",
description="Total purchase value for all products",
type="int64",
events=[
Event(
vendor="com.snowplowanalytics.snowplow.ecommerce",
name="snowplow_ecommerce_action",
version="1-0-2",
)
],
aggregation="sum",
criteria=Criteria(
all=[
Criterion.eq(
EntityProperty(
vendor="com.snowplowanalytics.snowplow.ecommerce",
name="product",
major_version=1,
index=[0],
path="price"
),
"transaction"
)
]
),
property=EntityProperty(
vendor="com.snowplowanalytics.snowplow.ecommerce",
name="product",
major_version=1,
path="price",
),
period=None,
default_value=0
)

This attribute will be calculated for Snowplow ecommerce events with schema iglu:com.snowplowanalytics.snowplow.ecommerce/snowplow_ecommerce_action/jsonschema/1-0-2 and the type property transaction. The products in a transaction are stored as product entities in the contexts_com_snowplowanalytics_snowplow_ecommerce_product_1 array. The price property of the first product is used to calculate the attribute.