Referrer parser enrichment

This enrichment uses the Snowplow referer-parser library to extract attribution data from referrer URLs.

This is particularly useful when looking for traffic from specific search engine providers or social networks.

Configuration

The enrichment takes these parameters:

| Parameter | Required | Description |
| --- | --- | --- |
| internalDomains | | Subdomains to classify as Internal traffic sources. |
| database | | Filename of the referer-parser database. Already provided for CDI customers. |
| uri | | URI of the bucket containing the database file. Already provided for CDI customers. |
| referrers | | Custom referrer-to-category mappings, taking precedence over the database. |

Configure the parameters in the Console enrichment editor. Keep the Console default for the database and uri fields. For example:

json
{
  "internalDomains": [],
  "database": "<use default value from Console>",
  "uri": "<use default value from Console>",
  "referrers": {
    "search": {
      "Search website 1": {
        "domains": ["search.acme.com"],
        "parameters": ["q"]
      },
      "Search website 2": {
        "domains": ["search.acmebis.com"]
      }
    },
    "social": {
      "Social website": {
        "domains": ["social.acme.com"]
      }
    }
  }
}

Testing with Micro

Unsure if your enrichment configuration is correct or works as expected? You can easily test it using Snowplow Micro, either through Console or on your machine.

internalDomains

Use this property to list the subdomains to classify as Internal traffic sources.

json
"internalDomains": [
  "community.snowplow.io",
  "docs.snowplow.io"
],

Note

The enrichment also sets refr_medium to Internal whenever an event's page_urlhost matches its refr_urlhost, regardless of the configured internalDomains.

This behavior isn't configurable. To change it, handle these events in your data models or with a JavaScript enrichment.
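
The rule described in this note can be pictured with a short sketch. It is not the Enrich implementation: the function name `is_internal` is hypothetical, and the exact-match comparison against internalDomains is an assumption made for illustration.

```python
def is_internal(page_urlhost, refr_urlhost, internal_domains):
    """Return True if this sketch would treat the referrer as Internal."""
    if not refr_urlhost:
        return False
    refr = refr_urlhost.lower()
    # Built-in rule: referrer host equals the page host, whatever the configuration says.
    if page_urlhost and refr == page_urlhost.lower():
        return True
    # Configured rule: referrer host is listed in internalDomains (exact match assumed).
    return refr in {domain.lower() for domain in internal_domains}
```

For example, `is_internal("docs.snowplow.io", "docs.snowplow.io", [])` is true in this sketch even though internalDomains is empty, which is the case the note warns about.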

database and uri

Snowplow CDI

If you're using Snowplow CDI, you don't need to configure these. Use the default values provided in Console.

Otherwise, specify the referer-parser database to use: database is the filename, and uri is the URI of the bucket that contains it. Snowplow hosts a database you can use; the latest version is listed in the library README. Alternatively, the enrichment accepts any valid JSON or YAML file that follows the referer-parser format.

Custom referrer mappings

Availability

This feature has been available since version 6.9.0 of Enrich.

You can add your own referrer-to-category mappings directly in the enrichment configuration using the optional referrers parameter. This is useful when you need to classify new traffic sources - such as internal tools, niche search engines, or AI chatbots - without waiting for changes to the upstream database.

Custom mappings take precedence over the default database. If a domain appears in both your custom mappings and the default database, the custom mapping is used.
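
One way to picture this precedence is a dictionary merge in which custom entries are applied last. The sketch below is illustrative only: `flatten`, `default_db`, and the "Acme" entries are hypothetical stand-ins, not the contents of the real database or the way Enrich stores it.

```python
def flatten(mappings):
    """Turn {medium: {source: {"domains": [...]}}} into {domain: (medium, source)}."""
    index = {}
    for medium, sources in mappings.items():
        for source, spec in sources.items():
            for domain in spec["domains"]:
                index[domain.lower()] = (medium, source)
    return index

# Hypothetical entries standing in for the default database.
default_db = {
    "search": {"Acme Search": {"domains": ["search.example.com"]}},
    "social": {"Acme Social": {"domains": ["social.example.com"]}},
}
# Custom mappings, as they would appear under the referrers parameter.
custom = {"search": {"Corporate Search": {"domains": ["search.example.com"]}}}

# In a dict merge the later keys win, so custom entries override the defaults.
index = {**flatten(default_db), **flatten(custom)}
```

Here `search.example.com` resolves to the custom "Corporate Search" source, while `social.example.com` still falls back to the default entry.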

The referrers parameter is a nested object structured like this:

json
"referrers": {
  "<medium>": {
    "<source name>": {
      "domains": ["<domain1>", "<domain2>"],
      "parameters": ["<param1>"]
    }
  }
}

| Field | Description |
| --- | --- |
| <medium> | The referrer category, e.g. search, social, email. This value populates refr_medium. |
| <source name> | A human-readable name for the source, e.g. "Google", "Internal Search". This value populates refr_source. |
| domains | An array of hostnames to match against the referrer URL. At least one domain is required. |
| parameters | An optional array of URL query parameter names to extract search terms from. Matched values populate refr_term. |
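
Before pasting a referrers object into the enrichment editor, it can help to check it against the rules in the table. The following is a minimal sketch of such a check, assuming the object has already been loaded into a Python dict; `validate_referrers` is a hypothetical helper, not part of Enrich or referer-parser.

```python
def validate_referrers(referrers):
    """Return a list of problems found in a referrers object (empty if none)."""
    problems = []
    for medium, sources in referrers.items():
        for source, spec in sources.items():
            where = f"{medium} / {source}"
            domains = spec.get("domains")
            # domains is required and must hold at least one hostname.
            if not isinstance(domains, list) or not domains:
                problems.append(f"{where}: at least one domain is required")
            # parameters is optional, but must be an array when present.
            if "parameters" in spec and not isinstance(spec["parameters"], list):
                problems.append(f"{where}: parameters must be an array")
    return problems
```

An empty list means the structure matches the rules above; it says nothing about whether the domains themselves are correct.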

For example, to classify a custom search engine and a social network:

json
"referrers": {
  "search": {
    "Corporate Search": {
      "domains": ["search.example.com"],
      "parameters": ["q", "query"]
    }
  },
  "social": {
    "Internal Forum": {
      "domains": ["forum.example.com"]
    }
  }
}

With this configuration, a referrer URL of https://search.example.com/?q=snowplow would produce the following:

| Field | Value |
| --- | --- |
| refr_medium | search |
| refr_source | Corporate Search |
| refr_term | snowplow |
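
The lookup behind this result can be approximated in a few lines. This is a sketch under stated assumptions, not the referer-parser algorithm: `classify` is a hypothetical function, hosts are compared by exact match, and the first configured parameter found in the query string supplies the term.

```python
from urllib.parse import urlparse, parse_qs

# The example configuration from this section, as a Python dict.
REFERRERS = {
    "search": {
        "Corporate Search": {
            "domains": ["search.example.com"],
            "parameters": ["q", "query"],
        }
    },
    "social": {"Internal Forum": {"domains": ["forum.example.com"]}},
}

def classify(referrer_url, referrers=REFERRERS):
    """Return refr_medium, refr_source and refr_term for a referrer URL, or None."""
    parsed = urlparse(referrer_url)
    host = (parsed.hostname or "").lower()
    query = parse_qs(parsed.query)
    for medium, sources in referrers.items():
        for source, spec in sources.items():
            if host not in spec["domains"]:
                continue
            # The first configured parameter present in the query supplies the term.
            term = next(
                (query[p][0] for p in spec.get("parameters", []) if p in query),
                None,
            )
            return {"refr_medium": medium, "refr_source": source, "refr_term": term}
    return None  # no custom mapping matched this host

result = classify("https://search.example.com/?q=snowplow")
```

In this sketch, a source without parameters, such as the forum entry, returns a refr_term of `None`, and a host with no mapping returns `None` overall.
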
Contributing mappings upstream

You can use custom referrer mappings to immediately test new categorizations in your pipeline. Once validated, consider contributing your mappings back to the upstream referer-parser database via a pull request.

Output

This enrichment populates the refr_medium, refr_source, and refr_term atomic event fields.
