ASN lookup enrichment

Availability

This enrichment has been available since version 6.9.0 of Enrich.

This enrichment checks the autonomous system number (ASN) attached to an event against a configurable list of ASNs associated with bots, cloud providers, or abusive networks. When a match is found, the enrichment sets likelyBot to true on the ASN entity added by the IP lookup enrichment.

This is useful for automatically flagging non-human traffic. Many bots and scrapers originate from well-known cloud hosting or data center ASNs, and community-maintained lists such as cpuchain/bad-asn-list track these.

VPN users

Many VPN services also run on cloud hosting, so this enrichment might incorrectly flag VPN users as bots. That is why the field is named likelyBot rather than bot.

Depending on the nature of your business, VPN users might represent a meaningful portion of your traffic. If that is the case, do not rely on this enrichment as the sole indicator of bots. Instead, use it to reinforce other indicators, such as an unusually high number of page views in a short timeframe.
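
As a rough illustration of combining the flag with a second signal, here is a minimal Python sketch. The function name, the per-session page view counter, and the threshold of 60 views per minute are all hypothetical and not part of the enrichment; tune them to your own traffic.

```python
def is_suspected_bot(likely_bot: bool, page_views_last_minute: int,
                     max_human_page_views: int = 60) -> bool:
    """Flag a session only when both signals agree.

    likely_bot: value of likelyBot from the ASN entity (treat a missing field as False).
    page_views_last_minute: hypothetical per-session counter computed downstream.
    max_human_page_views: hypothetical threshold, not a Snowplow default.
    """
    return likely_bot and page_views_last_minute > max_human_page_views
```

With this approach, a VPN user browsing at a normal pace is not flagged, even though their ASN is on the list.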

Prerequisites

To use this enrichment, you need to enable the IP lookup enrichment and configure it to produce ASN data.

If you are using the paid MaxMind database (isp field present in your IP lookup configuration), you don't need to do anything else.

If you are using the free MaxMind database, e.g., the one provided by Snowplow for CDI customers, your configuration will look like this:

```json
"geo": {
  "database": "GeoLite2-City.mmdb",
  "uri": "<database URI>"
}
```

Add the asn field:

```json
"geo": {
  "database": "GeoLite2-City.mmdb",
  "uri": "<database URI>"
},
"asn": {
  "database": "GeoLite2-ASN.mmdb",
  "uri": "<same URI value as for geo>"
}
```

Configuration

The enrichment takes these parameters:

| Parameter | Required | Description |
| --- | --- | --- |
| botAsnsFile | | Location of a CSV file listing ASNs to flag as likely bots. Already provided for CDI customers. |
| botAsns | | Inline list of ASNs to flag, merged with any entries from botAsnsFile. |
| bypassPlatforms | | Event platforms for which the enrichment should not run (e.g., server-side, IoT). |

Configure the parameters in the Console enrichment editor. Keep the Console defaults for the database and uri fields. For example:

```json
{
  "botAsnsFile": {
    "uri": "<use default value from Console>",
    "database": "<use default value from Console>"
  },
  "botAsns": [
    { "asn": 123, "name": "ASN 123" },
    { "asn": 456 }
  ],
  "bypassPlatforms": ["srv"]
}
```

Testing with Micro

Unsure if your enrichment configuration is correct or works as expected? You can easily test it using Snowplow Micro, either through Console or on your machine.

botAsnsFile

Snowplow CDI

If you're using Snowplow CDI, you don't need to configure this. Use the default values provided in Console.

Points to a CSV file listing ASNs. The file should have a header row and use the format number,name (e.g., 174,"COGENT-174 - Cogent Communications, US"). Only the number column is used for matching; the name column is for human readability and can be empty.

| Field | Type | Description |
| --- | --- | --- |
| uri | string | Base URI where the file is hosted. Supports http:, s3:, and gs: schemes. Must not end with a trailing slash. |
| database | string | The CSV filename. |

You can use a community-maintained list such as cpuchain/bad-asn-list. Host the CSV file in your own cloud storage to avoid depending on an external service at pipeline runtime.
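
To sanity-check a list before hosting it, you could read it the way the format above describes. The following Python sketch is illustrative only and is not how Enrich itself loads the file; the function name parse_bot_asns and the sample data are assumptions.

```python
import csv
import io


def parse_bot_asns(csv_text: str) -> set[int]:
    """Return the set of ASNs found in the first column of a number,name CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    asns = set()
    for row in reader:
        if not row or not row[0].strip():
            continue  # ignore blank lines
        asns.add(int(row[0]))  # the name column is ignored for matching
    return asns


# Hypothetical sample: one quoted name containing a comma, one empty name.
sample = 'number,name\n174,"COGENT-174 - Cogent Communications, US"\n16509,\n'
print(parse_bot_asns(sample))
```

Using a CSV parser rather than splitting on commas matters here, because names such as "COGENT-174 - Cogent Communications, US" contain commas inside quotes.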

botAsns

An inline array of ASN objects. These are combined with any entries from botAsnsFile.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| asn | integer | | The autonomous system number. |
| name | string | | A human-readable label. Used only for clarity in the configuration file. |
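
Conceptually, the combined list behaves like a set union of the two sources. A minimal Python sketch, assuming the file has already been reduced to a set of numbers (merge_bot_asns is a hypothetical helper, not an Enrich function):

```python
def merge_bot_asns(file_asns, inline_entries):
    """Combine ASNs loaded from botAsnsFile with the inline botAsns entries."""
    # Only the "asn" field matters for matching; "name" is a label.
    inline_asns = {entry["asn"] for entry in inline_entries}
    return set(file_asns) | inline_asns


bot_asns = merge_bot_asns(
    {174},  # e.g., numbers read from the CSV file
    [{"asn": 123, "name": "ASN 123"}, {"asn": 456}],
)
```

An ASN that appears in both sources is simply counted once.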

bypassPlatforms

An array of values that correspond to the platform field on events. Events with a matching platform skip this enrichment entirely, because traffic from those platforms is expected to originate from cloud or data center ASNs.

For example, server-side tracking ("srv") and IoT ("iot") events typically come from cloud providers, so flagging them as bots would produce false positives.
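
The effect of this setting can be pictured as a simple membership test on the event's platform. The Python sketch below is an illustration under that assumption; is_bypassed and the sample events are hypothetical.

```python
def is_bypassed(platform, bypass_platforms):
    """Return True when the ASN check should be skipped for this platform."""
    return platform in bypass_platforms


# Hypothetical events: only the platform field matters for this check.
events = [{"platform": "web"}, {"platform": "srv"}, {"platform": "iot"}]
checked = [
    event["platform"]
    for event in events
    if not is_bypassed(event["platform"], ["srv", "iot"])
]
```

Here only the "web" event would go through the ASN check.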

Output

The IP lookup enrichment adds an ASN entity to events where ASN information is available. This enrichment modifies that entity by setting likelyBot to true when the ASN matches one from the configured list of bad ASNs.

This enrichment won't produce any output if:

  • The IP lookup enrichment is not enabled
  • ASN data is not enabled in the IP lookup enrichment configuration
  • An event does not contain an IP address
  • There is no ASN information for that IP address
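
The cases above all come down to the ASN entity being absent, so there is nothing to modify. A minimal Python sketch of that behavior follows. The function flag_asn_entity is hypothetical, and leaving a non-matching entity unchanged is an assumption, since this page only describes what happens on a match.

```python
from typing import Optional


def flag_asn_entity(asn_entity: Optional[dict], bot_asns: set) -> Optional[dict]:
    """Return the ASN entity, with likelyBot set to True if its number is listed."""
    if asn_entity is None:
        return None  # no ASN entity from IP lookup, so nothing to modify
    if asn_entity["number"] in bot_asns:
        return {**asn_entity, "likelyBot": True}
    return asn_entity  # assumption: non-matching entities are left as they are
```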

asn

Entity

Schema for ASN entity generated by IP lookup enrichment

Schema URI: iglu:com.snowplowanalytics.snowplow/asn/jsonschema/1-0-1

Example data

```json
{
  "number": 16509,
  "organization": "Amazon.com, Inc.",
  "likelyBot": true
}
```

Properties and schema

| Property | Type | Description |
| --- | --- | --- |
| number | integer | Required. The autonomous system number associated with the IP address |
| organization | string | Optional. The organization associated with the registered autonomous system number for the IP address |
| likelyBot | boolean | Optional. Set to true if the ASN belongs to hosting providers, data centers, etc. |

The likelyBot field isn't included in the entity if the event's platform is in bypassPlatforms.
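
Because likelyBot is optional and the entity itself may be missing, downstream code should treat both cases as "not flagged". A small Python sketch, assuming the entity has been loaded as a dict (is_likely_bot is a hypothetical helper):

```python
def is_likely_bot(asn_entity):
    """Read likelyBot defensively: a missing entity or field counts as False."""
    if asn_entity is None:
        return False
    return asn_entity.get("likelyBot", False) is True
```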
