ASN lookup enrichment

Availability

This enrichment has been available since Enrich version 6.9.0.

This enrichment checks the autonomous system number (ASN) attached to an event against a configurable list of ASNs associated with bots, cloud providers, or abusive networks. When a match is found, the enrichment sets likelyBot to true on the ASN entity added by the IP lookup enrichment.

This is useful for automatically flagging non-human traffic. Many bots and scrapers originate from well-known cloud hosting or data center ASNs, and community-maintained lists such as cpuchain/bad-asn-list track these.

Prerequisite

This enrichment requires the IP lookup enrichment to be enabled with ASN data (either the free GeoLite2 ASN database or the paid GeoIP2 ISP database). It runs immediately after IP lookup and reads the ASN entity that IP lookup produces.

Configuration

Testing with Micro

Unsure if your enrichment configuration is correct or works as expected? You can easily test it using Snowplow Micro, either through Console or on your machine.

The enrichment accepts three optional parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| botAsnsFile | object | A reference to a remotely hosted CSV file containing bad ASNs. |
| botAsns | array | An inline list of ASN entries to treat as bots. |
| bypassPlatforms | array of strings | Platform codes for which the enrichment is skipped. |

botAsnsFile

Points to a CSV file of ASNs. The file should have a header row and use the format number,name (e.g., 174,"COGENT-174 - Cogent Communications, US"). Only the number column is used for matching; the name column is there for human readability and can be empty.

| Field | Type | Description |
| --- | --- | --- |
| uri | string | Base URI where the file is hosted. Supports http:, s3:, and gs: schemes. Must not end with a trailing slash. |
| database | string | The CSV filename. |
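
To make the number,name format concrete, here is a minimal Python sketch of how such a file could be read. It is an illustration only, not the enrichment's actual parser: the function name parse_bot_asns is hypothetical, and raising an error on a non-numeric first column is an assumption of this sketch.

```python
import csv
import io


def parse_bot_asns(csv_text: str) -> set:
    """Collect the ASNs from the first column of a number,name CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    asns = set()
    for row in reader:
        if not row or not row[0].strip():
            continue  # ignore blank lines
        # Only the number column matters; the name column may be empty.
        # Assumption of this sketch: int() raises on malformed numbers.
        asns.add(int(row[0].strip()))
    return asns


sample = 'number,name\n174,"COGENT-174 - Cogent Communications, US"\n16509,\n'
print(sorted(parse_bot_asns(sample)))  # [174, 16509]
```

The csv module handles the quoted name, so the comma inside "Cogent Communications, US" does not split the row.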

botAsns

An inline array of ASN objects. These are combined with any entries from botAsnsFile.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| asn | integer | Yes | The autonomous system number. |
| name | string | No | A human-readable label. Used only for clarity in the configuration file. |
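
As a rough illustration of how the inline entries combine with the file entries, the Python sketch below takes the union of both sources. The function name combine_bot_asns is hypothetical and the real enrichment may build its lookup differently; the point is only that an ASN present in either source ends up in the bot list.

```python
def combine_bot_asns(file_asns: set, inline_entries: list) -> set:
    """Union the ASNs loaded from botAsnsFile with the inline botAsns entries."""
    # The optional "name" field is ignored: it is only a label for readers.
    inline_asns = {entry["asn"] for entry in inline_entries}
    return file_asns | inline_asns


inline_entries = [{"asn": 123, "name": "ASN 123"}, {"asn": 456}]
combined = combine_bot_asns({174, 123}, inline_entries)
print(sorted(combined))  # [123, 174, 456]
```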

bypassPlatforms

An array of values that correspond to the platform field on events. Events with a matching platform skip this enrichment entirely, because traffic from those platforms is expected to originate from cloud or data center ASNs.

For example, server-side tracking ("srv") and IoT ("iot") events typically come from cloud providers, so flagging them as bots would produce false positives.
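
The bypass check can be pictured as a simple membership test on the event's platform. The Python sketch below uses the "srv" and "iot" values from the example above; the function name enrichment_applies is hypothetical, and "web" and "mob" stand in for platforms that are not bypassed.

```python
BYPASS_PLATFORMS = {"srv", "iot"}  # the example values from the text above


def enrichment_applies(platform: str, bypass_platforms: set = BYPASS_PLATFORMS) -> bool:
    """Return False when the event's platform is configured to skip the enrichment."""
    return platform not in bypass_platforms


checked = [p for p in ["web", "mob", "srv", "iot"] if enrichment_applies(p)]
print(checked)  # ['web', 'mob']
```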

Example configuration

json
{
  "schema": "iglu:com.snowplowanalytics.snowplow/asn_lookups/jsonschema/1-0-0",
  "data": {
    "name": "asn_lookups",
    "vendor": "com.snowplowanalytics.snowplow.enrichments",
    "enabled": true,
    "parameters": {
      "botAsnsFile": {
        "uri": "s3://my-private-bucket/third-party/bad-asn",
        "database": "bad-asn-list.csv"
      },
      "botAsns": [
        {
          "asn": 123,
          "name": "ASN 123"
        },
        {
          "asn": 456
        }
      ],
      "bypassPlatforms": ["srv", "iot"]
    }
  }
}
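
Before deploying a configuration like this, a few of the rules stated in the parameter descriptions can be checked locally. The Python sketch below is a hypothetical helper, not part of Enrich or Snowplow Micro: it only checks that uri has no trailing slash, that every botAsns entry has an integer asn, and that bypassPlatforms contains strings. It does not validate against the Iglu schema.

```python
import json


def check_asn_lookups_parameters(parameters: dict) -> list:
    """Return human-readable problems found in the enrichment parameters."""
    problems = []
    file_ref = parameters.get("botAsnsFile")
    if file_ref is not None and str(file_ref.get("uri", "")).endswith("/"):
        problems.append("botAsnsFile.uri must not end with a trailing slash")
    for index, entry in enumerate(parameters.get("botAsns", [])):
        asn = entry.get("asn")
        # bool is a subclass of int in Python, so exclude it explicitly
        if not isinstance(asn, int) or isinstance(asn, bool):
            problems.append(f"botAsns[{index}].asn must be an integer")
    for index, platform in enumerate(parameters.get("bypassPlatforms", [])):
        if not isinstance(platform, str):
            problems.append(f"bypassPlatforms[{index}] must be a string")
    return problems


# A deliberately broken configuration: trailing slash and a missing asn.
config = json.loads("""
{
  "data": {
    "parameters": {
      "botAsnsFile": {
        "uri": "s3://my-private-bucket/third-party/bad-asn/",
        "database": "bad-asn-list.csv"
      },
      "botAsns": [{"asn": 123, "name": "ASN 123"}, {"name": "missing number"}],
      "bypassPlatforms": ["srv", "iot"]
    }
  }
}
""")
problems = check_asn_lookups_parameters(config["data"]["parameters"])
for problem in problems:
    print(problem)
```

An empty list means none of these particular checks failed; it does not guarantee that the pipeline will accept the configuration.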
Hosting the bot ASN list

You can use the community-maintained cpuchain/bad-asn-list as a starting point for botAsnsFile. Host the CSV file in your own cloud storage to avoid depending on an external service at pipeline runtime.

The enrichment modifies the existing ASN entity to add a bot signal when a match is found.

Output

This enrichment modifies the ASN entity (iglu:com.snowplowanalytics.snowplow/asn/jsonschema/1-0-1) that the IP lookup enrichment attaches to events where ASN information is available. When the event's ASN matches an entry in the bot list, the enrichment sets likelyBot to true:

json
{
  "schema": "iglu:com.snowplowanalytics.snowplow/asn/jsonschema/1-0-1",
  "data": {
    "number": 16509,
    "organization": "Amazon.com, Inc.",
    "likelyBot": true
  }
}

If the ASN does not match any entry in the bot list, or the event's platform is in bypassPlatforms, the likelyBot field is not added to the entity.
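
The behavior described above can be sketched in Python as follows. This is an illustrative model, not the enrichment's source code: the function name apply_bot_signal is hypothetical, and it follows the preceding paragraph in leaving the entity untouched when there is no match or the platform is bypassed.

```python
def apply_bot_signal(asn_entity: dict, platform: str,
                     bot_asns: set, bypass_platforms: set) -> dict:
    """Return the ASN entity, with likelyBot set to true on a bot-list match."""
    if platform in bypass_platforms:
        return asn_entity  # enrichment skipped for this platform
    if asn_entity["data"]["number"] not in bot_asns:
        return asn_entity  # no match: the field is not added
    # Copy rather than mutate, so the caller's entity stays unchanged.
    return {**asn_entity, "data": {**asn_entity["data"], "likelyBot": True}}


entity = {
    "schema": "iglu:com.snowplowanalytics.snowplow/asn/jsonschema/1-0-1",
    "data": {"number": 16509, "organization": "Amazon.com, Inc."},
}
flagged = apply_bot_signal(entity, "web", {16509}, {"srv", "iot"})
bypassed = apply_bot_signal(entity, "srv", {16509}, {"srv", "iot"})
print(flagged["data"].get("likelyBot"))   # True
print("likelyBot" in bypassed["data"])    # False
```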

| Field | Type | Description |
| --- | --- | --- |
| likelyBot | boolean | Set to true or false for whether the ASN matches the bot list. Absent if the enrichment is skipped due to bypassPlatforms. |
