IP lookup enrichment
This enrichment uses MaxMind databases to look up useful data based on the IP address collected by your Snowplow tracker(s).
When a user browses your site or app, their IP address is collected. MaxMind maintains databases of additional data points publicly associated with a given IP address, such as geographic location, second-level domain name (e.g. acme.com), Internet Service Provider, and organization name.
The IP lookup enrichment uses these databases to add those data points to every event generated from a given IP address.
Some of the databases MaxMind maintains require a commercial subscription with MaxMind.
Select the MaxMind databases
MaxMind offers a free tier and a paid tier of databases, which can be used with Snowplow.
From the free tier you can provide two databases to Snowplow:
- GeoLite2 City Database, which contains geographic information (e.g. country) by IP address
- GeoLite2 ASN Database (supported starting with Enrich version 6.7.0), which contains autonomous system numbers by IP address
From the paid tier you can provide four databases to Snowplow:
- GeoIP2 City, which also contains geographic information, but with more precision and coverage than the GeoLite2 City Database
- GeoIP2 ISP, which contains information about the ISP serving that IP, and a more complete ASN mapping compared to the GeoLite2 ASN Database
- GeoIP2 Domain, which contains information about the domain at that IP address
- GeoIP2 Connection Type, which contains information about the connection type at that IP address.
Decide which of the MaxMind databases listed above you want to enrich your data with.
Host the databases in your cloud
If you use Snowplow CDI, the free-tier MaxMind database files are already provided and updated by Snowplow, so you don't need this step.
There is a pre-configured URI for the directory containing these files. You can find it in the default enrichment configuration in Console.
If you host the files yourself, download the .mmdb file(s) from MaxMind, then upload them to a location in your cloud storage:
- Amazon S3 (if running Snowplow on AWS), e.g. s3://my-private-bucket/third-party/maxmind
- Azure ADLS (if running Snowplow on Azure), e.g. https://my-private-storage-container.dfs.core.windows.net/third-party/maxmind
- Google Cloud Storage (if running Snowplow on GCP), e.g. gs://my-private-bucket/third-party/maxmind
When the database(s) need updating in the future, download the latest version and overwrite the existing file(s) in your storage.
MaxMind also offers a method to download and update its databases programmatically.
Configure the enrichment
Unsure if your enrichment configuration is correct or works as expected? You can easily test it using Snowplow Micro, either through Console or on your machine.
Note that to test this enrichment, you will need events with realistic IP addresses (not local ones like 192.168.0.42).
This is not an issue if running Micro through Console, but will be trickier if running locally.
If you are using a web browser to test your site or app, you can spoof a specific IP address using a browser plugin that sets an X-Forwarded-For header. For example, here are plugins for Chrome and Firefox. Install the plugin and set the IP address to your liking.
Alternatively, you can set up Micro to receive external IP addresses.
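As a rough illustration of the point about realistic IP addresses, the Python sketch below checks whether an address is publicly routable before using it in an X-Forwarded-For header. The function names `is_testable_ip` and `spoofed_headers` are hypothetical and not part of any Snowplow tool; the check relies only on the standard-library `ipaddress` module.

```python
import ipaddress


def is_testable_ip(ip: str) -> bool:
    """Return True if the address is globally routable (not private, loopback, etc.)."""
    return ipaddress.ip_address(ip).is_global


def spoofed_headers(ip: str) -> dict[str, str]:
    """Build request headers that present `ip` as the client address.

    Hypothetical helper: it only prepares the header, it does not send anything.
    """
    if not is_testable_ip(ip):
        raise ValueError(f"{ip} is not a public address, so a lookup is unlikely to return data")
    return {"X-Forwarded-For": ip}
```

You could pass the returned dictionary as the headers of whatever HTTP client you use to send test events to Micro.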
There are five possible fields you can add to the “parameters” section of the enrichment configuration JSON: “geo”, “isp”, “domain”, “connectionType”, and “asn”:
- The database field contains the name of the MaxMind database file.
- The uri field contains the URI of the bucket in which the database file is found. This can have either http: or s3: or gs: as the scheme and must not end with a trailing slash.
Allowed database filenames are as follows. If the file name you provide is not one of these, the enrichment JSON will fail validation.
| ENRICHMENT PARAMETER | VALID DATABASE NAMES |
|---|---|
| geo | "GeoLite2-City.mmdb" (free), "GeoIP2-City.mmdb" (paid) |
| isp | "GeoIP2-ISP.mmdb" |
| domain | "GeoIP2-Domain.mmdb" |
| connectionType | "GeoIP2-Connection-Type.mmdb" |
| asn | "GeoLite2-ASN.mmdb" |
For a full reference of the options, see the configuration schema.
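The rules above can be sketched as a small pre-flight check in Python. This is not the validation Snowplow performs (that is done against the configuration schema); `check_parameters` is a hypothetical helper that mirrors only the constraints stated on this page. In particular, the scheme list contains just the three schemes named above, so extend it if your deployment accepts others (for example the https: location shown for Azure).

```python
# Allowed file names per parameter, copied from the table above.
ALLOWED_DATABASES = {
    "geo": {"GeoLite2-City.mmdb", "GeoIP2-City.mmdb"},
    "isp": {"GeoIP2-ISP.mmdb"},
    "domain": {"GeoIP2-Domain.mmdb"},
    "connectionType": {"GeoIP2-Connection-Type.mmdb"},
    "asn": {"GeoLite2-ASN.mmdb"},
}

# Schemes named in the text above; an assumption that this list is complete.
ALLOWED_SCHEMES = ("http:", "s3:", "gs:")


def check_parameters(parameters: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for name, settings in parameters.items():
        if name not in ALLOWED_DATABASES:
            problems.append(f"{name}: unknown parameter")
            continue
        database = settings.get("database")
        uri = settings.get("uri", "")
        if database not in ALLOWED_DATABASES[name]:
            problems.append(f"{name}: database {database!r} is not allowed")
        if not uri.startswith(ALLOWED_SCHEMES):
            problems.append(f"{name}: uri {uri!r} has an unexpected scheme")
        if uri.endswith("/"):
            problems.append(f"{name}: uri must not end with a trailing slash")
    return problems
```

Running it over the "parameters" object before uploading the configuration catches typos in file names early.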
Example configurations
Note that you will need to change the uri values in these examples:
- If you are using Snowplow CDI and the free MaxMind databases, use the same default value you can see in Console for all uri fields (whether geo, asn, etc.).
- Otherwise, provide the location (e.g. s3://my-private-bucket/third-party/maxmind) where you uploaded the files.
You can remove the configuration keys for the databases you don't wish to use.
Free MaxMind tier
{
  "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/2-0-1",
  "data": {
    "name": "ip_lookups",
    "vendor": "com.snowplowanalytics.snowplow",
    "enabled": true,
    "parameters": {
      "geo": {
        "database": "GeoLite2-City.mmdb",
        "uri": "..."
      },
      "asn": {
        "database": "GeoLite2-ASN.mmdb",
        "uri": "..."
      }
    }
  }
}
Paid MaxMind tier
{
  "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/2-0-1",
  "data": {
    "name": "ip_lookups",
    "vendor": "com.snowplowanalytics.snowplow",
    "enabled": true,
    "parameters": {
      "geo": {
        "database": "GeoIP2-City.mmdb",
        "uri": "..."
      },
      "isp": {
        "database": "GeoIP2-ISP.mmdb",
        "uri": "..."
      },
      "domain": {
        "database": "GeoIP2-Domain.mmdb",
        "uri": "..."
      },
      "connectionType": {
        "database": "GeoIP2-Connection-Type.mmdb",
        "uri": "..."
      }
    }
  }
}
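If you maintain several environments, you may prefer to generate these configurations rather than edit them by hand. The following Python sketch builds the same JSON structure from a mapping of parameter names to database files and a single storage location. `build_ip_lookups_config`, `FREE_TIER`, and `PAID_TIER` are hypothetical names introduced here for illustration, and the example URI is the placeholder bucket used earlier on this page.

```python
import json

SCHEMA = "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/2-0-1"

# Parameter name -> database file, matching the two example configurations above.
FREE_TIER = {"geo": "GeoLite2-City.mmdb", "asn": "GeoLite2-ASN.mmdb"}
PAID_TIER = {
    "geo": "GeoIP2-City.mmdb",
    "isp": "GeoIP2-ISP.mmdb",
    "domain": "GeoIP2-Domain.mmdb",
    "connectionType": "GeoIP2-Connection-Type.mmdb",
}


def build_ip_lookups_config(databases: dict[str, str], uri: str, enabled: bool = True) -> dict:
    """Assemble the enrichment configuration, using the same uri for every database."""
    return {
        "schema": SCHEMA,
        "data": {
            "name": "ip_lookups",
            "vendor": "com.snowplowanalytics.snowplow",
            "enabled": enabled,
            "parameters": {
                key: {"database": filename, "uri": uri}
                for key, filename in databases.items()
            },
        },
    }


if __name__ == "__main__":
    config = build_ip_lookups_config(FREE_TIER, "s3://my-private-bucket/third-party/maxmind")
    print(json.dumps(config, indent=2))
```

To leave out a database you don't wish to use, drop its key from the mapping before calling the function.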
Output
This enrichment populates atomic table fields prefixed with geo_ and ip_.
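As a minimal sketch, assuming an enriched event is available to you as a flat Python dictionary keyed by atomic field name, the populated lookup fields can be picked out by prefix. The helper name `ip_lookup_fields` and the sample values are illustrative only.

```python
def ip_lookup_fields(event: dict) -> dict:
    """Return the non-empty fields whose names start with geo_ or ip_."""
    return {
        key: value
        for key, value in event.items()
        if key.startswith(("geo_", "ip_")) and value is not None
    }


# Illustrative event; the values are made up for this example.
sample_event = {
    "event_id": "example",
    "geo_country": "US",
    "geo_city": None,
    "ip_isp": "Example ISP",
}
print(ip_lookup_fields(sample_event))
```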
ASN data
Starting with Enrich 6.7.0, the enrichment supports ASN information, which is useful for detecting bot traffic coming from cloud computing providers. To enable this, you need to provide either the ISP or the free ASN database through the isp or asn setting respectively.
If ASN data is available for a given IP address, the enrichment adds a derived entity to the enriched event with the asn schema.
Here is an example of an entity attached by this enrichment:
{
  "schema": "iglu:com.snowplowanalytics.snowplow/asn/jsonschema/1-0-1",
  "data": {
    "number": 16509,
    "organization": "Amazon.com, Inc."
  }
}
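To illustrate the bot-detection use case, the sketch below flags events whose ASN entity matches a list of cloud-provider ASNs. It assumes the derived entities are available as a list of self-describing JSON dictionaries shaped like the example above; how they actually appear depends on your warehouse and loader. `CLOUD_PROVIDER_ASNS`, `asn_numbers`, and `looks_like_cloud_traffic` are hypothetical names, and the set contains only the ASN from the example, so you would need to supply your own list.

```python
ASN_SCHEMA_PREFIX = "iglu:com.snowplowanalytics.snowplow/asn/jsonschema/"

# Only the ASN from the example entity; replace with a list you maintain.
CLOUD_PROVIDER_ASNS = {16509}


def asn_numbers(derived_entities: list[dict]) -> list[int]:
    """Collect the ASN numbers from any asn entities attached to an event."""
    return [
        entity["data"]["number"]
        for entity in derived_entities
        if entity.get("schema", "").startswith(ASN_SCHEMA_PREFIX)
    ]


def looks_like_cloud_traffic(derived_entities: list[dict]) -> bool:
    """Return True if any attached ASN is in the cloud-provider list."""
    return any(number in CLOUD_PROVIDER_ASNS for number in asn_numbers(derived_entities))
```

A match is a signal rather than proof of bot traffic, so it is best combined with other indicators.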