Configuring enrichments
By default, Micro does not come with any enrichments enabled, which helps us keep the Docker image smaller. You can enable any enrichments you like by passing the corresponding configuration files to Micro.
Limitations for enrichments that rely on data files
Some enrichments require data files (e.g. a database of IPs).
The Enrich application in a full Snowplow pipeline will automatically download and periodically update these files. However, Micro will only download them once. You can always restart Micro to get a fresher copy of the files.
Also, the Enrich application supports files located in S3 and GCS with the s3:// and gs:// schemes respectively. Micro currently only supports http:// and https://. You can often rewrite the URL to make it work:
s3://my-bucket/x/y → https://my-bucket.s3.amazonaws.com/x/y
gs://my-bucket/x/y → https://storage.googleapis.com/my-bucket/x/y
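If you have several such URLs to convert, the two rewrites above can be sketched as a small shell function. This is only an illustration: to_https is a hypothetical helper name (not part of Micro), it assumes the URL contains both a bucket and an object path, and it assumes the object is actually reachable over HTTPS.

```shell
# Hypothetical helper: rewrite s3:// and gs:// URLs to the HTTPS forms
# shown above. Assumes the URL has the shape scheme://bucket/path.
to_https() {
  url="$1"
  case "$url" in
    s3://*)
      rest="${url#s3://}"     # e.g. my-bucket/x/y
      bucket="${rest%%/*}"    # e.g. my-bucket
      key="${rest#*/}"        # e.g. x/y
      printf 'https://%s.s3.amazonaws.com/%s\n' "$bucket" "$key"
      ;;
    gs://*)
      # For GCS the bucket stays in the path, so only the prefix changes.
      printf 'https://storage.googleapis.com/%s\n' "${url#gs://}"
      ;;
    *)
      # Anything else (e.g. http:// or https://) is passed through as-is.
      printf '%s\n' "$url"
      ;;
  esac
}
```

For instance, to_https s3://my-bucket/x/y prints https://my-bucket.s3.amazonaws.com/x/y, which you can then put in the enrichment configuration in place of the original URL.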
For example, let’s say that you want to configure the IP Lookup enrichment. The default configuration file looks like this:
loading...
Put this file somewhere on the machine where you are running Micro, let’s say my-enrichments/ip_lookups.json. (Feel free to add any other configurations to my-enrichments.)
Now you will need to pass this directory to the Docker container (using a bind mount):
docker run -p 9090:9090 \
  --mount type=bind,source="$(pwd)"/my-enrichments,destination=/config/enrichments \
  snowplow/snowplow-micro:2.1.3
The directory inside the container (what goes after destination=) must be exactly /config/enrichments.
Once Micro starts, you should see messages like these:
[INFO] com.snowplowanalytics.snowplow.micro.Run - Downloading http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind/GeoLite2-City.mmdb...
[INFO] com.snowplowanalytics.snowplow.micro.Run - Enabled enrichments: IpLookupsEnrichment
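If you would rather check this from a script than read the output by eye, one possible approach is to filter the log for the "Enabled enrichments" line. The sketch below assumes the message format matches the sample above; enabled_enrichments is a hypothetical helper name, not something Micro provides.

```shell
# Hypothetical helper: read Micro's log output on stdin and print
# whatever follows "Enabled enrichments: ". Assumes the log format
# shown in the sample messages above.
enabled_enrichments() {
  sed -n 's/.*Enabled enrichments: //p'
}

# Try it on the sample log line from above.
printf '%s\n' '[INFO] com.snowplowanalytics.snowplow.micro.Run - Enabled enrichments: IpLookupsEnrichment' \
  | enabled_enrichments
# prints: IpLookupsEnrichment
```

With a running container, you could pipe the output of docker logs (with 2>&1 so that both output streams are included) into the same function.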
Micro is especially great for testing the JavaScript enrichment.