Custom API request enrichment
The API request enrichment lets you add data to a Snowplow event via your own or third-party proprietary HTTP(S) API. Only basic access authentication is supported.
For example, use a common key like a user ID or an email address to add relevant information about a user to each event, before it gets written to your data store. The retrieved data is added as entities.
As with all enrichments, only one instance of it can be configured within your pipeline. This means you can only call one API during event processing.
Configuration
For historical reasons, the configuration uses terms that are no longer used elsewhere in Snowplow.
The enrichment takes these parameters:
| Parameter | Required | Description |
|---|---|---|
| inputs | ✅ | Specifies the data points from the Snowplow event to use as keys when performing your API lookup. |
| api | ✅ | Defines how the enrichment can access your API. |
| outputs | ✅ | Specifies how to process the returned JSON. |
| cache | ✅ | Improves the enrichment's performance by storing values retrieved from the API. |
| ignoreOnError | ❌ | Whether to let the event pass, without the added entities, when the enrichment fails. |
Configure the parameters in the Console enrichment editor. For example:
{
  "inputs": [
    {
      "key": "user",
      "json": {
        "field": "contexts",
        "schemaCriterion": "iglu:com.snowplowanalytics.snowplow/client_session/jsonschema/1-*-*",
        "jsonPath": "$.userId"
      }
    },
    {
      "key": "client",
      "pojo": {
        "field": "app_id"
      }
    }
  ],
  "api": {
    "http": {
      "method": "GET",
      "uri": "http://api.acme.com/users/{{client}}/{{user}}?format=json",
      "timeout": 2000,
      "authentication": {
        "httpBasic": {
          "username": "xxx",
          "password": "yyy"
        }
      }
    }
  },
  "outputs": [
    {
      "schema": "iglu:com.acme/user/jsonschema/1-0-0",
      "json": {
        "jsonPath": "$.record"
      }
    }
  ],
  "cache": {
    "size": 3000,
    "ttl": 60
  },
  "ignoreOnError": false
}
For Self-Hosted, provide a complete JSON. For example:
{
  "schema": "iglu:com.snowplowanalytics.snowplow.enrichments/api_request_enrichment_config/jsonschema/1-0-2",
  "data": {
    "vendor": "com.snowplowanalytics.snowplow.enrichments",
    "name": "api_request_enrichment_config",
    "enabled": false,
    "parameters": {
      "inputs": [
        {
          "key": "user",
          "json": {
            "field": "contexts",
            "schemaCriterion": "iglu:com.snowplowanalytics.snowplow/client_session/jsonschema/1-*-*",
            "jsonPath": "$.userId"
          }
        },
        {
          "key": "client",
          "pojo": {
            "field": "app_id"
          }
        }
      ],
      "api": {
        "http": {
          "method": "GET",
          "uri": "http://api.acme.com/users/{{client}}/{{user}}?format=json",
          "timeout": 2000,
          "authentication": {
            "httpBasic": {
              "username": "xxx",
              "password": "yyy"
            }
          }
        }
      },
      "outputs": [
        {
          "schema": "iglu:com.acme/user/jsonschema/1-0-0",
          "json": {
            "jsonPath": "$.record"
          }
        }
      ],
      "cache": {
        "size": 3000,
        "ttl": 60
      },
      "ignoreOnError": false
    }
  }
}
Unsure if your enrichment configuration is correct or works as expected? You can easily test it using Snowplow Micro, either through Console or on your machine.
inputs
The enrichment can use any property in the event as input data source. The input data can be extracted from:
- Atomic event properties such as user_id
- Self-describing event fields
- Entities attached by tracker SDKs
- Entities attached by other enrichments
The custom API enrichment runs after most other enrichments, so it can access data added by them. Only the IP anonymization and PII pseudonymization enrichments run after the custom API enrichment.
Specify an array of inputs to use as keys when performing your API lookup. Each input consists of a key and a source: pojo for atomic event fields, or json for JSON fields, whether event or entity.
Key names can contain only alphanumeric symbols, hyphens, and underscores.
For json, specify the field name as either unstruct_event for self-describing event fields, contexts for fields in entities added during tracking, or derived_contexts for fields in enrichment entities. Add two additional fields:
- schemaCriterion is the Iglu schema URI. You can specify all versions of the schema (*-*-*), a specific major version (e.g. 1-*-*), major plus minor (e.g. 1-1-*), or a full major-minor-patch version (e.g. 1-1-1).
- jsonPath is the JSON Path statement to navigate to the field inside the JSON that you want to use as the input.
The resolved values should be primitive types (string, number, or boolean).
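The version wildcards in schemaCriterion and the key-name rule can be illustrated with a short Python sketch. This is not the enrichment's own code: the function names matches_criterion and is_valid_key are hypothetical, and the matching logic is an assumption based only on the description above.

```python
import re

# Key names: alphanumeric characters, hyphens, and underscores only.
KEY_PATTERN = re.compile(r"[A-Za-z0-9_-]+")

# Splits an Iglu URI into vendor/name/format and its three version parts.
IGLU_PATTERN = re.compile(
    r"iglu:(?P<prefix>[^/]+/[^/]+/[^/]+)/"
    r"(?P<model>[^-]+)-(?P<revision>[^-]+)-(?P<addition>[^-]+)"
)


def is_valid_key(key: str) -> bool:
    """Check a key name against the allowed character set."""
    return KEY_PATTERN.fullmatch(key) is not None


def matches_criterion(criterion: str, schema_uri: str) -> bool:
    """Return True if schema_uri satisfies the criterion, where '*' matches any version part."""
    c = IGLU_PATTERN.fullmatch(criterion)
    s = IGLU_PATTERN.fullmatch(schema_uri)
    if c is None or s is None:
        return False
    if c.group("prefix") != s.group("prefix"):
        return False
    return all(
        c.group(part) in ("*", s.group(part))
        for part in ("model", "revision", "addition")
    )
```

With this sketch, the criterion 1-*-* from the example configuration accepts a client_session entity at version 1-0-2 but rejects one at 2-0-0.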
api
Configure the API access with api. The enrichment supports GET, POST, and PUT methods, and both HTTP and HTTPS protocols.
For the uri field, specify the full URI, including the protocol. You can attach a querystring to the end of the URI. You can also embed the keys from your inputs section in the URI, by wrapping the key in {{}} brackets:
"uri": "http://api.acme.com/users/{{client}}/{{user}}?format=json"
If a key required in the uri wasn't found in any of the inputs, then the lookup won't proceed, but this will not be flagged as a failure.
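The placeholder substitution described above can be sketched in Python as follows. The function name build_uri is hypothetical, and percent-encoding the values before insertion is an assumption of this sketch rather than documented behavior.

```python
import re
from typing import Mapping, Optional
from urllib.parse import quote

# Matches {{key}} placeholders whose names use the allowed key characters.
PLACEHOLDER = re.compile(r"\{\{([A-Za-z0-9_-]+)\}\}")


def build_uri(template: str, inputs: Mapping[str, object]) -> Optional[str]:
    """Fill {{key}} placeholders; return None if any required key is missing."""
    required = PLACEHOLDER.findall(template)
    if any(key not in inputs for key in required):
        # The lookup is skipped; this is not treated as a failure.
        return None

    def substitute(match):
        # Assumption: values are percent-encoded before being inserted.
        return quote(str(inputs[match.group(1)]), safe="")

    return PLACEHOLDER.sub(substitute, template)
```

For example, build_uri with the template from the configuration and the inputs client = "northwind-traders" and user = 123 yields http://api.acme.com/users/northwind-traders/123?format=json, while omitting user returns None.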
The only supported authentication option is httpBasic. Provide a username and/or password for the enrichment to connect to your API using basic access authentication. Some APIs use only the username or password field to contain an API key; in this case, set the other property to an empty string "".
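For reference, basic access authentication sends the username and password joined by a colon and Base64-encoded in the Authorization header. The Python sketch below shows that standard format, including the API-key case with an empty password. The helper name basic_auth_header is hypothetical, and how the enrichment builds the header internally is not covered here.

```python
import base64


def basic_auth_header(username: str = "", password: str = "") -> str:
    """Build the value of an HTTP Authorization header for basic access authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Username and password, as in the example configuration.
full_header = basic_auth_header("xxx", "yyy")

# API key carried in the username field, with the password set to an empty string.
api_key_header = basic_auth_header("my-api-key", "")
```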
If your API is unsecured (for example, because it's only accessible from inside your private subnet, or because you're using an IP address allowlist) then configure the authentication section like this:
"authentication": { }
outputs
This enrichment assumes that your API returns a JSON object containing one or more properties that you want to add to your event. You'll need to specify the schema or data structure that the enrichment should use to define the retrieved data.
Each entry in the outputs array needs two fields:
- schema to specify the schema URI you want to attach to the event.
- json and jsonPath to specify which part of the returned JSON you want to add to the enriched event. Use $ if you want to attach the returned JSON as is.
The outputs array must have at least one entry in it.
If the JSON path specified can't be found within the API response, then the lookup and the event will be flagged as a failure - unless ignoreOnError is set to true.
cache
An enrichment can run many millions of times per hour, effectively launching a DoS attack on a data source. The cache configuration attempts to minimize the number of lookups performed.
The cache is an LRU (least recently used) cache, where the least recently accessed values are evicted to make space for new values. The uri with all keys populated is used as the key in the cache. Configure the cache as follows:
- size is the maximum number of entries to hold in the cache at any one time. The minimum value is 1.
- ttl is the number of seconds that an entry can stay in the cache before it is forcibly evicted. This is useful to prevent stale values from being retrieved in the case that your API can return different values for the same key over time.
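To make the interaction between size and ttl concrete, here is a minimal Python sketch of an LRU cache with a time-to-live, keyed by the populated URI. The class name LruTtlCache and the injectable clock are illustrative assumptions. The enrichment's actual cache implementation may differ.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class LruTtlCache:
    """LRU cache whose entries also expire after ttl seconds."""

    def __init__(self, size: int, ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.size = size
        self.ttl = ttl
        self.clock = clock
        self._entries = OrderedDict()  # uri -> (stored_at, value)

    def get(self, uri: str) -> Optional[object]:
        entry = self._entries.get(uri)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.ttl:
            # Entry is stale: evict it so the next lookup calls the API again.
            del self._entries[uri]
            return None
        # Mark as most recently used.
        self._entries.move_to_end(uri)
        return value

    def put(self, uri: str, value: object) -> None:
        self._entries[uri] = (self.clock(), value)
        self._entries.move_to_end(uri)
        while len(self._entries) > self.size:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
```

A larger size reduces the number of API calls at the cost of memory, while a shorter ttl trades more API calls for fresher data.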
ignoreOnError
When set to true, if the enrichment fails for any reason, the event is still considered successfully enriched. It'll be loaded as usual, except without the entities added by the enrichment.
When set to false, the event will become a failed event if the API call fails.
Edge case handling
This enrichment can use any third-party RESTful service to fetch data in JSON format. In most cases, we recommend using your own private server to maintain performance. Third-party services could cause slowdown of your enrichment process.
This table describes what will happen under different conditions:
| Scenario | Outcome |
|---|---|
| A provided JSONPath is invalid. | Failed event, unless ignoreOnError is set to true. |
| Any one of the input keys wasn't found. | The HTTP request won't be sent, and no entities will be added. The event will otherwise be processed as usual - not failed. |
| More than one entity in the event matches the schemaCriterion. | The first matching entity found will be used. |
| Multiple inputs share the same key (don't do this). | The last input configured will be picked. |
| An input JSONPath matches a non-primitive value. | The enrichment will try to stringify it, likely resulting in an invalid URL. If so, it will cause a failed event, unless ignoreOnError is set to true. |
| The output's JSONPath wasn't found. | Failed event, unless ignoreOnError is set to true. |
| The response returned JSON which is not valid according to the output schema. | Failed event, unless ignoreOnError is set to true. |
| The server returned any non-successful response or timed out. | Failed event, unless ignoreOnError is set to true. |
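The failure-related rows of the table can be summarized as a small decision function. The Python sketch below is only a restatement of the table: the Scenario and Outcome names are hypothetical, a SUCCESS case is added for completeness, and the rows about duplicate keys, multiple matching entities, and non-primitive inputs are left out.

```python
from enum import Enum


class Scenario(Enum):
    SUCCESS = "lookup succeeded and the output validated"
    INPUT_KEY_MISSING = "an input key was not found"
    INVALID_JSONPATH = "a provided JSONPath is invalid"
    OUTPUT_PATH_NOT_FOUND = "the output JSONPath was not found"
    SCHEMA_VALIDATION_FAILED = "the returned JSON is invalid against the output schema"
    HTTP_ERROR_OR_TIMEOUT = "non-successful response or timeout"


class Outcome(Enum):
    ENTITIES_ADDED = "event enriched with the new entities"
    NO_ENTITIES_ADDED = "event processed as usual, without new entities"
    FAILED_EVENT = "event becomes a failed event"


def resolve_outcome(scenario: Scenario, ignore_on_error: bool) -> Outcome:
    if scenario is Scenario.SUCCESS:
        return Outcome.ENTITIES_ADDED
    if scenario is Scenario.INPUT_KEY_MISSING:
        # No HTTP request is sent; this is never treated as a failure.
        return Outcome.NO_ENTITIES_ADDED
    # Every other scenario is an error that ignoreOnError can suppress.
    return Outcome.NO_ENTITIES_ADDED if ignore_on_error else Outcome.FAILED_EVENT
```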
Output
This enrichment adds entities based on your configuration.
Example API response:
// GET http://api.acme.com/users/northwind-traders/123?format=json
{
  "metadata": {
    "whenCreated": 1448371243,
    "whenUpdated": 1448373431
  },
  "record": {
    "name": "Bob Thorpe",
    "id": "123"
  }
}
With this configuration:
"outputs": [
  {
    "schema": "iglu:com.acme/user/jsonschema/1-0-0",
    "json": {
      "jsonPath": "$.record"
    }
  }
]
The enrichment will add this entity to your event:
{
  "schema": "iglu:com.acme/user/jsonschema/1-0-0",
  "data": {
    "name": "Bob Thorpe",
    "id": "123"
  }
}
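The transformation from API response to entity can be reproduced with a short Python sketch. It is a simplification under stated assumptions: resolve_simple_path and build_entity are hypothetical helpers, and the resolver handles only $ and plain dotted paths such as $.record, not the full JSONPath syntax.

```python
import json
from typing import Any


def resolve_simple_path(document: Any, json_path: str) -> Any:
    """Resolve '$' or a dotted path such as '$.record'; raise KeyError if absent."""
    if json_path == "$":
        return document
    if not json_path.startswith("$."):
        raise ValueError(f"unsupported path in this sketch: {json_path}")
    current = document
    for part in json_path[2:].split("."):
        if not isinstance(current, dict) or part not in current:
            raise KeyError(json_path)
        current = current[part]
    return current


def build_entity(response_body: str, output: dict) -> dict:
    """Wrap the selected part of the response in a self-describing entity."""
    data = resolve_simple_path(json.loads(response_body), output["json"]["jsonPath"])
    return {"schema": output["schema"], "data": data}


response_body = """
{
  "metadata": {"whenCreated": 1448371243, "whenUpdated": 1448373431},
  "record": {"name": "Bob Thorpe", "id": "123"}
}
"""
output = {
    "schema": "iglu:com.acme/user/jsonschema/1-0-0",
    "json": {"jsonPath": "$.record"},
}
entity = build_entity(response_body, output)
```

Here entity holds the same schema and data as the entity shown above. Changing jsonPath to $ would attach the whole response, including metadata, as the entity data.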