Manage schemas using Iglu Server
To manage your schemas, you will need to have an Iglu Server installed (you will already have one if you followed the Snowplow Self-Hosted Quick Start).
Alternatively, you can host a static Iglu registry in Amazon S3 or Google Cloud Storage.
Create a schema
First, design the schema for your custom event (or entity). For example:
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for a button click event",
  "self": {
    "vendor": "com.snowplowanalytics",
    "name": "button_click",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "minLength": 1
    },
    "target": {
      "type": "string"
    },
    "content": {
      "type": "string"
    }
  },
  "required": ["id"],
  "additionalProperties": false
}
Next, save this schema in the following folder structure, with a filename of 1-0-0 (without any extension):
schemas
└── com.snowplowanalytics
    └── button_click
        └── jsonschema
            └── 1-0-0
If you update the vendor or the name in the example, you should update the above path too.
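As a minimal sketch of how the path follows from the schema itself, the snippet below derives the folder structure from the "self" block and writes the file without an extension. The helper names schema_path and save_schema are hypothetical, chosen for illustration, and the schema is trimmed down to the fields that matter for the path.

```python
import json
import tempfile
from pathlib import Path


def schema_path(root: Path, schema: dict) -> Path:
    """Build <root>/<vendor>/<name>/<format>/<version> from the "self" block."""
    meta = schema["self"]
    return root / meta["vendor"] / meta["name"] / meta["format"] / meta["version"]


def save_schema(root: Path, schema: dict) -> Path:
    """Write the schema to its registry path (the filename has no extension)."""
    target = schema_path(root, schema)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(schema, indent=2) + "\n", encoding="utf-8")
    return target


# Trimmed-down version of the example schema; only "self" drives the path.
example = {
    "self": {
        "vendor": "com.snowplowanalytics",
        "name": "button_click",
        "format": "jsonschema",
        "version": "1-0-0",
    },
    "type": "object",
}

# A temporary directory stands in for your real working directory.
with tempfile.TemporaryDirectory() as tmp:
    written = save_schema(Path(tmp) / "schemas", example)
    relative = written.relative_to(tmp).as_posix()
    print(relative)
```

Deriving the path from the "self" block keeps the vendor, name and version in the folder structure in sync with the schema contents.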
Finally, to upload your schema to your Iglu registry, you can use igluctl:
igluctl static push --public <local path to schemas> <Iglu server endpoint> <iglu_super_api_key>
See the Igluctl reference page for more information on the static push command.
Versioning schemas
When evolving your schema and uploading it to your Iglu Server, you will need to choose how to increment its version.
There are two kinds of schema changes:
- Non-breaking - a non-breaking change is backward compatible with historical data and increments the patch number, e.g. 1-0-0 -> 1-0-1, or the middle digit, e.g. 1-0-0 -> 1-1-0.
- Breaking - a breaking change is not backward compatible with historical data and increments the model number, e.g. 1-0-0 -> 2-0-0.
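The two kinds of increment can be sketched as a small helper. The function name bump_version is hypothetical, and for non-breaking changes this sketch always increments the patch number, although incrementing the middle digit is equally valid according to the rules above.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next schema version for a "breaking" or "non-breaking" change."""
    model, middle, patch = (int(part) for part in version.split("-"))
    if change == "breaking":
        # A new model resets the remaining two numbers.
        return f"{model + 1}-0-0"
    if change == "non-breaking":
        # Assumption: always bump the patch number, never the middle digit.
        return f"{model}-{middle}-{patch + 1}"
    raise ValueError(f"unknown change kind: {change!r}")


print(bump_version("1-0-0", "non-breaking"))
print(bump_version("1-0-1", "breaking"))
```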
Different data warehouses handle schema evolution slightly differently. Use the table below as a guide for incrementing the schema version appropriately.
| Schema change | Redshift | Snowflake, BigQuery, Databricks |
|---|---|---|
| Add / remove / rename an optional field | Non-breaking | Non-breaking |
| Add / remove / rename a required field | Breaking | Breaking |
| Change a field from optional to required | Breaking | Breaking |
| Change a field from required to optional | Breaking | Non-breaking |
| Change the type of an existing field | Breaking | Breaking |
| Change the size of an existing field | Non-breaking | Non-breaking |
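In Redshift and Databricks, changing the size of a field may also change its type. For example, raising the maximum of an integer field from 30000 to 100000 may require a different column type, in which case the change should be treated as a type change. See our documentation on how schemas translate to database types.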
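As an illustration, the table above can be encoded as a lookup that returns the kind of change for a given warehouse. The change keys and the classify function are hypothetical names chosen for this sketch, and it does not detect the case where a size change also implies a type change.

```python
# Each entry maps a change to (Redshift, Snowflake/BigQuery/Databricks).
CHANGE_TABLE = {
    "optional_field_added_removed_renamed": ("non-breaking", "non-breaking"),
    "required_field_added_removed_renamed": ("breaking", "breaking"),
    "optional_to_required": ("breaking", "breaking"),
    "required_to_optional": ("breaking", "non-breaking"),
    "type_changed": ("breaking", "breaking"),
    # A size change that also changes the column type should be
    # looked up as "type_changed" instead.
    "size_changed": ("non-breaking", "non-breaking"),
}

OTHER_WAREHOUSES = {"snowflake", "bigquery", "databricks"}


def classify(change: str, warehouse: str) -> str:
    """Look up whether a schema change is breaking for the given warehouse."""
    redshift_kind, other_kind = CHANGE_TABLE[change]
    name = warehouse.lower()
    if name == "redshift":
        return redshift_kind
    if name in OTHER_WAREHOUSES:
        return other_kind
    raise ValueError(f"unknown warehouse: {warehouse!r}")


print(classify("required_to_optional", "Redshift"))
print(classify("required_to_optional", "Snowflake"))
```

The only row where the two columns differ is the change from required to optional, which is breaking in Redshift only.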