JSON Schema reference
Snowplow schemas are based on the JSON Schema standard (draft 4). This reference documents the JSON Schema features that Snowplow supports.
Knowing the full range of these features lets you define more precise data structures, which improves data quality and documents your tracking implementation.
Schema structure
Every Snowplow schema must follow this basic structure with Snowplow-specific metadata and JSON Schema validation rules:
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Human-readable description of the schema purpose",
  "self": {
    "vendor": "com.example",
    "name": "schema_name",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    // Field definitions go here
  },
  "additionalProperties": false,
  "required": ["required_field_name"]
}
Required components
Every Snowplow schema must include these components:
$schema: Must be "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#"
self: Object containing:
  vendor: Your organization identifier (e.g., "com.example")
  name: The schema name
  format: Must be "jsonschema"
  version: SchemaVer version in MODEL-REVISION-ADDITION form (e.g., "1-0-0")
type: Must be "object" for the root level
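The required components above can be checked mechanically before a schema is published. The following is a minimal sketch, not Snowplow's own validation: `check_required_components` is a hypothetical helper, and the version regex is an assumed simplification (three hyphen-separated integers, as in the "1-0-0" example).

```python
import re

SELF_DESC_URI = (
    "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc"
    "/schema/jsonschema/1-0-0#"
)
# Assumption: versions look like "1-0-0" (three hyphen-separated integers).
VERSION_RE = re.compile(r"^[1-9][0-9]*-[0-9]+-[0-9]+$")


def check_required_components(schema):
    """Return a list of problems; an empty list means the required parts are present."""
    problems = []
    if schema.get("$schema") != SELF_DESC_URI:
        problems.append("$schema must be the self-desc meta-schema URI")

    self_block = schema.get("self")
    if not isinstance(self_block, dict):
        problems.append("self must be an object")
    else:
        for key in ("vendor", "name", "format", "version"):
            if key not in self_block:
                problems.append(f"self.{key} is missing")
        if "format" in self_block and self_block["format"] != "jsonschema":
            problems.append('self.format must be "jsonschema"')
        version = self_block.get("version")
        if isinstance(version, str) and not VERSION_RE.match(version):
            problems.append("self.version is not in 1-0-0 style")

    if schema.get("type") != "object":
        problems.append('root type must be "object"')
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a single pass.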
Optional components
These components are optional but commonly used:
description: Human-readable description of the schema purpose (highly recommended)
properties: Object defining the fields and their validation rules
additionalProperties: Whether additional properties are allowed (commonly set to false)
required: Array of required field names
minProperties / maxProperties: Constraints on the number of properties
Core validation keywords
Type validation
The type keyword specifies the expected data type for a value. Snowplow supports all JSON Schema primitive types:
String type
{
  "user_name": {
    "type": "string",
    "description": "The user's display name"
  }
}
Number and integer types
{
  "price": {
    "type": "number",
    "description": "Product price in USD"
  },
  "quantity": {
    "type": "integer",
    "description": "Number of items purchased"
  }
}
Boolean type
{
  "is_premium": {
    "type": "boolean",
    "description": "Whether the user has a premium account"
  }
}
Array type
{
  "tags": {
    "type": "array",
    "description": "Product tags",
    "items": {
      "type": "string"
    }
  }
}
Object type
{
  "address": {
    "type": "object",
    "description": "User's shipping address",
    "properties": {
      "street": {"type": "string"},
      "city": {"type": "string"},
      "postal_code": {"type": "string"}
    }
  }
}
Null type
{
  "middle_name": {
    "type": ["string", "null"],
    "description": "User's middle name (null if the user has none)"
  }
}
Multiple types
You can specify multiple acceptable types using an array:
{
  "user_id": {
    "type": ["string", "integer"],
    "description": "User identifier (string or numeric)"
  },
  "optional_field": {
    "type": ["string", "null"],
    "description": "Text field that may be null"
  }
}
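To see how a type keyword given as a string or as an array is applied to a decoded value, here is a minimal Python sketch. `matches_type` is a hypothetical helper written for illustration, not part of any Snowplow library; it assumes the value was decoded with Python's `json` module.

```python
def _matches_one(value, name):
    # bool is a subclass of int in Python, so it must be excluded explicitly
    is_bool = isinstance(value, bool)
    if name == "string":
        return isinstance(value, str)
    if name == "boolean":
        return is_bool
    if name == "integer":
        return isinstance(value, int) and not is_bool
    if name == "number":
        return isinstance(value, (int, float)) and not is_bool
    if name == "array":
        return isinstance(value, list)
    if name == "object":
        return isinstance(value, dict)
    if name == "null":
        return value is None
    return False


def matches_type(value, type_spec):
    """Check a decoded JSON value against a type keyword (string or list of strings)."""
    names = [type_spec] if isinstance(type_spec, str) else type_spec
    # A list of types is satisfied when the value matches any one of them
    return any(_matches_one(value, name) for name in names)
```

For example, `matches_type(None, ["string", "null"])` is true, while `matches_type(True, "integer")` is false because booleans are treated as their own type.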
String validation
Length constraints
Control the minimum and maximum length of string values:
{
  "username": {
    "type": "string",
    "minLength": 3,
    "maxLength": 20,
    "description": "Username between 3 and 20 characters"
  },
  "password": {
    "type": "string",
    "minLength": 8,
    "description": "Password must be at least 8 characters"
  }
}
Enumeration
Restrict values to a specific set of allowed strings:
{
  "status": {
    "type": "string",
    "enum": ["active", "inactive", "pending", "suspended"],
    "description": "Account status"
  },
  "color": {
    "type": "string",
    "enum": ["red", "green", "blue", "yellow"],
    "description": "Primary color selection"
  }
}
Pattern matching
Use regular expressions to validate string format:
{
  "product_code": {
    "type": "string",
    "pattern": "^[A-Z]{2}-\\d{4}$",
    "description": "Product code format (e.g., AB-1234)"
  },
  "phone_number": {
    "type": "string",
    "pattern": "^\\+?[1-9]\\d{1,14}$",
    "description": "International phone number format"
  }
}
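Two details of the patterns above are easy to miss: the doubled backslashes are JSON string escaping, and JSON Schema patterns are not implicitly anchored, which is why both examples use ^ and $. The sketch below tries the product code pattern with Python's `re` module. Python's regex dialect is not identical to the one a JSON Schema validator uses, so treat this as a quick approximation; `matches_pattern` is a hypothetical helper.

```python
import json
import re

# The property definition as it would appear in the schema file (JSON text)
raw = r'{"type": "string", "pattern": "^[A-Z]{2}-\\d{4}$"}'

# Decoding the JSON turns the escaped \\d into the regex token \d
pattern = json.loads(raw)["pattern"]


def matches_pattern(value, pattern):
    # search() rather than fullmatch(): a pattern only matches the whole
    # string when it carries its own ^ and $ anchors
    return re.search(pattern, value) is not None


print(pattern)                                # ^[A-Z]{2}-\d{4}$
print(matches_pattern("AB-1234", pattern))    # True
print(matches_pattern("AB-12345", pattern))   # False
print(matches_pattern("AB-12345", r"\d{4}"))  # True, because nothing anchors it
```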
For common formats like email addresses, URLs, and dates, prefer using the format keyword instead of regular expressions for better readability and standardized validation.
Format validation
Use the format keyword to validate common string formats:
{
  "email": {
    "type": "string",
    "format": "email",
    "description": "Valid email address"
  },
  "website": {
    "type": "string",
    "format": "uri",
    "description": "Website URL"
  },
  "server_ip": {
    "type": "string",
    "format": "ipv4",
    "description": "IPv4 address of the server"
  },
  "created_at": {
    "type": "string",
    "format": "date-time",
    "description": "ISO 8601 timestamp"
  },
  "user_id": {
    "type": "string",
    "format": "uuid",
    "description": "UUID identifier"
  }
}
Supported format values
uri: Uniform Resource Identifier
ipv4: IPv4 address (e.g., "192.168.1.1")
ipv6: IPv6 address
email: Email address
date-time: ISO 8601 date-time (e.g., "2023-12-25T10:30:00Z")
date: ISO 8601 date (e.g., "2023-12-25")
hostname: Internet hostname
uuid: UUID string
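When generating test events, it can help to sanity-check values against a few of the formats listed above before sending them. The following sketch approximates four of them with the Python standard library. It is an assumption-laden stand-in, not the validator Snowplow runs: `looks_like` and `PARSERS` are hypothetical names, the date-time check ignores fractional seconds, and the stdlib parsers may be stricter or looser than a real format validator.

```python
import ipaddress
from datetime import date, datetime


def _parse_date_time(value):
    # Accepts "Z" or a numeric UTC offset; fractional seconds are not handled here
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")


# Each parser raises ValueError (or a subclass) on input it cannot parse
PARSERS = {
    "ipv4": ipaddress.IPv4Address,
    "ipv6": ipaddress.IPv6Address,
    "date": date.fromisoformat,
    "date-time": _parse_date_time,
}


def looks_like(fmt, value):
    """Return True if the string parses under the stdlib stand-in for fmt."""
    try:
        PARSERS[fmt](value)
        return True
    except ValueError:
        return False


print(looks_like("ipv4", "192.168.1.1"))                # True
print(looks_like("date-time", "2023-12-25T10:30:00Z"))  # True
print(looks_like("date", "2023-13-01"))                 # False
```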
Numeric validation
Range constraints
Set minimum and maximum values for numbers and integers:
{
  "age": {
    "type": "integer",
    "minimum": 0,
    "maximum": 150,
    "description": "Person's age in years"
  },
  "discount_rate": {
    "type": "number",
    "minimum": 0,
    "maximum": 1,
    "description": "Discount rate between 0 and 1"
  }
}
Multiple constraints
Combine multiple numeric validations:
{
  "rating": {
    "type": "number",
    "minimum": 1,
    "maximum": 5,
    "multipleOf": 0.5,
    "description": "Star rating in half-point increments"
  }
}
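A step of 0.5 is exactly representable in binary floating point, so the rating example above behaves predictably. Steps such as 0.1 are not, and a naive remainder check can then reject values that look valid. The sketch below illustrates the pitfall and one way to avoid it in your own test tooling. `is_multiple_of` is a hypothetical helper, and how any particular validator handles this case is implementation-dependent.

```python
from decimal import Decimal


def is_multiple_of(value, step):
    """True if value is an exact multiple of step, compared in decimal arithmetic."""
    # str() yields the shortest representation of a float (0.1 -> "0.1"),
    # so the Decimal values carry no binary rounding noise
    return Decimal(str(value)) % Decimal(str(step)) == 0


print(is_multiple_of(4.5, 0.5))  # True
print(is_multiple_of(4.3, 0.5))  # False
print(0.3 % 0.1 == 0)            # False: the float remainder is not exactly zero
print(is_multiple_of(0.3, 0.1))  # True
```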
Array validation
Length constraints
Control the size of arrays:
{
  "favorite_colors": {
    "type": "array",
    "minItems": 1,
    "maxItems": 5,
    "description": "User's favorite colors (1-5 selections)",
    "items": {
      "type": "string",
      "enum": ["red", "blue", "green", "yellow", "purple", "orange"]
    }
  }
}
Item validation
Define validation rules for array items:
{
  "purchase_items": {
    "type": "array",
    "description": "Items in the purchase",
    "items": {
      "type": "object",
      "properties": {
        "product_id": {"type": "string"},
        "quantity": {"type": "integer", "minimum": 1},
        "price": {"type": "number", "minimum": 0}
      },
      "required": ["product_id", "quantity", "price"],
      "additionalProperties": false
    }
  }
}
Unique items
Ensure all array items are unique:
{
  "user_tags": {
    "type": "array",
    "uniqueItems": true,
    "description": "Unique tags assigned to user",
    "items": {
      "type": "string"
    }
  }
}
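Uniqueness applies to whole JSON values, so two objects with the same keys and values count as duplicates regardless of key order. Here is a minimal sketch of such a check for use in tests. `has_unique_items` is a hypothetical helper, and it makes one simplifying assumption: numbers are compared by their JSON text, so 1 and 1.0 are treated as different values here.

```python
import json


def has_unique_items(items):
    """Return True if no two items in the list are equal as JSON values."""
    seen = set()
    for item in items:
        # Serializing with sorted keys gives objects a canonical, hashable form
        key = json.dumps(item, sort_keys=True)
        if key in seen:
            return False
        seen.add(key)
    return True


print(has_unique_items(["sale", "new", "featured"]))  # True
print(has_unique_items(["sale", "new", "sale"]))      # False
print(has_unique_items([{"a": 1, "b": 2}, {"b": 2, "a": 1}]))  # False
```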
Object validation
Property requirements
Specify which object properties are required:
{
  "user_profile": {
    "type": "object",
    "properties": {
      "first_name": {"type": "string"},
      "last_name": {"type": "string"},
      "email": {"type": "string"},
      "phone": {"type": ["string", "null"]}
    },
    "required": ["first_name", "last_name", "email"],
    "additionalProperties": false
  }
}
Additional properties
Control whether additional properties are allowed:
{
  "strict_object": {
    "type": "object",
    "properties": {
      "name": {"type": "string"},
      "value": {"type": "number"}
    },
    "additionalProperties": false,
    "description": "Only name and value properties allowed"
  },
  "flexible_object": {
    "type": "object",
    "properties": {
      "core_field": {"type": "string"}
    },
    "additionalProperties": true,
    "description": "Additional properties are permitted"
  }
}
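The interaction between required and additionalProperties can be sketched in a few lines. The helper below is hypothetical and covers only these two keywords (it ignores patternProperties and does not validate property values); it is meant to show which instances the examples above would accept, not to replace a real validator.

```python
def check_object(instance, schema):
    """Return problems found for `required` and `additionalProperties: false`."""
    problems = [
        f"missing required property: {name}"
        for name in schema.get("required", [])
        if name not in instance
    ]
    # Only an explicit false forbids undeclared properties
    if schema.get("additionalProperties") is False:
        declared = schema.get("properties", {})
        problems += [
            f"unexpected property: {name}"
            for name in instance
            if name not in declared
        ]
    return problems


strict = {
    "properties": {"name": {"type": "string"}, "value": {"type": "number"}},
    "required": ["name"],
    "additionalProperties": False,
}

print(check_object({"name": "a", "value": 1}, strict))  # []
print(check_object({"value": 1, "extra": True}, strict))
# ['missing required property: name', 'unexpected property: extra']
```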
Property count constraints
Limit the number of properties in an object:
{
  "metadata": {
    "type": "object",
    "minProperties": 1,
    "maxProperties": 10,
    "additionalProperties": {"type": "string"},
    "description": "Metadata with 1-10 string properties"
  }
}
Advanced validation patterns
Schema composition
Use oneOf and anyOf to create flexible validation rules:
Using oneOf
Validate that data matches exactly one of several schemas:
{
  "contact_info": {
    "type": "object",
    "oneOf": [
      {
        "properties": {
          "type": {"enum": ["email"]},
          "email": {"type": "string", "format": "email"}
        },
        "required": ["type", "email"],
        "additionalProperties": false
      },
      {
        "properties": {
          "type": {"enum": ["phone"]},
          "phone": {"type": "string", "pattern": "^\\+?[1-9]\\d{1,14}$"}
        },
        "required": ["type", "phone"],
        "additionalProperties": false
      },
      {
        "properties": {
          "type": {"enum": ["address"]},
          "street": {"type": "string"},
          "city": {"type": "string"},
          "postal_code": {"type": "string"}
        },
        "required": ["type", "street", "city", "postal_code"],
        "additionalProperties": false
      }
    ]
  }
}
Using anyOf
Validate that data matches one or more of several schemas:
{
  "user_permissions": {
    "type": "object",
    "anyOf": [
      {
        "properties": {
          "can_read": {"type": "boolean"}
        },
        "required": ["can_read"]
      },
      {
        "properties": {
          "can_write": {"type": "boolean"}
        },
        "required": ["can_write"]
      },
      {
        "properties": {
          "can_admin": {"type": "boolean"}
        },
        "required": ["can_admin"]
      }
    ]
  }
}
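The difference between the two keywords comes down to how many branches an instance satisfies: anyOf needs at least one, oneOf needs exactly one. The sketch below makes that counting explicit using the user_permissions branches. It is deliberately simplified: the hypothetical `has_required` helper evaluates only the required keyword of each branch, not property types.

```python
def has_required(instance, subschema):
    # Simplified branch check: only the `required` keyword is evaluated
    return all(name in instance for name in subschema.get("required", []))


def count_matches(instance, subschemas):
    return sum(1 for sub in subschemas if has_required(instance, sub))


def any_of(instance, subschemas):
    return count_matches(instance, subschemas) >= 1


def one_of(instance, subschemas):
    return count_matches(instance, subschemas) == 1


branches = [
    {"required": ["can_read"]},
    {"required": ["can_write"]},
    {"required": ["can_admin"]},
]

reader_writer = {"can_read": True, "can_write": True}
print(any_of(reader_writer, branches))  # True: two branches match
print(one_of(reader_writer, branches))  # False: more than one branch matches
```

This is why the oneOf example uses mutually exclusive branches (a distinct type value in each), while the anyOf example allows several permissions to be present at once.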
Limitations and unsupported features
Snowplow supports most JSON Schema draft 4 features, with the following exceptions:
$ref: Schema references are not supported in property definitions
allOf: Schema intersection is not supported
not: Negation validation is not supported
dependencies: Property dependencies are not supported
exclusiveMinimum and exclusiveMaximum: Exclusive bounds are not supported
Instead of unsupported features, use these approaches:
{
  // Instead of $ref, define inline schemas
  "address": {
    "type": "object",
    "properties": {
      "street": {"type": "string"},
      "city": {"type": "string"},
      "country": {"type": "string", "enum": ["US", "CA", "UK", "DE"]}
    },
    "required": ["street", "city", "country"],
    "additionalProperties": false
  },
  // Instead of exclusiveMinimum/exclusiveMaximum, use minimum/maximum with adjusted values
  "percentage": {
    "type": "number",
    "minimum": 0,
    "maximum": 99.99,
    "description": "Percentage value from 0 to 99.99 (approximates an exclusive upper bound of 100)"
  },
  // Use format validation for common patterns
  "created_date": {
    "type": "string",
    "format": "date-time",
    "description": "ISO 8601 timestamp"
  }
}
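When migrating an existing JSON Schema, it can be useful to find where the keywords listed as unsupported above occur. The following is a minimal sketch of a recursive scan. `find_unsupported` is a hypothetical helper, and the keyword set is copied from the list in this section rather than from any Snowplow API.

```python
UNSUPPORTED = {
    "$ref", "allOf", "not", "dependencies",
    "exclusiveMinimum", "exclusiveMaximum",
}


def find_unsupported(schema, path="#"):
    """Yield (path, keyword) pairs for each unsupported keyword in the schema."""
    if isinstance(schema, list):
        for index, item in enumerate(schema):
            yield from find_unsupported(item, f"{path}/{index}")
        return
    if not isinstance(schema, dict):
        return
    for key, value in schema.items():
        if key in UNSUPPORTED:
            yield (path, key)
        elif key == "properties" and isinstance(value, dict):
            # Keys under "properties" are field names, not keywords,
            # so a field called "not" must not be reported
            for name, sub in value.items():
                yield from find_unsupported(sub, f"{path}/properties/{name}")
        else:
            yield from find_unsupported(value, f"{path}/{key}")


example = {
    "type": "object",
    "properties": {
        "not": {"type": "string"},
        "score": {"type": "number", "minimum": 0, "exclusiveMinimum": True},
        "address": {"$ref": "#/definitions/address"},
    },
}

for location, keyword in find_unsupported(example):
    print(location, keyword)
# #/properties/score exclusiveMinimum
# #/properties/address $ref
```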