
JSON Schema reference

Snowplow schemas are based on the JSON Schema standard (Draft 4). This reference documents the JSON Schema features that Snowplow supports.

Knowing which JSON Schema keywords are available lets you define more precise data structures, which improves data quality and documents your tracking implementation.

Schema structure

Every Snowplow schema must follow this basic structure with Snowplow-specific metadata and JSON Schema validation rules:

{
"$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
"description": "Human-readable description of the schema purpose",
"self": {
"vendor": "com.example",
"name": "schema_name",
"format": "jsonschema",
"version": "1-0-0"
},
"type": "object",
"properties": {
"required_field_name": {"type": "string", "description": "Example field; replace with your own field definitions"}
},
"additionalProperties": false,
"required": ["required_field_name"]
}

Required components

Every Snowplow schema must include these components:

  • $schema: Must be "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#"
  • self object containing:
    • vendor: Your organization identifier (e.g., "com.example")
    • name: The schema name
    • format: Must be "jsonschema"
    • version: Schema version in SchemaVer format, MODEL-REVISION-ADDITION (e.g., "1-0-0")
  • type: Must be "object" for the root level
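
The required components listed above can be checked mechanically before a schema is published. The following is a minimal sketch in Python using only the standard library. The function name `missing_components` and the version regular expression (three hyphen-separated integers, as in "1-0-0") are assumptions made for illustration and are not part of any Snowplow tooling.

```python
import re

SELF_DESC_URI = (
    "http://iglucentral.com/schemas/"
    "com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#"
)
# Assumed shape of the version string: three hyphen-separated integers, e.g. "1-0-0".
VERSION_RE = re.compile(r"^\d+-\d+-\d+$")


def missing_components(schema: dict) -> list[str]:
    """Return the names of required top-level components that are missing or invalid."""
    problems = []
    if schema.get("$schema") != SELF_DESC_URI:
        problems.append("$schema")
    self_block = schema.get("self")
    if not isinstance(self_block, dict):
        problems.append("self")
    else:
        for key in ("vendor", "name"):
            if not isinstance(self_block.get(key), str):
                problems.append(f"self.{key}")
        if self_block.get("format") != "jsonschema":
            problems.append("self.format")
        if not VERSION_RE.match(str(self_block.get("version", ""))):
            problems.append("self.version")
    if schema.get("type") != "object":
        problems.append("type")
    return problems


example = {
    "$schema": SELF_DESC_URI,
    "self": {
        "vendor": "com.example",
        "name": "schema_name",
        "format": "jsonschema",
        "version": "1-0-0",
    },
    "type": "object",
}
print(missing_components(example))  # []
```

A check like this only covers the top-level metadata; it does not validate the field definitions themselves.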

Optional components

These components are optional but commonly used:

  • description: Human-readable description of the schema purpose (highly recommended)
  • properties: Object defining the fields and their validation rules
  • additionalProperties: Whether additional properties are allowed (commonly set to false)
  • required: Array of required field names
  • minProperties / maxProperties: Constraints on number of properties

Core validation keywords

Type validation

The type keyword specifies the expected data type for a value. Snowplow supports all JSON Schema primitive types:

String type

{
"user_name": {
"type": "string",
"description": "The user's display name"
}
}

Number and integer types

{
"price": {
"type": "number",
"description": "Product price in USD"
},
"quantity": {
"type": "integer",
"description": "Number of items purchased"
}
}

Boolean type

{
"is_premium": {
"type": "boolean",
"description": "Whether the user has a premium account"
}
}

Array type

{
"tags": {
"type": "array",
"description": "Product tags",
"items": {
"type": "string"
}
}
}

Object type

{
"address": {
"type": "object",
"description": "User's shipping address",
"properties": {
"street": {"type": "string"},
"city": {"type": "string"},
"postal_code": {"type": "string"}
}
}
}

Null type

{
"middle_name": {
"type": ["string", "null"],
"description": "User's middle name (may be null)"
}
}

Multiple types

You can specify multiple acceptable types using an array:

{
"user_id": {
"type": ["string", "integer"],
"description": "User identifier (string or numeric)"
},
"optional_field": {
"type": ["string", "null"],
"description": "Optional text field"
}
}
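
To make the behavior of a type array concrete, here is a rough Python sketch of how a `type` check could work on values parsed with `json.loads`. The helper `matches_type` is hypothetical and only illustrates the idea; it is not how Snowplow's validator is implemented.

```python
# Map JSON Schema type names to the Python types produced by json.loads.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
    "null": type(None),
}


def matches_type(value, type_keyword) -> bool:
    """Check a parsed JSON value against a `type` keyword (a string or a list of strings)."""
    names = [type_keyword] if isinstance(type_keyword, str) else type_keyword
    for name in names:
        # bool is a subclass of int in Python, so exclude it from the numeric types.
        if name in ("number", "integer") and isinstance(value, bool):
            continue
        if isinstance(value, JSON_TYPES[name]):
            return True
    return False


print(matches_type("abc-123", ["string", "integer"]))  # True
print(matches_type(42, ["string", "integer"]))         # True
print(matches_type(4.2, ["string", "integer"]))        # False
print(matches_type(None, ["string", "null"]))          # True
```

The value is accepted as soon as any one of the listed types matches.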

String validation

Length constraints

Control the minimum and maximum length of string values:

{
"username": {
"type": "string",
"minLength": 3,
"maxLength": 20,
"description": "Username between 3-20 characters"
},
"password": {
"type": "string",
"minLength": 8,
"description": "Password must be at least 8 characters"
}
}
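
JSON Schema measures string length in characters, not bytes, so a short name with accented letters still counts one character per letter. The Python sketch below illustrates this with the hypothetical helper `check_length`; Python's `len` counts Unicode code points, which is used here as an approximation. How a particular validator counts characters outside the Basic Multilingual Plane may differ.

```python
def check_length(value: str, min_length: int = 0, max_length=None) -> bool:
    """Apply minLength / maxLength to a string, counting Unicode code points."""
    n = len(value)  # len() counts code points, not encoded bytes
    if n < min_length:
        return False
    return max_length is None or n <= max_length


name = "Jos\u00e9"  # "José" with a precomposed e-acute
print(len(name))                       # 4 characters
print(len(name.encode("utf-8")))       # 5 bytes
print(check_length(name, 3, 20))       # True
print(check_length("ab", 3, 20))       # False
```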

Enumeration

Restrict values to a specific set of allowed strings:

{
"status": {
"type": "string",
"enum": ["active", "inactive", "pending", "suspended"],
"description": "Account status"
},
"color": {
"type": "string",
"enum": ["red", "green", "blue", "yellow"],
"description": "Primary color selection"
}
}

Pattern matching

Use regular expressions to validate string format:

{
"product_code": {
"type": "string",
"pattern": "^[A-Z]{2}-\\d{4}$",
"description": "Product code format (e.g., AB-1234)"
},
"phone_number": {
"type": "string",
"pattern": "^\\+?[1-9]\\d{1,14}$",
"description": "International phone number format"
}
}

Tip: For common formats like email addresses, URLs, and dates, prefer using the format keyword instead of regular expressions for better readability and standardized validation.
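
The patterns in this section begin with ^ and end with $ for a reason: JSON Schema regular expressions are not implicitly anchored, so a pattern without anchors matches anywhere inside the string. The Python sketch below illustrates the difference using `re.search`. The helper `matches_pattern` is hypothetical, and Python's `re` module is not identical to the regular expression dialect used by JSON Schema, so treat this as an approximation.

```python
import re

ANCHORED = r"^[A-Z]{2}-\d{4}$"
UNANCHORED = r"[A-Z]{2}-\d{4}"


def matches_pattern(value: str, pattern: str) -> bool:
    """Mimic the `pattern` keyword: the regex may match anywhere unless anchored."""
    return re.search(pattern, value) is not None


print(matches_pattern("AB-1234", ANCHORED))          # True
print(matches_pattern("xxAB-1234yy", ANCHORED))      # False
print(matches_pattern("xxAB-1234yy", UNANCHORED))    # True
```

Without the anchors, a value that merely contains a valid product code would pass validation.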

Format validation

Use the format keyword to validate common string formats:

{
"email": {
"type": "string",
"format": "email",
"description": "Valid email address"
},
"website": {
"type": "string",
"format": "uri",
"description": "Website URL"
},
"server_ip": {
"type": "string",
"format": "ipv4",
"description": "IPv4 address of the server"
},
"created_at": {
"type": "string",
"format": "date-time",
"description": "ISO 8601 timestamp"
},
"user_id": {
"type": "string",
"format": "uuid",
"description": "UUID identifier"
}
}

Supported format values

  • uri: Uniform Resource Identifier
  • ipv4: IPv4 address (e.g., "192.168.1.1")
  • ipv6: IPv6 address
  • email: Email address
  • date-time: RFC 3339 date-time, a profile of ISO 8601 (e.g., "2023-12-25T10:30:00Z")
  • date: ISO 8601 date (e.g., "2023-12-25")
  • hostname: Internet hostname
  • uuid: UUID string
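
Before sending events, it can help to pre-check a few of these formats on the client side. The following Python sketch uses only the standard library; the helper names `is_ipv4`, `is_ipv6`, and `is_uuid` are hypothetical, and the checks may be stricter or looser than the validator Snowplow actually uses.

```python
import ipaddress
import uuid


def is_ipv4(value: str) -> bool:
    """Return True if the string parses as an IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def is_ipv6(value: str) -> bool:
    """Return True if the string parses as an IPv6 address."""
    try:
        ipaddress.IPv6Address(value)
        return True
    except ValueError:
        return False


def is_uuid(value: str) -> bool:
    """Return True only for the canonical hyphenated form (case-insensitive)."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


print(is_ipv4("192.168.1.1"))                              # True
print(is_ipv4("256.0.0.1"))                                # False
print(is_ipv6("::1"))                                      # True
print(is_uuid("123e4567-e89b-12d3-a456-426614174000"))     # True
```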

Numeric validation

Range constraints

Set minimum and maximum values for numbers and integers:

{
"age": {
"type": "integer",
"minimum": 0,
"maximum": 150,
"description": "Person's age in years"
},
"discount_rate": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Discount rate between 0 and 1"
}
}

Multiple constraints

Combine multiple numeric validations:

{
"rating": {
"type": "number",
"minimum": 1,
"maximum": 5,
"multipleOf": 0.5,
"description": "Star rating in half-point increments"
}
}
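
One practical point with multipleOf and fractional steps is binary floating-point rounding: a naive remainder check can reject values that look valid. The Python sketch below shows one way to check the three constraints together using `Decimal`. The helper `check_number` is hypothetical, and how Snowplow's validator handles rounding is not covered in this reference.

```python
from decimal import Decimal


def check_number(value, minimum=None, maximum=None, multiple_of=None) -> bool:
    """Apply minimum, maximum, and multipleOf to a number."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    if multiple_of is not None:
        # Convert through str() so 0.3 becomes Decimal("0.3"),
        # not the binary approximation of 0.3.
        if Decimal(str(value)) % Decimal(str(multiple_of)) != 0:
            return False
    return True


print(check_number(4.5, 1, 5, 0.5))          # True
print(check_number(4.3, 1, 5, 0.5))          # False
print(0.3 % 0.1 == 0)                        # False: float remainder is not exactly zero
print(check_number(0.3, multiple_of=0.1))    # True
```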

Array validation

Length constraints

Control the size of arrays:

{
"favorite_colors": {
"type": "array",
"minItems": 1,
"maxItems": 5,
"description": "User's favorite colors (1-5 selections)",
"items": {
"type": "string",
"enum": ["red", "blue", "green", "yellow", "purple", "orange"]
}
}
}

Item validation

Define validation rules for array items:

{
"purchase_items": {
"type": "array",
"description": "Items in the purchase",
"items": {
"type": "object",
"properties": {
"product_id": {"type": "string"},
"quantity": {"type": "integer", "minimum": 1},
"price": {"type": "number", "minimum": 0}
},
"required": ["product_id", "quantity", "price"],
"additionalProperties": false
}
}
}

Unique items

Ensure all array items are unique:

{
"user_tags": {
"type": "array",
"uniqueItems": true,
"description": "Unique tags assigned to user",
"items": {
"type": "string"
}
}
}
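
Uniqueness applies to whole JSON values, so two objects with the same keys and values count as duplicates even if the keys appear in a different order. The Python sketch below illustrates this by comparing a canonical JSON serialization of each item. The helper `has_unique_items` is hypothetical, and it treats 1 and 1.0 as different values, which may not match a real validator.

```python
import json


def has_unique_items(items: list) -> bool:
    """Return True if no two items serialize to the same canonical JSON text."""
    seen = set()
    for item in items:
        # sort_keys makes the serialization independent of key order.
        key = json.dumps(item, sort_keys=True)
        if key in seen:
            return False
        seen.add(key)
    return True


print(has_unique_items(["new", "vip"]))                              # True
print(has_unique_items(["new", "vip", "new"]))                       # False
print(has_unique_items([{"x": 1, "y": 2}, {"y": 2, "x": 1}]))        # False
```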

Object validation

Property requirements

Specify which object properties are required:

{
"user_profile": {
"type": "object",
"properties": {
"first_name": {"type": "string"},
"last_name": {"type": "string"},
"email": {"type": "string"},
"phone": {"type": ["string", "null"]}
},
"required": ["first_name", "last_name", "email"],
"additionalProperties": false
}
}

Additional properties

Control whether additional properties are allowed:

{
"strict_object": {
"type": "object",
"properties": {
"name": {"type": "string"},
"value": {"type": "number"}
},
"additionalProperties": false,
"description": "Only name and value properties allowed"
},
"flexible_object": {
"type": "object",
"properties": {
"core_field": {"type": "string"}
},
"additionalProperties": true,
"description": "Additional properties are permitted"
}
}
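
The interaction between required and additionalProperties can be sketched in a few lines of Python. The helper `check_object` below is hypothetical and covers only these two keywords. Note that a property that is not listed in required may be omitted entirely, which is different from being present with a null value, and that additionalProperties allows extra properties when it is omitted.

```python
def check_object(instance: dict, schema: dict) -> list[str]:
    """Report missing required properties and, if disallowed, unexpected ones."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required property: {name}")
    # Extra properties are allowed unless additionalProperties is explicitly false.
    if schema.get("additionalProperties", True) is False:
        for name in sorted(set(instance) - set(properties)):
            errors.append(f"unexpected property: {name}")
    return errors


user_profile_schema = {
    "type": "object",
    "properties": {
        "first_name": {"type": "string"},
        "last_name": {"type": "string"},
        "email": {"type": "string"},
        "phone": {"type": ["string", "null"]},
    },
    "required": ["first_name", "last_name", "email"],
    "additionalProperties": False,
}

ok = {"first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.com"}
print(check_object(ok, user_profile_schema))  # []
print(check_object({"first_name": "Ada", "nickname": "al"}, user_profile_schema))
```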

Property count constraints

Limit the number of properties in an object:

{
"metadata": {
"type": "object",
"minProperties": 1,
"maxProperties": 10,
"additionalProperties": {"type": "string"},
"description": "Metadata with 1-10 string properties"
}
}

Advanced validation patterns

Schema composition

Use oneOf and anyOf to create flexible validation rules:

Using oneOf

Validate that data matches exactly one of several schemas:

{
"contact_info": {
"type": "object",
"oneOf": [
{
"properties": {
"type": {"enum": ["email"]},
"email": {"type": "string", "format": "email"}
},
"required": ["type", "email"],
"additionalProperties": false
},
{
"properties": {
"type": {"enum": ["phone"]},
"phone": {"type": "string", "pattern": "^\\+?[1-9]\\d{1,14}$"}
},
"required": ["type", "phone"],
"additionalProperties": false
},
{
"properties": {
"type": {"enum": ["address"]},
"street": {"type": "string"},
"city": {"type": "string"},
"postal_code": {"type": "string"}
},
"required": ["type", "street", "city", "postal_code"],
"additionalProperties": false
}
]
}
}

Using anyOf

Validate that data matches one or more of several schemas:

{
"user_permissions": {
"type": "object",
"anyOf": [
{
"properties": {
"can_read": {"type": "boolean"}
},
"required": ["can_read"]
},
{
"properties": {
"can_write": {"type": "boolean"}
},
"required": ["can_write"]
},
{
"properties": {
"can_admin": {"type": "boolean"}
},
"required": ["can_admin"]
}
]
}
}
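
The difference between oneOf and anyOf comes down to how many subschemas match: oneOf requires exactly one match, while anyOf requires at least one. The Python sketch below models each subschema of the permissions example as a simple predicate function. The helpers `requires_bool`, `any_of`, and `one_of` are hypothetical stand-ins for real subschema validation.

```python
def requires_bool(key: str):
    """Build a predicate equivalent to: property `key` is required and boolean."""
    return lambda obj: isinstance(obj.get(key), bool)


def count_matches(instance: dict, checks) -> int:
    """Count how many subschema predicates accept the instance."""
    return sum(1 for check in checks if check(instance))


def any_of(instance: dict, checks) -> bool:
    return count_matches(instance, checks) >= 1


def one_of(instance: dict, checks) -> bool:
    return count_matches(instance, checks) == 1


permission_checks = [
    requires_bool("can_read"),
    requires_bool("can_write"),
    requires_bool("can_admin"),
]

both = {"can_read": True, "can_write": False}
print(any_of(both, permission_checks))  # True: two subschemas match
print(one_of(both, permission_checks))  # False: more than one subschema matches
```

This is why the oneOf example uses a distinguishing type value in each branch: it keeps the branches mutually exclusive.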

Limitations and unsupported features

While Snowplow supports most JSON Schema Draft 4 features, there are some limitations to be aware of:

  • $ref: Schema references are not supported in property definitions
  • allOf: Schema intersection is not supported
  • not: Negation validation is not supported
  • dependencies: Property dependencies are not supported
  • exclusiveMinimum and exclusiveMaximum: Exclusive bounds are not supported

Instead of unsupported features, use these approaches:

Instead of $ref, define inline schemas:

{
"address": {
"type": "object",
"properties": {
"street": {"type": "string"},
"city": {"type": "string"},
"country": {"type": "string", "enum": ["US", "CA", "UK", "DE"]}
},
"required": ["street", "city", "country"],
"additionalProperties": false
}
}

Instead of exclusiveMinimum/exclusiveMaximum, use minimum/maximum with adjusted values. The adjusted bound is an approximation: here, values between 99.99 and 100 are rejected.

{
"percentage": {
"type": "number",
"minimum": 0,
"maximum": 99.99,
"description": "Percentage value from 0 to 99.99 (approximates less than 100)"
}
}

Use format validation for common patterns:

{
"created_date": {
"type": "string",
"format": "date-time",
"description": "ISO 8601 timestamp"
}
}
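
To catch the unsupported keywords listed above before publishing a schema, a small recursive scan can help. The Python sketch below is an illustration only: the function `find_unsupported` is hypothetical, it flags the listed keywords wherever they appear as schema keywords, and it walks only the nesting keywords described in this reference (properties, items, additionalProperties, oneOf, anyOf).

```python
UNSUPPORTED = {
    "$ref", "allOf", "not", "dependencies",
    "exclusiveMinimum", "exclusiveMaximum",
}


def find_unsupported(schema: dict, path: str = "#") -> list[str]:
    """Return JSON-pointer-like paths of unsupported keywords in a schema."""
    found = []
    for key, value in schema.items():
        if key in UNSUPPORTED:
            found.append(f"{path}/{key}")
        elif key == "properties" and isinstance(value, dict):
            # Keys under "properties" are field names, not keywords,
            # so a field called "not" is not flagged.
            for name, sub in value.items():
                if isinstance(sub, dict):
                    found.extend(find_unsupported(sub, f"{path}/properties/{name}"))
        elif key in ("items", "additionalProperties") and isinstance(value, dict):
            found.extend(find_unsupported(value, f"{path}/{key}"))
        elif key in ("oneOf", "anyOf") and isinstance(value, list):
            for i, sub in enumerate(value):
                if isinstance(sub, dict):
                    found.extend(find_unsupported(sub, f"{path}/{key}/{i}"))
    return found


draft = {
    "type": "object",
    "properties": {
        "not": {"type": "string"},
        "price": {"type": "number", "minimum": 0, "exclusiveMinimum": True},
        "address": {"$ref": "#/definitions/address"},
    },
}
print(find_unsupported(draft))
# ['#/properties/price/exclusiveMinimum', '#/properties/address/$ref']
```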