RDB shredder configuration reference

Caution: You are reading documentation for an outdated version.

Shredder and loader use different configurations starting from 2.0.0. An example config for shredder can be found here.

This is a complete list of the options that can be configured:

input	Required. S3 URL of the enriched archive. It must be populated separately with run=YYYY-MM-DD-hh-mm-ss directories.
output.path	Required. S3 URL of the shredded output.
output.compression	Optional. One of "NONE" or "GZIP". Default value GZIP.
output.region	Optional if it can be resolved with the AWS region provider chain. AWS region of the S3 bucket.
queue.type	Required. Type of the queue. It can be either sqs or sns.
queue.queueName	Required if queue type is sqs. Name of the SQS queue.
queue.topicArn	Required if queue type is sns. ARN of the SNS topic.
queue.region	Optional if it can be resolved with the AWS region provider chain. AWS region of the SQS queue or SNS topic.
formats.default	Required. Either TSV or JSON. Data format the shredder produces by default. TSV is recommended because it enables table autocreation, but it requires Iglu Server to be available with known schemas (including Snowplow schemas). JSON does not require Iglu Server, but it requires Redshift JSONPaths to be configured and does not support table autocreation.
formats.tsv	Required. List of Iglu URIs; can be set to an empty list []. If default is set to JSON, the schemas in this list will still be shredded into TSV.
formats.json	Required. List of Iglu URIs; can be set to an empty list []. If default is set to TSV, the schemas in this list will still be shredded into JSON.
formats.skip	Required. List of Iglu URIs; can be set to an empty list []. Schemas for which loading can be skipped.
monitoring.sentry.dsn	Optional. For tracking runtime exceptions.
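
The options above could be combined into a config file along the following lines. This is a minimal sketch, not an official example: it assumes the dotted option names map to nested HOCON objects, and every bucket name, queue name, region, Iglu URI and Sentry DSN in it is a hypothetical placeholder to be replaced with your own values.

```hocon
{
  # Enriched archive; expected to contain run=YYYY-MM-DD-hh-mm-ss directories
  # (bucket and prefix are placeholders)
  "input": "s3://my-bucket/enriched/archive/",

  "output": {
    # Destination of the shredded output (placeholder path)
    "path": "s3://my-bucket/shredded/",
    # "NONE" or "GZIP"; GZIP is the default, so this line could be omitted
    "compression": "GZIP",
    # Optional if resolvable through the AWS region provider chain
    "region": "eu-central-1"
  },

  "queue": {
    # Either "sqs" or "sns"
    "type": "sqs",
    # Required because type is "sqs" (placeholder name).
    # With "type": "sns", set "topicArn" instead of "queueName".
    "queueName": "my-shredder-queue",
    # Optional if resolvable through the AWS region provider chain
    "region": "eu-central-1"
  },

  "formats": {
    # TSV enables table autocreation but needs Iglu Server with known schemas
    "default": "TSV",
    # All three lists are required, but may be empty
    "tsv": [ ],
    # Hypothetical schema that is still shredded into JSON although default is TSV
    "json": [ "iglu:com.acme/legacy_event/jsonschema/1-0-0" ],
    # Hypothetical schema for which loading is skipped
    "skip": [ "iglu:com.acme/debug_event/jsonschema/1-0-0" ]
  },

  "monitoring": {
    "sentry": {
      # Optional; placeholder DSN for tracking runtime exceptions
      "dsn": "https://public@sentry.example.com/1"
    }
  }
}
```

Keeping `formats.tsv`, `formats.json` and `formats.skip` present even when empty follows the table above, which marks all three as required.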