
Transformer Kafka configuration reference

The configuration reference on this page is written for Transformer Kafka 6.1.2.

An example of the minimal required config for the Transformer Kafka can be found here and a more detailed one here.

License

Since version 6.0.0, RDB Loader is released under the Snowplow Limited Use License (FAQ).

To accept the terms of the license and run RDB Loader, set the ACCEPT_LIMITED_USE_LICENSE=yes environment variable. Alternatively, you can configure the license.accept option, like this:

```hocon
license {
  accept = true
}
```

| Parameter | Description |
|---|---|
| input.topicName | Name of the Kafka topic to read from. |
| input.bootstrapServers | A list of host:port pairs used to establish the initial connection to the Kafka cluster. |
| input.consumerConf | Optional. Kafka consumer configuration. See https://kafka.apache.org/documentation/#consumerconfigs for all properties. |
| output.path | Azure Blob Storage path for the transformer output. |
| output.compression | Optional. One of NONE or GZIP. The default is GZIP. |
| output.bad.type | Optional. Either kafka or file; the default is file. Type of the bad output sink. When file, failed events are written as files under the URI configured in output.path. |
| output.bad.topicName | Required if the output type is kafka. Name of the Kafka topic that will receive the bad data. |
| output.bad.bootstrapServers | Required if the output type is kafka. A list of host:port pairs used to establish the initial connection to the Kafka cluster. |
| output.producerConf | Optional. Kafka producer configuration. See https://kafka.apache.org/documentation/#producerconfigs for all properties. |
| queue.topicName | Name of the Kafka topic used to communicate with the Loader. |
| queue.bootstrapServers | A list of host:port pairs used to establish the initial connection to the Kafka cluster. |
| queue.producerConf | Optional. Kafka producer configuration. See https://kafka.apache.org/documentation/#producerconfigs for all properties. |
| monitoring.metrics.* | Send metrics to a StatsD server or to stdout. |
| monitoring.metrics.statsd.* | Optional. For sending metrics (good and bad event counts) to a StatsD server. |
| monitoring.metrics.statsd.hostname | Required if the monitoring.metrics.statsd section is configured. The host name of the StatsD server. |
| monitoring.metrics.statsd.port | Required if the monitoring.metrics.statsd section is configured. The port of the StatsD server. |
| monitoring.metrics.statsd.tags | Optional. Tags used to annotate the StatsD metric with contextual information. |
| monitoring.metrics.statsd.prefix | Optional. Configures the prefix of StatsD metric names. The default is snowplow.transformer. |
| monitoring.metrics.stdout.* | Optional. For sending metrics to stdout. |
| monitoring.metrics.stdout.prefix | Optional. Overrides the default metric prefix. |
| telemetry.disable | Optional. Set to true to disable telemetry. |
| telemetry.userProvidedId | Optional. See here for more information. |
| monitoring.sentry.dsn | Optional. For tracking runtime exceptions. |
| featureFlags.enableMaxRecordsPerFile | (since 5.4.0) Optional; the default is true. When enabled, the output.maxRecordsPerFile configuration parameter is used. |
| validations.* | Optional. Criteria to validate events against. |
| validations.minimumTimestamp | Currently the only validation criterion. It checks that all timestamps in the event are older than a specific point in time, e.g. 2021-11-18T11:00:00.00Z. |
| featureFlags.* | Optional. Enable features that are still in beta, or that aim to enable smoother upgrades. |
| featureFlags.legacyMessageFormat | Setting this to true allows you to use a new version of the transformer with an older version of the loader. |
| featureFlags.truncateAtomicFields | (since 5.4.0) Optional; the default is false. When enabled, the event's atomic fields are truncated (based on the length limits from the atomic JSON schema) before transformation. |
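
As a sketch of how the parameters above fit together, the following HOCON file nests every section from the table. It only illustrates the structure: the topic names, bootstrap servers, Blob Storage path, StatsD endpoint, and Sentry DSN are placeholders rather than defaults, and the optional blocks can be left out entirely.

```hocon
{
  # Equivalent to setting ACCEPT_LIMITED_USE_LICENSE=yes
  license {
    accept = true
  }

  input {
    # Kafka topic carrying enriched events (placeholder name)
    topicName = "enriched"
    bootstrapServers = "localhost:9092"
  }

  output {
    # Azure Blob Storage destination (placeholder account and container)
    path = "https://accountName.blob.core.windows.net/transformed/"
    compression = "GZIP"

    bad {
      # Route failed events to a Kafka topic instead of files under output.path
      type = "kafka"
      topicName = "bad"
      bootstrapServers = "localhost:9092"
    }
  }

  queue {
    # Topic used to tell the loader that a batch of transformed data is ready
    topicName = "loader"
    bootstrapServers = "localhost:9092"
  }

  monitoring {
    metrics {
      statsd {
        hostname = "localhost"
        port = 8125
        tags = { "app": "transformer" }
        prefix = "snowplow.transformer."
      }
    }
    sentry {
      # Placeholder DSN for runtime exception tracking
      dsn = "https://public@sentry.example.com/1"
    }
  }

  telemetry {
    disable = false
  }

  validations {
    minimumTimestamp = "2021-11-18T11:00:00.00Z"
  }

  featureFlags {
    legacyMessageFormat = false
    truncateAtomicFields = false
  }
}
```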
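
The consumerConf and producerConf options pass properties straight through to the underlying Kafka clients. Below is a minimal sketch, assuming these blocks accept string key-value pairs; the property names are standard Kafka client settings (see the Kafka documentation linked in the table) and the values are only illustrative.

```hocon
input {
  topicName = "enriched"
  bootstrapServers = "localhost:9092"
  consumerConf = {
    # Standard Kafka consumer properties, passed through as-is
    "group.id": "transformer"
    "auto.offset.reset": "latest"
    "enable.auto.commit": "false"
  }
}

queue {
  topicName = "loader"
  bootstrapServers = "localhost:9092"
  producerConf = {
    # Standard Kafka producer properties, passed through as-is
    "acks": "all"
    "compression.type": "gzip"
  }
}
```

The output.producerConf option follows the same pattern.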