Telemetry principles
Telemetry helps us better understand how our applications are used:
- Which applications, clouds and warehouses are more popular than others?
- What are the most common pipeline topologies?
- Are users successful in running our stack over long periods of time?
- And so on.
This data is important for us when deciding where to invest our efforts to build a better product for our users (including you!).
What data is collected?
In general, we track:
- Heartbeat events that tell us Snowplow applications are alive.
- Events regarding installation, startup, shutdown, etc., of our Terraform modules and applications.
- Metadata such as application version, cloud and region.
You can always disable telemetry if you prefer.
Our principles
Privacy
We do not automatically collect any personally identifiable information (PII) other than the IP address of the computer where a Snowplow application or a Terraform module is running. This IP address is subsequently pseudonymised using SHA-256 with the Snowplow PII pseudonymization enrichment.
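To illustrate what SHA-256 pseudonymisation of an IP address looks like, here is a minimal Python sketch. It is not the code of the Snowplow PII pseudonymization enrichment, whose configuration (for example, whether a salt is applied) is not described here; the function name `pseudonymise_ip` and the unsalted hashing are assumptions made purely for illustration.

```python
import hashlib
import ipaddress


def pseudonymise_ip(ip: str) -> str:
    """Return the SHA-256 hex digest of a normalised IP address.

    Hypothetical helper for illustration only; the real enrichment
    may hash differently (e.g. with a salt).
    """
    # Normalise first so equivalent spellings of an address hash identically.
    canonical = str(ipaddress.ip_address(ip))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# 203.0.113.7 is from a range reserved for documentation.
digest = pseudonymise_ip("203.0.113.7")
print(digest)
```

The digest is a fixed-length string of 64 hexadecimal characters. The same address always produces the same digest, so events can still be grouped by origin without storing the raw address.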
Minimalism
We only ever collect what is required at any given point in time. We do not pre-empt future requirements or collect anything "just in case". We also make sure telemetry does not affect application performance in any way.
Transparency
Not only is our telemetry code open source (e.g. this Terraform module), you can also inspect the schema we use for our telemetry events here.
How can I help?
It helps our product development immensely if you keep telemetry enabled. We promise to keep it anonymous and as minimal as possible!
We also appreciate it if you provide your email (or just a UUID) in the user_provided_id (or userProvidedId) setting. This allows us to tie events together across resources and offers a more complete picture of how the pipeline has been orchestrated. If you do provide an email address, we will only ever contact you with exciting Product & Engineering updates and Research studies. You can always exercise your right to be forgotten by contacting us.
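If you would rather not share an email address, a random UUID works as the identifier. A minimal sketch of generating one with Python's standard library follows. Reusing the same value across all your modules and applications is an assumption based on the goal of tying events together, and where exactly the value is set depends on the component.

```python
import uuid

# Generate one random (version 4) UUID and reuse the same value in the
# user_provided_id / userProvidedId setting of every component, so that
# events from different resources can be tied together.
user_provided_id = str(uuid.uuid4())
print(user_provided_id)
```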
Which components have telemetry?
At the moment, opt-out telemetry is present in the following:
- Terraform modules for the quick start guide.
- Collector.
- Enrich (Enrich Kinesis, Enrich PubSub, Enrich Kafka).
- RDB Loader (Transformer Kinesis, Transformer PubSub, Redshift Loader, Snowflake Loader, Databricks Loader).
- Snowplow Mini for AWS and GCP.
- Snowbridge.
- Lake Loader.
See the telemetry notice for each component linked above for more details.