Data quality
You can test and QA your pipeline in several ways to make sure it follows good data practices.
Testing new tracking implementations, schema changes, and enrichment changes
When implementing new tracking, or when changing your schemas or enrichments, we recommend testing by sending events to a sandbox environment before deploying your changes to Production.
- Find the sandbox endpoint in Snowplow Console (Snowplow BDP customers only). It is shown on the Environments screen and in the 'Testing details' dialog box on the Data Structures and Enrichments screens.
- Send a few events from your application to the sandbox endpoint.
- Visit the OpenSearch Dashboard interface for your sandbox environment to check that your events have landed in the good queue (i.e. are valid) and that the data looks as you expect (i.e. enriched appropriately, formatted and structured correctly).
- Once you are happy that your changes are valid, you can deploy them to Production along with any application code.
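As a rough illustration of the "send a few events" step, the sketch below builds a single page view in the Snowplow tracker protocol and POSTs it to a collector. The host `sandbox.example.com`, the app ID, and the `tv` label are placeholders, and the helper names are hypothetical. In practice you would normally point one of the Snowplow trackers at the sandbox endpoint instead of hand-building payloads, and your pipeline may expect more fields than this minimal set.

```python
import json
import urllib.request

# Placeholder host: copy the real sandbox endpoint from Snowplow Console.
SANDBOX_ENDPOINT = "https://sandbox.example.com"

# Collector POST path and payload wrapper schema used by the tracker protocol.
POST_PATH = "/com.snowplowanalytics.snowplow/tp2"
PAYLOAD_SCHEMA = "iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4"


def build_page_view_payload(app_id: str, page_url: str) -> dict:
    """Wrap one minimal page view event in a payload_data envelope."""
    event = {
        "e": "pv",                 # event type: page view
        "p": "web",                # platform
        "tv": "manual-test-0.1",   # tracker version label (made up for this test)
        "aid": app_id,             # application ID
        "url": page_url,           # page URL
    }
    return {"schema": PAYLOAD_SCHEMA, "data": [event]}


def send_to_sandbox(payload: dict, endpoint: str = SANDBOX_ENDPOINT) -> int:
    """POST the payload to the collector and return the HTTP status code."""
    request = urllib.request.Request(
        endpoint + POST_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


payload = build_page_view_payload("qa-app", "https://example.com/checkout")
# send_to_sandbox(payload)  # uncomment once SANDBOX_ENDPOINT points at your sandbox
```

A successful HTTP response only means the collector accepted the request. Whether the event is valid is still determined downstream, which is why the OpenSearch check in the steps above matters.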
Test tracking using automated testing
For more automated testing of your tracking, we have a tool called Snowplow Micro. It integrates with your automated test suite so you can check that your tracking remains intact as application-level changes are made.
Follow this guide to get familiar with Micro and set it up. Next, take a look at the examples of integrating Micro with Nightwatch and Cypress.
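To give a feel for what such a check might look like outside Nightwatch or Cypress, here is a minimal sketch in Python. It assumes Micro is running locally on port 9090 and uses its `/micro/all` endpoint, which reports `total`, `good`, and `bad` event counts. The function names and the expected count are hypothetical, so adapt them to your own test suite.

```python
import json
import urllib.request

# Assumption: Micro is running locally on port 9090.
MICRO_URL = "http://localhost:9090"


def fetch_counts(micro_url: str = MICRO_URL) -> dict:
    """Read the total/good/bad event counts from Micro."""
    with urllib.request.urlopen(micro_url + "/micro/all", timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def check_counts(counts: dict, expected_good: int) -> list:
    """Return a list of human-readable problems; empty means the check passed."""
    problems = []
    bad = counts.get("bad", 0)
    good = counts.get("good", 0)
    if bad > 0:
        problems.append(f"{bad} event(s) failed validation")
    if good != expected_good:
        problems.append(f"expected {expected_good} good event(s), got {good}")
    return problems


# In a test, after exercising the application:
# problems = check_counts(fetch_counts(), expected_good=2)
# assert not problems, problems
```

Keeping the comparison in a pure function (`check_counts`) separate from the HTTP call makes the logic easy to unit test without a running Micro instance.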
- Snowplow Inspector
- Snowplow Micro
- Failed events
- Data Structures CI tool: a command-line tool that integrates the Data Structures API into your CI/CD pipelines. It currently has one task, which verifies that all schema dependencies for a project are already deployed into a specified environment (e.g. "DEV", "PROD").
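Conceptually, the dependency check performed by Data Structures CI comes down to comparing the schemas a project references against the schemas deployed in the target environment. The sketch below illustrates that idea only. It is not the tool's implementation or its command-line interface, and the schema URIs and function name are made up for the example.

```python
def missing_dependencies(required: set, deployed: set) -> list:
    """Return the schema URIs a project needs that are not yet deployed."""
    return sorted(required - deployed)


# Hypothetical schemas referenced by the project's tracking code.
required = {
    "iglu:com.acme/checkout/jsonschema/1-0-0",
    "iglu:com.acme/product/jsonschema/2-0-0",
}

# Hypothetical schemas already deployed to the target environment.
deployed = {
    "iglu:com.acme/checkout/jsonschema/1-0-0",
}

missing = missing_dependencies(required, deployed)
if missing:
    # A CI step would typically fail the build at this point.
    print("Missing in target environment:", ", ".join(missing))
```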