Using the Data Structures CI tool
The Data Structures CI is a command-line tool that integrates the Data Structures API into your CI/CD pipelines. It currently has one task, which verifies that all schema dependencies for a project are already deployed into a specified environment (e.g. "DEV", "PROD").
It is available as a GitHub Action and as a universal install for other deployment pipelines, e.g. Travis CI, CircleCI, GitLab, Azure Pipelines, and Jenkins.
Authorization
To perform tasks with the tool, you need to supply both your Organization ID and an API key.
The Organization ID is a UUID that appears in the URL immediately after the .com when you visit the console.
An API key can be created here.
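Because the Organization ID is a UUID embedded in the console URL, one way to avoid copy-paste slips is to extract it with a pattern match. The following is a minimal sketch, assuming a POSIX shell and a `grep` that supports `-o`. The helper name `extract_org_id` and the URL in the usage comment are hypothetical placeholders, not part of the tool.

```shell
# Print the first UUID-shaped substring found in the given URL.
# extract_org_id is a hypothetical helper, not part of Data Structures CI.
extract_org_id() {
  printf '%s\n' "$1" \
    | grep -oiE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' \
    | head -n 1
}

# Usage (placeholder URL; paste the one from your own browser):
# export ORGANIZATION_ID="$(extract_org_id 'https://console.example.com/<organization-id>/...')"
```

If the URL contains no UUID, the function prints nothing, so an empty result is a signal to check the URL by hand.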
Create your manifest file
The tool's check task verifies that all schema dependencies for a project (declared in a "manifest") are already deployed into an environment (e.g. "DEV", "PROD").
In your application project, create a JSON manifest file that stores references to your project's schema dependencies. During a CI build, Data Structures CI parses and validates this file, then uses it to check that each schema is deployed to the appropriate environment before the application code is deployed. This guards against failed events of the 'Schema not found' type.
Here is an example manifest file where our application has dependencies on three schemas:
- checkout_process, version 1-0-7
- user, version 1-0-1
- product, version 2-0-0
{
  "schema": "iglu:com.snowplowanalytics.insights/data_structures_dependencies/jsonschema/1-0-0",
  "data": {
    "schemas": [
      {
        "vendor": "com.acme.marketing",
        "name": "checkout_process",
        "format": "jsonschema",
        "version": "1-0-7"
      },
      {
        "vendor": "com.acme",
        "name": "user",
        "format": "jsonschema",
        "version": "1-0-1"
      },
      {
        "vendor": "com.acme",
        "name": "product",
        "format": "jsonschema",
        "version": "2-0-0"
      }
    ]
  }
}
The manifest must adhere to this self-describing JSON Schema.
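Each manifest entry carries the same four parts (vendor, name, format, version) that appear in the manifest's own `schema` field. As a rough sketch, the helpers below assemble those parts into an `iglu:` URI and sanity-check a version string before you commit the manifest. Both `iglu_uri` and `is_schemaver` are hypothetical names, and the version pattern (three dash-separated numbers, the first at least 1) is an assumption inferred from the example versions, not the tool's own validation.

```shell
# Build an Iglu-style URI from the four fields of a manifest entry.
# Hypothetical helper, not part of Data Structures CI.
iglu_uri() {
  printf 'iglu:%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Return 0 when the argument looks like MODEL-REVISION-ADDITION, e.g. 1-0-7.
# Assumed pattern, inferred from the example versions in the manifest.
is_schemaver() {
  printf '%s\n' "$1" | grep -Eq '^[1-9][0-9]*-[0-9]+-[0-9]+$'
}

# Usage with the first entry of the example manifest:
if is_schemaver "1-0-7"; then
  iglu_uri "com.acme.marketing" "checkout_process" "jsonschema" "1-0-7"
fi
# prints: iglu:com.acme.marketing/checkout_process/jsonschema/1-0-7
```

A local check like this only catches typos in the manifest. Whether a schema is actually deployed is still decided by the check task itself.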
Setting up as a GitHub Action
To use the GitHub Action, add the "Run Snowplow's Data Structures CI" step from this example workflow to your existing GitHub Actions pipeline, replacing the relevant variables:
name: Example workflow using Snowplow's Data Structures CI
on: push
jobs:
  data-structures-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: Run Snowplow's Data Structures CI
        uses: snowplow-product/msc-schema-ci-action/check@v1
        with:
          organization-id: ${{ secrets.SNOWPLOW_ORG_ID }}
          api-key: ${{ secrets.SNOWPLOW_API_KEY }}
          manifest-path: 'snowplow-schemas.json'
          environment: ${{ env.ENVIRONMENT }}
View the GitHub Action repository.
Setting up for other deployment pipelines
Prerequisites
- JRE 8 or above
Download the CI tool
You can download Data Structures CI from the project's GitHub releases page, using the following command:
$ curl -L https://github.com/snowplow-product/msc-schema-ci-tool/releases/download/1.0.0/data_structures_ci_1.0.0.zip | jar xv && chmod +x ./data-structures-ci
Run the task
You can run the task using the following syntax:
$ export ORGANIZATION_ID=<organization-id>
$ export API_KEY=<api-key>
$ ./data-structures-ci check \
    --manifestPath /path/to/snowplow-schemas.json \
    --environment DEV
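In a pipeline script it can help to fail early, with a clear message, when a credential is missing rather than discovering it partway through a build. The following is a minimal sketch, assuming a POSIX shell. The helpers `require_env` and `run_check` are hypothetical wrappers around the command shown above, and only the `check` task and its two flags come from the tool itself.

```shell
# Return 1 (with a message on stderr) if any named variable is unset or empty.
# require_env and run_check are hypothetical helpers for illustration.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required environment variable: $name" >&2
      return 1
    fi
  done
  return 0
}

# Run the check only when both credentials are present.
# $1 = path to the manifest, $2 = environment name (e.g. DEV).
run_check() {
  require_env ORGANIZATION_ID API_KEY || return 1
  ./data-structures-ci check \
    --manifestPath "$1" \
    --environment "$2"
}

# Usage:
# run_check /path/to/snowplow-schemas.json DEV
```

Keeping the credential check separate from the invocation means the same guard can be reused if further tasks are added to the tool.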
View the repository for integration examples.