Add GitHub workflow support for tracking plans
We originally called tracking plans "data products". You'll still find the old term used in some existing APIs and CLI commands.
Now that we have our data structures set up, we can define tracking plans to organize and document how these structures are used across our applications. We'll walk through creating source applications, tracking plans, and event specifications using the CLI, then integrate them into our automated workflows.
Create a source application
First, we'll create a source application to represent our website that will send the login event we defined earlier.
snowplow-cli dp generate --source-app website
dp is an alias for data-products, from the previous name for tracking plans. Source applications and event specifications are also managed by this command.
This should produce the following output:
INFO generate wrote kind="source app" file=data-products/source-apps/website.yaml
The generated file is written to the default data-products/source-apps directory. To see all the arguments that generate accepts, run snowplow-cli dp generate --help.
Let's examine the generated file:
apiVersion: v1
resourceType: source-application
resourceName: b8261a25-ee81-4c6a-a94c-7717ba835035
data:
  name: website
  appIds: []
  entities:
    tracked: []
    enriched: []
- apiVersion should always be v1
- resourceType should remain source-application
- resourceName is a unique identifier of the source application. It must be a valid UUID v4
- data is the contents of the source app
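As a quick local sanity check of the field rules above, the sketch below tests a resource file with plain text tools. The function names is_uuid_v4 and check_resource_file are hypothetical helpers, not part of snowplow-cli, and the check is line-based rather than a real YAML parse, so treat it as a rough pre-flight check only.

```shell
# Hypothetical helper: succeeds when the argument looks like a UUID v4
# (version nibble 4, variant nibble 8, 9, a or b).
is_uuid_v4() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-4[0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}$'
}

# Hypothetical helper: checks the top-level apiVersion and resourceName
# lines of a resource file. Assumes one key per line at column zero.
check_resource_file() {
  file="$1"
  if ! grep -Eq '^apiVersion:[[:space:]]*v1[[:space:]]*$' "$file"; then
    echo "$file: apiVersion should be v1" >&2
    return 1
  fi
  # Only the top-level resourceName is matched; nested ones are indented.
  name=$(sed -n 's/^resourceName:[[:space:]]*//p' "$file" | head -n 1)
  if ! is_uuid_v4 "$name"; then
    echo "$file: resourceName is not a valid UUID v4" >&2
    return 1
  fi
  return 0
}
```

For example, check_resource_file data-products/source-apps/website.yaml exits with status 0 when both rules hold and prints the failing rule to stderr otherwise.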
For more information about the available fields and values, refer to the source applications schema. Making your IDE schema-aware via a language server should provide a much smoother editing experience.
Now let's customize our source application. We'll configure it to handle events from our production website as well as staging and UAT environments. We'll also add an owner field and remove the unused entities section.
apiVersion: v1
resourceType: source-application
resourceName: b8261a25-ee81-4c6a-a94c-7717ba835035
data:
  name: website
  appIds: ["website", "website-stage", "website-ua"]
  owner: me@example.com
Before syncing, we can validate our changes and preview what will happen:
snowplow-cli dp sync --dry-run
The command will show us the planned changes:
sync will create source apps file=.../data-products/source-apps/website.yaml name=website resource name=b8261a25-ee81-4c6a-a94c-7717ba835035
When we're happy with the proposed changes, we can sync by removing the --dry-run flag:
snowplow-cli dp sync
After syncing, you'll be able to see your new source application in the Snowplow Console UI.
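The preview-then-apply routine can be wrapped in a small function so the real sync only runs when the dry run succeeds. This is a sketch under two assumptions that the text above does not confirm: that sync --dry-run exits with a non-zero status when validation fails, and that the CLI variable and the preview_then_sync name (both hypothetical) suit your setup.

```shell
# CLI binary to call. SNOWPLOW_CLI is a hypothetical override variable,
# handy for pointing at a different install location.
CLI="${SNOWPLOW_CLI:-snowplow-cli}"

# Run the dry run first, and only sync when it succeeds.
# Assumes a failed dry run returns a non-zero exit status.
preview_then_sync() {
  "$CLI" dp sync --dry-run || return 1
  "$CLI" dp sync
}
```

Calling preview_then_sync from the repository root mirrors the two manual steps, minus the chance to read the planned changes before they are applied, so it fits best once you already trust the change.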
Create a tracking plan and an event specification
Let's now create a tracking plan and an event specification by running the following command:
snowplow-cli dp generate --data-product Login
This should produce the following output:
INFO generate wrote kind="data product" file=data-products/login.yaml
Let's see what it has created for us:
apiVersion: v1
resourceType: data-product
resourceName: 0edb4b95-3308-40c4-b266-eae2910d5d2a
data:
  name: Login
  sourceApplications: []
  eventSpecifications: []
For more information about the available fields and values, refer to the tracking plans schema. Making your IDE schema-aware via a language server should provide a much smoother editing experience.
Let's amend it to add an event specification, and a reference to a source application:
apiVersion: v1
resourceType: data-product
resourceName: 0edb4b95-3308-40c4-b266-eae2910d5d2a
data:
  name: Login
  owner: me@example.com
  description: Login page
  sourceApplications:
    - $ref: ./source-apps/website.yaml
  eventSpecifications:
    - resourceName: cfb3a227-0482-4ea9-8b0d-f5a569e5d103
      name: Login success
      event:
        source: iglu:com.example/login/jsonschema/1-0-1
You'll need to supply a valid UUID v4 as the resourceName of each event specification. You can generate one with an online generator, or by running the uuidgen command in your terminal.
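If you generate identifiers often, a one-line wrapper around uuidgen keeps them in the same lowercase form as the examples here. The name new_resource_name is hypothetical, and two things are assumed rather than confirmed: that your system's uuidgen emits random (version 4) UUIDs by default, which is the usual behaviour on current Linux and macOS, and that lowercase is preferred. The text above does not say whether case matters.

```shell
# Hypothetical helper: print a fresh UUID in lowercase.
# Assumes uuidgen is installed and defaults to version 4 (random) UUIDs.
new_resource_name() {
  uuidgen | tr '[:upper:]' '[:lower:]'
}
```

Run new_resource_name and paste the result into the resourceName field of the new event specification.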
The iglu:com.example/login/jsonschema/1-0-1 data structure has to be deployed to at least a development environment. Referencing local data structures is not currently supported.
We can run the same sync --dry-run command as before to check that the output is as expected. The output should contain the following lines:
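Before running a dry run, it can help to confirm that every $ref in a tracking plan file points at a file that exists. The sketch below assumes references resolve relative to the directory of the file that contains them, which matches the layout in this example but is not stated explicitly. check_refs is a hypothetical helper that scans lines as text, not a YAML parser, and it does not handle paths containing spaces.

```shell
# Hypothetical helper: report $ref entries that point at missing files.
# Assumes each reference sits on its own "- $ref: path" line and that
# paths are relative to the directory of the tracking plan file.
check_refs() {
  plan="$1"
  base=$(dirname "$plan")
  missing=0
  refs=$(sed -n 's/^[[:space:]]*-[[:space:]]*\$ref:[[:space:]]*//p' "$plan")
  for ref in $refs; do
    if [ ! -f "$base/$ref" ]; then
      echo "$plan: missing referenced file $ref" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_refs data-products/login.yaml exits with status 0 once data-products/source-apps/website.yaml is in place.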
snowplow-cli dp sync --dry-run
INFO sync will create data product file=.../data-products/login.yaml name=Login resource name=0edb4b95-3308-40c4-b266-eae2910d5d2a
INFO sync will update event specifications file=.../data-products/login.yaml name="Login success" resource name=cfb3a227-0482-4ea9-8b0d-f5a569e5d103 in data product=0edb4b95-3308-40c4-b266-eae29
We can apply the changes by running the sync command without the --dry-run flag:
snowplow-cli dp sync
Add tracking plan validation and syncing to GitHub Actions
Now that we've modeled a source application, tracking plan and event specification, let's see how we can add them to the existing GitHub Actions workflows for data structures. You can customize your setup, use a separate repository or separate actions, but in this example we'll add tracking plan syncing and releasing into the existing workflows.
Let's modify the PR example by adding the dp sync --dry-run step at the end of the steps list. This command validates the changes and prints them to the GitHub Actions log.
on:
  pull_request:
    branches: [develop, main]
jobs:
  validate:
    runs-on: ubuntu-latest
    env:
      SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }}
      SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }}
      SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      - uses: snowplow/setup-snowplow-cli@v1
      - run: snowplow-cli ds validate --gh-annotate
      - run: snowplow-cli dp sync --dry-run --gh-annotate
Tracking plans, source applications and event specifications don't have separate dev and prod environments, so it's enough to sync them once. When merging to develop, use sync to push your changes as drafts. When merging to main, use release to also mark event specifications as published.
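The branch rule above can be expressed as a small lookup, which may be useful if you prefer a single script over separate workflow files. This is only a sketch: dp_command_for_branch is a hypothetical helper, and treating every other branch as a dry run is an assumption added for illustration.

```shell
# Hypothetical helper: map a branch name to the dp subcommand to run.
# develop pushes drafts, main also publishes event specifications,
# and any other branch only previews changes (an assumed default).
dp_command_for_branch() {
  case "$1" in
    develop) echo "sync" ;;
    main) echo "release" ;;
    *) echo "sync --dry-run" ;;
  esac
}
```

In a GitHub Actions run step, the branch name is available in the GITHUB_REF_NAME environment variable, so the step could run snowplow-cli dp followed by the output of dp_command_for_branch "$GITHUB_REF_NAME".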
Add the sync command to the develop pipeline:
on:
  push:
    branches: [develop]
jobs:
  publish:
    runs-on: ubuntu-latest
    env:
      SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }}
      SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }}
      SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      - uses: snowplow/setup-snowplow-cli@v1
      - run: snowplow-cli ds publish dev --managed-from $GITHUB_REPOSITORY
      - run: snowplow-cli dp sync
Add the release command to the production pipeline to publish event specifications when merging to main:
on:
  push:
    branches: [main]
jobs:
  publish:
    runs-on: ubuntu-latest
    env:
      SNOWPLOW_CONSOLE_ORG_ID: ${{ secrets.SNOWPLOW_CONSOLE_ORG_ID }}
      SNOWPLOW_CONSOLE_API_KEY_ID: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY_ID }}
      SNOWPLOW_CONSOLE_API_KEY: ${{ secrets.SNOWPLOW_CONSOLE_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      - uses: snowplow/setup-snowplow-cli@v1
      - run: snowplow-cli ds publish prod --managed-from $GITHUB_REPOSITORY
      - run: snowplow-cli dp release