Managing data structures via the CLI
The data-structures subcommand of Snowplow CLI provides a collection of functionality to ease the integration of custom development and publishing workflows.
Snowplow CLI Prerequisites
Install
Snowplow CLI can be installed with Homebrew:
brew install snowplow-product/taps/snowplow-cli
Verify the installation with
snowplow-cli --help
For systems where Homebrew is not available, binaries for multiple platforms can be found in releases.
Example installation for linux_x86_64 using curl:
curl -L -o snowplow-cli https://github.com/snowplow-product/snowplow-cli/releases/latest/download/snowplow-cli_linux_x86_64
chmod u+x snowplow-cli
Verify the installation with
./snowplow-cli --help
Configure
You will need three values.
API Key Id and API Key Secret are generated from the credentials section in BDP Console.
Organization Id can be retrieved from the URL when visiting BDP Console: it is the path segment immediately following the .com.
Snowplow CLI can take its configuration from a variety of sources. More details are available from ./snowplow-cli data-structures --help. Variations on these three examples should serve most cases.
- env variables
SNOWPLOW_CONSOLE_API_KEY_ID=********-****-****-****-************
SNOWPLOW_CONSOLE_API_KEY=********-****-****-****-************
SNOWPLOW_CONSOLE_ORG_ID=********-****-****-****-************
- $HOME/.config/snowplow/snowplow.yml
console:
  api-key-id: ********-****-****-****-************
  api-key: ********-****-****-****-************
  org-id: ********-****-****-****-************
- inline arguments
./snowplow-cli data-structures --api-key-id ********-****-****-****-************ --api-key ********-****-****-****-************ --org-id ********-****-****-****-************
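When the environment variable approach is used in a script, it can help to confirm that all three values are present before calling the CLI. The following is a minimal sketch for a POSIX-compatible shell; check_snowplow_env is a hypothetical helper name and is not part of Snowplow CLI.

```shell
# Hypothetical helper, not part of Snowplow CLI: report which of the three
# documented environment variables are unset or empty.
check_snowplow_env() {
  missing=0
  for name in SNOWPLOW_CONSOLE_API_KEY_ID SNOWPLOW_CONSOLE_API_KEY SNOWPLOW_CONSOLE_ORG_ID; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when all three are set, 1 otherwise.
  return "$missing"
}
```

A script could then guard its first CLI call, for example: check_snowplow_env && ./snowplow-cli ds download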
Snowplow CLI defaults to yaml format. It can be changed to json by either providing a --output-format json flag or setting the output-format: json config value. This applies to all commands where the format matters, not only generate.
Available commands
Creating data structures
./snowplow-cli ds generate login_click ./folder-name
This command creates a minimal data structure template in a new file, ./folder-name/login_click.yaml. Note that you will need to add a vendor name to the template before it will pass validation. Alternatively, supply a vendor at creation time with the --vendor com.acme flag.
Downloading data structures
./snowplow-cli ds download
This command will retrieve all of the organization's data structures. By default it creates a folder named data-structures in the current working directory to hold them, with one subfolder per vendor and one file per data structure name.
Given a data structure with vendor: com.acme and name: link_click, and assuming the default format of yaml, the resulting file will be ./data-structures/com.acme/link_click.yaml.
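The vendor and name layout can be sketched as a small shell function. This is an illustration only: ds_path is a hypothetical helper, not part of Snowplow CLI, and the .json extension for the json format is an assumption.

```shell
# Hypothetical helper: build the path where a downloaded data structure
# is expected to live, following the vendor/name layout.
# Arguments: vendor, name, format (defaults to yaml).
ds_path() {
  vendor=$1
  name=$2
  format=${3:-yaml}
  printf './data-structures/%s/%s.%s\n' "$vendor" "$name" "$format"
}

ds_path com.acme link_click   # prints ./data-structures/com.acme/link_click.yaml
```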
Validating data structures
./snowplow-cli ds validate ./folder-name
This command will find all files under ./folder-name (if omitted, ./data-structures) and attempt to validate them using BDP Console. It checks the following:
- Is each file in a valid format (yaml/json) with the expected fields?
- Does the schema in the file conform to Snowplow expectations?
- Given the organization's loading configuration, will any schema version number choices have a potentially negative effect on data loading?
If any validation fails, the command reports the problems to stdout and exits with status code 1.
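Because a failed validation produces a non-zero exit status, the command can gate later steps in a script or CI job. The sketch below assumes a POSIX-compatible shell; SNOWPLOW_CLI and validate_data_structures are hypothetical names introduced for illustration and are not read or provided by Snowplow CLI.

```shell
# Assumption: SNOWPLOW_CLI is a hypothetical variable holding the path to
# the binary. Snowplow CLI itself does not read it.
SNOWPLOW_CLI=${SNOWPLOW_CLI:-./snowplow-cli}

# Run validation on a folder and pass the CLI's exit status to the caller.
validate_data_structures() {
  dir=${1:-./data-structures}
  if "$SNOWPLOW_CLI" ds validate "$dir"; then
    echo "validation passed: $dir"
  else
    status=$?
    echo "validation failed (exit status $status): $dir" >&2
    return "$status"
  fi
}
```

For example, validate_data_structures ./folder-name || exit 1 would stop a pipeline step as soon as validation reports a problem.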
Publishing data structures
./snowplow-cli ds publish dev ./folder-name
This command will find all files under ./folder-name (if omitted, ./data-structures) and attempt to publish them to BDP Console in the environment provided (dev or prod).
Publishing to dev will also cause data structures to be validated with the validate command before upload. Publishing to prod will not validate, but requires all referenced data structures to be present on dev.
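Since prod requires the data structures to exist on dev already, one possible workflow is to publish to dev first and continue to prod only if that succeeds. This is a minimal sketch under that assumption; SNOWPLOW_CLI and publish_dev_then_prod are hypothetical names, not part of Snowplow CLI.

```shell
# Assumption: SNOWPLOW_CLI is a hypothetical variable holding the path to
# the binary. Snowplow CLI itself does not read it.
SNOWPLOW_CLI=${SNOWPLOW_CLI:-./snowplow-cli}

# Publish to dev (which also validates), then to prod only on success.
publish_dev_then_prod() {
  dir=${1:-./data-structures}
  "$SNOWPLOW_CLI" ds publish dev "$dir" || return $?
  "$SNOWPLOW_CLI" ds publish prod "$dir"
}
```

Calling publish_dev_then_prod ./folder-name runs the two publish commands in order and returns the exit status of the first one that fails.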