Querying your first events
Once you’ve tracked some events, you will want to look at them in your data warehouse or database. The exact steps will depend on your choice of storage and the Snowplow offering.
Connection details
- BDP Enterprise and BDP Cloud: you can find the connection details in the Console, under the destination you’ve selected.
- Community Edition: the connection details depend on the destination you chose in the quick start guide, as described below.

Postgres
Your database will be named according to the `postgres_db_name` Terraform variable. It will contain two schemas:

- `atomic`: the validated events
- `atomic_bad`: the failed events
You can connect to the database using the credentials you provided for the loader in the Terraform variables (`postgres_db_username` and `postgres_db_password`), along with the `postgres_db_address` and `postgres_db_port` Terraform outputs.
If you need to reset your username or password, you can follow these steps.
See the AWS RDS documentation for more details on how to connect.
If you opted for the `secure` option, you will first need to create a tunnel into your VPC to connect to your RDS instance and query the data. A common solution is to configure a bastion host as described here.
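Once connected (for example, via `psql`), a quick way to check that events are arriving is to query the events table in the `atomic` schema. A minimal sketch, assuming the standard Snowplow `atomic.events` table and column names:

```sql
-- Count the events loaded so far
SELECT COUNT(*) FROM atomic.events;

-- Inspect a few of the most recent events
SELECT app_id, event_name, collector_tstamp
FROM atomic.events
ORDER BY collector_tstamp DESC
LIMIT 10;
```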
Redshift

The database name and the schema name will be defined by the `redshift_database` and `redshift_schema` variables in Terraform.
There are two different ways to log in to the database:

- The first option is to use the credentials you configured for the loader in the Terraform variables (`redshift_loader_user` and `redshift_loader_password`)
- The second option is to grant `SELECT` permissions on the schema to an existing user
To connect, you can use the Redshift UI or something like `psql`.
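If you choose to grant access to an existing user rather than use the loader credentials, the statements might look like the following sketch (here `atomic` stands in for your `redshift_schema` value, and `existing_user` is a hypothetical user name):

```sql
-- Allow an existing user to read the Snowplow events
-- (replace "atomic" with your redshift_schema value and "existing_user" with the real user)
GRANT USAGE ON SCHEMA atomic TO existing_user;
GRANT SELECT ON ALL TABLES IN SCHEMA atomic TO existing_user;
```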
BigQuery

The database will be called `<prefix>_snowplow_db`, where `<prefix>` is the prefix you picked in your Terraform variables file. It will contain an `atomic` schema with your validated events.
You can access the database via the BigQuery UI.
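Beyond browsing in the UI, you can run a quick check from the BigQuery query editor. A minimal sketch, assuming your validated events land in an `events` table (replace the qualifier with the dataset you see in the UI for the database described above):

```sql
-- Count events per event name.
-- Replace "your_dataset" with the dataset the loader created, as shown in the BigQuery UI.
SELECT event_name, COUNT(*) AS event_count
FROM `your_dataset.events`
GROUP BY event_name
ORDER BY event_count DESC;
```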
Snowflake

The database name and the schema name will be defined by the `snowflake_database` and `snowflake_schema` variables in Terraform.
There are two different ways to log in to the database:

- The first option is to use the credentials you configured for the loader in the Terraform variables (`snowflake_loader_user` and `snowflake_loader_password`)
- The second option is to grant `SELECT` permissions on the schema to an existing user
To connect, you can use either the Snowflake dashboard or SnowSQL.
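Once connected, a quick sanity check might look like this sketch, substituting the database and schema names from your Terraform variables (the `events` table name is the standard Snowplow convention):

```sql
-- Point the session at the database and schema created by the loader
-- (replace with your snowflake_database and snowflake_schema values)
USE DATABASE your_snowflake_database;
USE SCHEMA your_snowflake_schema;

-- Inspect a few of the most recent events
SELECT app_id, event_name, collector_tstamp
FROM events
ORDER BY collector_tstamp DESC
LIMIT 10;
```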
Databricks

On Azure, you have created an external table in the last step of the guide. Use this table and ignore the text below.
The database name and the schema name will be defined by the `databricks_database` and `databricks_schema` variables in Terraform.
There are two different ways to log in to the database:

- The first option is to use the credentials you configured for the loader in the Terraform variables (`databricks_loader_user` and `databricks_loader_password`, or alternatively the `databricks_auth_token`)
- The second option is to grant `SELECT` permissions on the schema to an existing user
See the Databricks tutorial for more details on how to connect. The documentation on Unity Catalog is also useful.
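After connecting (for example, from a notebook or the SQL editor), a quick check might look like this sketch, substituting your `databricks_database` and `databricks_schema` values for the hypothetical qualifiers and assuming the standard `events` table name:

```sql
-- Count events per event name.
-- Replace the qualifiers with your databricks_database and databricks_schema values.
SELECT event_name, COUNT(*) AS event_count
FROM your_databricks_database.your_databricks_schema.events
GROUP BY event_name
ORDER BY event_count DESC;
```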
Synapse Analytics

In Synapse Analytics, you can connect directly to the data residing in ADLS. You will need to know the names of the storage account (set in the `storage_account_name` Terraform variable) and the storage container (it’s a fixed value: `lake-container`).
Follow the Synapse documentation and use the `OPENROWSET` function. If you created a data source in the last step of the quick start guide, your queries will be a bit simpler.
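Without a data source, a query might look like the following sketch. It assumes the lake loader writes the data in Delta format under an `events` folder in the container; the storage account name is a placeholder for your `storage_account_name` value:

```sql
-- Read a few events directly from the lake via the serverless SQL pool.
-- Replace "yourstorageaccount" with your storage_account_name value;
-- the "events" folder name is an assumption about the lake layout.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://yourstorageaccount.dfs.core.windows.net/lake-container/events/',
    FORMAT = 'DELTA'
) AS events;
```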
If you created a OneLake shortcut in the last step of the quick start guide, you will be able to explore Snowplow data in Fabric, for example, using Spark SQL.
Writing queries
Follow our querying guide for more information.