Postgres Loader
With Snowplow Postgres Loader you can load enriched data or failed events into PostgreSQL database.
The Postgres loader is not recommended for production use, especially with large data volumes. We recommend using a fully-fledged data warehouse like Databricks, Snowflake, BigQuery or Redshift, together with a respective loader.
For more information on how events are stored in Postgres, check the mapping between Snowplow schemas and the corresponding Postgres column types.
Available on Terraform Registry
A Terraform module which deploys the Snowplow Postgres Loader on AWS EC2 for use with Kinesis. For installing in other environments, please see the other installation options below.
Getting a Docker image
Snowplow Postgres Loader is published on DockerHub:
docker pull snowplow/snowplow-postgres-loader:0.3.3
It accepts very typical configuration for Snowplow Loader:
docker run --rm \
-v $PWD/config:/snowplow/config \
snowplow/snowplow-postgres-loader:0.3.3 \
--resolver /snowplow/config/resolver.json \
--config /snowplow/config/config.hocon
Iglu
Where resolver.json
is a typical Iglu Client configuration.
Please pay attention that schemas for all self-describing JSONs flowing through Postgres Loader must be hosted on Iglu Server 0.6.0 or above.
Iglu Central is static registry and if you use Snowplow-authored schemas - you need to upload all schemas from there as well.
Configuration
The configuration file is in HOCON format, and it specifies connection details for the target database and the input stream of events.
{
"input": {
"type": "Kinesis"
"streamName": "enriched-events"
"region": "eu-central-1"
}
"output" : {
"good": {
"type": "Postgres"
"host": "localhost"
"database": "snowplow"
"username": "postgres"
"password": ${POSTGRES_PASSWORD}
"schema": "atomic"
}
}
}
The input
section can alternatively specify a GCP PubSub subscription, instead of a kinesis stream like in the example above.
"input": {
"type": "PubSub"
"projectId": "my-project"
"subscriptionId": "my-subscription"
}
See the configuration reference for a complete description of all parameters.
Other
Loader creates events
table on the start and every other table when it first encounters its corresponding schema.
You should ensure that the database and schema specified in the configuration exist before starting the loader.