Skip to main content

CDN trackers

You might want to track page views by bots and AI agents to understand how they consume your content. Regular web tracking, however, does not capture these events because many bots and agents don't execute JavaScript.

If your website is deployed through a CDN, you can track events at that level, which would include all requests.

flowchart LR agent(Agent) --> cdn(CDN) -->|serve| website(Website content) cdn -.->|track request| snowplow(Snowplow)
CDN tracking vs. web tracking

CDN-level tracking is complementary to client-side web tracking. While it surfaces traffic invisible to client-side tracking, it has its own blind spots.

For example, for single-page applications (SPAs), client-side tracking will correctly capture multiple page views. At the CDN level, the same set of page views would correspond to only a single request.

Cloudflare

You can add Snowplow tracking to Cloudflare via a Cloudflare Worker. Depending on your monthly number of requests, you will likely require a Workers Paid plan.

Follow Cloudflare documentation to:

If you are already using Workers, you will need to add the code to your existing worker script.

Customization

See the comments in the script for various parts you need to customize.

javascript
// Add your Snowplow collector domain name here.
// IMPORTANT: Do not use domains registered via Cloudflare,
// as that can lead to DNS resolution issues inside a worker script.
// CDI customers can opt for the <name>.collector.snplow.net domain instead.
const collectorUrl =
"https://<collector-domain>/com.snowplowanalytics.snowplow/tp2";

// Headers to forward to Snowplow.
const forwardHeaders = ["referer", "signature-agent", "signature-input", "signature"];

// App id to include with the events.
const appId = "my-app"

// A helper function to decide which requests to track.
// If you track everything, you will receive Snowplow events for requests to static assets, including images, fonts, etc. This is probably too much.
// The example below filters it to just page requests and /llms.txt.
function shouldTrackRequest(pathname) {
const dotIndex = pathname.lastIndexOf('.');
if (dotIndex === -1 || dotIndex < pathname.lastIndexOf('/')) return true;
return pathname.endsWith('llms.txt');
}

async function trackRequest(request) {
const headers = {
"content-type": "application/json",
// This prevents the tracking of the user or agent IP address.
// We recommend anonymous tracking for two reasons:
// 1) There is no mechanism for the user to opt out,
// since the IP address will be collected upon the very first visit.
// 2) With a focus on bot/agent traffic, IP address is not very relevant.
"sp-anonymous": "*",
};
for (const name of forwardHeaders) {
const value = request.headers.get(name);
if (value) headers[name] = value;
}

const payload = {
schema: "iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4",
data: [
{
e: "pv",
ua: request.headers.get("user-agent"),
aid: appId,
p: "srv",
tv: "cf-worker-1.0.0",
url: request.url,
},
],
};

await fetch(collectorUrl, {
method: "POST",
headers,
body: JSON.stringify(payload),
});
}

export default {
async fetch(request, env, ctx) {
const url = new URL(request.url);
const response = await fetch(request);

if (shouldTrackRequest(url.pathname)) {
ctx.waitUntil(trackRequest(request));
}

return response;
},
};
No impact on response time

The ctx.waitUntil call ensures that the request is tracked. It does not add extra latency or block the response to the client.

You will receive the following fields in the events (as well as all relevant fields added by enrichments, e.g., page_urlpath, page_urlquery):

FieldValueDescription
eventpage_viewEvent type: page view
platformsrvPlatform: server-side, to distinguish from browser events
app_idyour app idApplication identifier, so you can filter these events in your data
v_trackercf-worker-1.0.0Tracker version
useragentfrom the original requestThe visitor's User-Agent header
page_urlfrom the original requestThe URL of the requested page

On this page

Want to see a custom demo?

Our technical experts are here to help.