dbt Snapshots Explained: Track Historical Changes

📘 Introduction

In many data warehouses, source tables only show the current state of a record. A customer might move from trial to paid, a subscription plan might change from basic to pro, or an account might become inactive. If the source system overwrites the old value, your analytics team can lose the history.

In this beginner-friendly tutorial, you will learn what dbt snapshots are, why they matter, and how to use them to track historical changes in your data warehouse. We will use a simple customer status example and create a snapshot that keeps previous versions of each customer record.

📌

This topic is also useful for the dbt Analytics Engineering Certification, because snapshots help you understand slowly changing dimensions, historical tracking, source freshness thinking, and how dbt manages changes in mutable source data.

💡 Why dbt snapshots matter

A dbt snapshot records how rows change over time. This is useful when a source table is mutable, which means old values can be overwritten.

For example, imagine a customers table that looks like this today:

customer_id	customer_name	status	plan	updated_at
1	Ada GmbH	trial	starter	2026-05-01 09:00:00
2	Byte AG	paid	pro	2026-05-01 10:30:00

One week later, customer 1 becomes a paid customer. The source table now shows only the new value:

customer_id	customer_name	status	plan	updated_at
1	Ada GmbH	paid	starter	2026-05-08 12:00:00
2	Byte AG	paid	pro	2026-05-01 10:30:00

Without a snapshot, it is hard to answer questions like: When did the customer convert from trial to paid? How long was the customer in trial? Which plan did the customer have at a specific point in time?

dbt snapshots solve this by storing multiple versions of the same business record.

💡

Think of a snapshot as historical memory for a table that normally only shows the latest state.

🧭 What are we implementing?

We will create a dbt snapshot for a customer source table. The snapshot will track changes to customer status and plan over time.

The workflow will look like this:

source customers table
        ↓
dbt snapshot
        ↓
snapshot table with history
        ↓
analytics query for current and historical states

We will use the timestamp strategy, because our example source table has a reliable updated_at column. This is the recommended strategy when the source system updates that timestamp whenever a row changes.

✅ Prerequisites

Before you begin, make sure you have:

A dbt project set up
dbt installed locally or available in dbt Cloud
A data warehouse connection configured
Basic knowledge of dbt sources and models
A source table or staging model with a unique key and an updated_at column

You can follow the structure even if your warehouse is Snowflake, BigQuery, Databricks, Postgres, or another dbt-supported platform. The exact SQL data loading step may differ by platform, but the dbt snapshot concept stays the same.

If you’d like to dive deeper into dbt (data build tool), our book Building Modern Data Pipelines with dbt: From Raw Data to Gold Standard with the Medallion Architecture provides a hands-on guide to designing modern data pipelines. It covers dbt’s core concepts and best practices, including building Bronze, Silver, and Gold layers with the Medallion Architecture. It also serves as a hands-on study guide for the dbt Analytics Engineering Certification.

View on Amazon

📁1️⃣ Create the snapshots folder

By default, dbt expects snapshot files in a folder called snapshots/ at the root of your dbt project.

Your project can look like this:

my_dbt_project/
├── dbt_project.yml
├── models/
│   └── staging/
│       └── stg_customers.sql
└── snapshots/
    └── customers_snapshot.sql

If your project does not have the folder yet, create it:

mkdir snapshots

🧾2️⃣ Prepare the source data

For this tutorial, imagine you have a source or staging model with customer records.

At the first snapshot run, the data looks like this:

customer_id	customer_name	status	plan	updated_at
1	Ada GmbH	trial	starter	2026-05-01 09:00:00
2	Byte AG	paid	pro	2026-05-01 10:30:00
3	Cloud Factory	trial	starter	2026-05-02 08:15:00

Later, customer 1 changes from trial to paid, and customer 3 upgrades from starter to pro. A snapshot will preserve the old versions instead of losing them.

This is the useful part: even before writing the snapshot file, you should know which column identifies each row and which column tells dbt that a row changed. In this example, customer_id is the unique key and updated_at is the change timestamp.

📝3️⃣ Define the snapshot

Create the file snapshots/customers_snapshot.sql:

{% snapshot customers_snapshot %}

{{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='timestamp',
      updated_at='updated_at'
    )
}}

select
    customer_id,
    customer_name,
    status,
    plan,
    updated_at
from {{ ref('stg_customers') }}

{% endsnapshot %}

This tells dbt to track the records returned by the select statement. When a customer row changes and the updated_at value becomes newer, dbt closes the previous version and inserts a new version into the snapshot table.

🎓

Academy tip: before you build a snapshot, always identify the unique key and the change detection column. If either one is wrong, your history can become misleading.

You can view this post with the tier: Academy Membership

Join academy now to read the post and get access to the full library of premium posts for academy members only.

Join Academy Already have an account? Sign In

dbt Snapshots Explained: Track Historical Changes in Your Data Warehouse

Data Engineer

GitHub Copilot vs Codex vs Claude Code: What Is the Difference?

How to Use GitHub Copilot in Visual Studio Code

What Is GitHub Copilot? Explained for Beginners

📘 Introduction

💡 Why dbt snapshots matter

🧭 What are we implementing?

✅ Prerequisites

📁1️⃣ Create the snapshots folder

🧾2️⃣ Prepare the source data

📝3️⃣ Define the snapshot

You can view this post with the tier: Academy Membership

Configuring dbt Model Access using Grants: A Practical Example

Understanding dbt State: How state:modified Works with a Practical Example

dbt clone Explained: How to Use dbt clone with a Practical Example

How to Use dbt Snapshots with a Practical Example

dbt Defer Explained: How to Use --defer with a Practical Example