Introduction

Keeping the data in your warehouse fresh is crucial. In dbt (data build tool), source freshness checks help ensure that the data coming from your sources is up to date and trustworthy, so teams can confidently build reliable models downstream. In this hands-on tutorial, we’ll show you step by step how to define source freshness in your sources.yml file.

📌 This is a must-know topic for the dbt Analytics Engineering Certification Exam, so mastering it now puts you one step closer to passing the exam and leveling up your data engineering skills! 👨‍🎓

✅ Prerequisites

Before you start, make sure you have:

☑️ A dbt project set up
☑️ Source data loaded into your data warehouse
☑️ Source configurations defined in sources.yml

💡 What is Source Freshness?

Source freshness in dbt tracks how up-to-date your raw source data is. It measures freshness based on a timestamp column (such as created_at, ingested_at, or modified_at) in your source tables.

When enabled, dbt queries the most recent timestamp in this column and compares it to the current time, checking how "old" the latest record is. You define acceptable limits using the warn_after and error_after thresholds.

Why it matters:

  • Early detection of upstream delays
    Quickly identify issues like broken ETL jobs or paused ingestion in the source systems.
  • Trustworthy data
    If your source data is stale, your dbt models and dashboards will be too. Freshness checks flag stale sources, so outdated data doesn't reach downstream consumers unnoticed.

✍️1️⃣ Define Your Freshness Requirements

In our example, the student table from the source udc 🟧 contains a modified_at column.

We want to enable freshness checks on the student table in the udc 🟧 source, which is expected to refresh daily. The following rules should apply:

  • ⚠️ Raise a warning if data is older than 1 day
  • ❌ Raise an error if data is older than 2 days

⚙️2️⃣ Define Source Freshness in sources.yml

Enable freshness checks by defining a freshness block inside your source configuration in sources.yml, typically alongside a loaded_at_field that names the timestamp column dbt should inspect.
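
A minimal sketch of what such a configuration could look like for the requirements above. It assumes the udc source and its student table are already declared in sources.yml. Database and schema settings are omitted, and depending on your dbt version these keys may need to be nested under a config block, so check the documentation for the release you use.

```yaml
# sources.yml (illustrative sketch, not the original post's exact file)
version: 2

sources:
  - name: udc
    # database / schema settings omitted; they depend on your warehouse
    tables:
      - name: student
        # Timestamp column dbt reads to find the most recent record
        loaded_at_field: modified_at
        freshness:
          # Warn if the newest modified_at value is more than 1 day old
          warn_after: {count: 1, period: day}
          # Error if the newest modified_at value is more than 2 days old
          error_after: {count: 2, period: day}
```

With a block like this in place, running the dbt source freshness command compares the latest modified_at value against the two thresholds and reports a pass, warning, or error for the table.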
