📘 Introduction

In this hands-on dbt tutorial, we'll walk through how to add and run generic data tests in dbt using YAML configuration. These tests help you check data quality and integrity without writing any custom SQL.

📌 This is a must-know topic for the dbt Analytics Engineering Certification Exam, so mastering it now puts you one step closer to passing the exam and leveling up your data engineering skills! 👨‍🎓

✅ Prerequisites

Before you start, make sure you have:

☑️ A dbt project set up
☑️ At least one model defined in your models/ directory

💡 What are generic data tests?

Generic tests are reusable, parameterized tests that you attach to models and columns in YAML to catch common data issues. dbt ships with four of them built in:

  • not_null: Ensures a column has no null values
  • unique: Ensures values in a column are unique
  • accepted_values: Restricts a column to only specific allowed values
  • relationships: Validates referential integrity between models (like foreign keys)
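To make the four tests above concrete, here is a minimal sketch of how they might be declared in a properties file. The model name `enrollments`, the parent model `courses`, and every column name and accepted value are hypothetical, chosen only for illustration. The sketch assumes dbt 1.8 or later, where the key is `data_tests:`; earlier versions spell it `tests:`. Newer releases may also expect test parameters nested under an `arguments:` key, so check the documentation for your version.

```yaml
# Hypothetical properties file: all model and column names are illustrative only.
version: 2

models:
  - name: enrollments                # hypothetical model
    columns:
      - name: enrollment_id          # hypothetical primary key
        data_tests:                  # spelled `tests:` before dbt 1.8
          - not_null                 # fails if any row has a null here
          - unique                   # fails if any value appears more than once
      - name: status
        data_tests:
          - accepted_values:
              values: ['active', 'completed', 'dropped']   # hypothetical allowed values
      - name: course_code
        data_tests:
          - relationships:
              to: ref('courses')     # hypothetical parent model
              field: code            # column in the parent that must contain each value
```

Note that `not_null` and `unique` take no parameters, while `accepted_values` and `relationships` need to be told what to compare against.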

🧪1️⃣ Existing dbt model

We’ve already created a model called cleaned_course_mapping, and it has a corresponding .yml file for documentation and testing.

After running the model, the table in the warehouse looks like this:

As you can see, the code column contains duplicates. This will be important later when we test for uniqueness.

📝2️⃣ Add data tests to your schema.yml file

Now, let’s define generic tests in cleaned_course_mapping.yml.
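As a rough sketch of what this could look like, the fragment below attaches `not_null` and `unique` to the `code` column. The model and column names come from the walkthrough above; the choice of tests and the `data_tests:` key (dbt 1.8 or later, `tests:` in older versions) are assumptions, not necessarily the exact configuration used here.

```yaml
# cleaned_course_mapping.yml: a minimal sketch, not a definitive configuration.
version: 2

models:
  - name: cleaned_course_mapping
    columns:
      - name: code
        data_tests:        # assumed key; use `tests:` on dbt versions before 1.8
          - not_null       # every row should carry a code
          - unique         # expected to fail while the duplicates remain
```

Running `dbt test --select cleaned_course_mapping` should execute only the tests attached to this model. Because the code column currently contains duplicates, the `unique` test would be expected to fail.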
