📘Introduction

In this hands-on dbt tutorial, we'll walk you through how to describe and document your data models using schema.yml in dbt.

📌 This is a must-know topic for the dbt Analytics Engineering Certification Exam, so mastering it now puts you one step closer to passing the exam and leveling up your data engineering skills! 👨‍🎓

✅ Prerequisites

Before you start, make sure you have:

☑️ A dbt project set up
☑️ At least one model defined in your models/ directory

💡What is schema.yml?

The schema.yml file in dbt is used to document your models, columns, and sources in a structured way. In addition to descriptions, it supports tests, data contracts, grants, and custom metadata, enabling teams to enforce data quality, access control, and governance policies. This makes your dbt project more reliable, secure, and collaborative. The name schema.yml is only a convention: dbt reads these properties from any .yml file in your models/ directory.

🧪1️⃣ Create a dbt model

We've already created a dbt model called cleaned_student.

The model contains the following SQL code:

-- Rename the raw columns and tag each row with its origin
SELECT
    ID AS id,
    Name AS name,
    Major AS major,
    Number AS number,
    'udc' AS source_name  -- no trailing comma before FROM
FROM {{ source("udc", "student") }}
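
The source("udc", "student") call only resolves if a source named udc with a table named student is declared in a .yml file under models/. A minimal sketch of such a declaration might look like this; the file name sources.yml and the description text are assumptions, and if the raw table lives in a schema that is not called udc, a schema: key would have to be added.

```yaml
# models/sources.yml (hypothetical file name; any .yml file under models/ works)
version: 2

sources:
  - name: udc              # first argument of source("udc", "student")
    description: "Placeholder description for the raw udc source"
    tables:
      - name: student      # second argument of source("udc", "student")
```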

📝2️⃣ Create schema.yml

Now let’s document this model. To do this, create a file named schema.yml in your models/ folder.
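
As a rough sketch of what that file could contain for cleaned_student: the column names below come from the model's SELECT list, while every description is an illustrative assumption to be replaced with wording that matches your data.

```yaml
# models/schema.yml
version: 2

models:
  - name: cleaned_student          # must match the model file name (cleaned_student.sql)
    description: "Student records read from the udc student source."
    columns:
      - name: id
        description: "Student identifier (assumed meaning)."
      - name: name
        description: "Student name (assumed meaning)."
      - name: major
        description: "Student's major (assumed meaning)."
      - name: number
        description: "Value of the raw Number column (meaning not stated in the source data)."
      - name: source_name
        description: "Constant 'udc', marking the source the row came from."
```

After saving the file, running dbt docs generate followed by dbt docs serve should show these descriptions on the model's page in the generated documentation site.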
