Introduction

Once you've defined sources in dbt, you can reference them using the source() function. This is essential for dbt to identify dependencies between sources and models. In this hands-on tutorial, we’ll show you how to use the source() function with a sample project.

📌 This is a must-know topic for the dbt Analytics Engineering Certification Exam, so mastering it now puts you one step closer to passing the exam and leveling up your data engineering skills! 👨‍🎓

✅ Prerequisites

Before you start, make sure you have:

☑️ A dbt project set up
☑️ Source data loaded into your data warehouse
☑️ Source configurations defined in sources.yml

💡 What is the source() function?

The source() function in dbt references raw data sources defined in your sources.yml file. It is written in Jinja, the templating engine that powers all dbt model files, and dbt compiles it into a fully qualified table name in your warehouse-specific SQL syntax.

{{ source("source_name", "table_name") }}

💡 Jinja lets you embed dynamic, Python-like logic inside SQL files, which is what makes functions like source() possible.

This source() function enables dbt to:

  • Build dependencies between sources and downstream models, ensuring correct execution and testing order
  • Visualize your project’s DAG (Directed Acyclic Graph) and data lineage, improving understanding and documentation

📦 How compilation works

When you run dbt compile or dbt run, dbt processes your Jinja-based model files and replaces the source() function with the actual table path. It pulls the database, schema, and identifier (if provided; otherwise, it uses the table’s name by default) from your sources.yml configuration and constructs the fully qualified table name that matches your warehouse's SQL dialect.

🔍1️⃣ Check sources defined in sources.yml

In our example, the sources.yml file has the following content:

version: 2

sources:
  - name: udc
    database: dev_dlnerds_university
    schema: landing 
    tables:
      - name: student
      - name: tutor
      - name: attendance

  - name: sis
    database: dev_dlnerds_university 
    schema: landing
    tables:
      - name: course
      - name: course_mapping
        identifier: course_name

So there are two sources:

🟧 UDC
🟪 SIS
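
One detail in this configuration is worth a quick illustration: course_mapping in the sis source sets an identifier, so the name you pass to source() differs from the physical table name. As a rough sketch (the exact quoting and casing of the compiled name depend on your warehouse adapter, so treat the compiled line as an assumption rather than literal output):

```sql
-- Model code: reference the table by its dbt name, course_mapping
select *
from {{ source("sis", "course_mapping") }}

-- Approximately what dbt compiles this to, since identifier is course_name:
-- select *
-- from dev_dlnerds_university.landing.course_name
```

Tables without an identifier, such as course, would resolve to their own name instead (here, dev_dlnerds_university.landing.course).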

🛠️2️⃣ Use the source() function

🟧 Reference source UDC

To reference the student table from the udc source, use:

{{ source("udc", "student") }}

💡 The name of the table is student. When you pass student to source(), dbt resolves it to the corresponding table in the warehouse.

To select data from this table, you can use:
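
A minimal sketch of such a query, assuming it is saved as a model file (the name stg_student.sql is hypothetical and only used for illustration):

```sql
-- models/stg_student.sql (hypothetical file name)
-- Select all columns from the student table of the udc source
select *
from {{ source("udc", "student") }}
```

Given the sources.yml above, dbt should compile the source() call to something like dev_dlnerds_university.landing.student, with quoting that depends on your adapter. Running dbt compile lets you inspect the rendered SQL before executing the model.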
