Set up Medallion Architecture with dbt: From Raw Data to Gold Standard

Introduction

In the age of data-driven decision making, a robust data architecture is crucial. The Medallion Architecture is a proven pattern for organizing data into layers of increasing refinement. Combined with dbt (data build tool), it gives you a scalable way to manage your data transformation pipelines.

In this post, we'll walk you through how to set up a Medallion Architecture using dbt.

✅ Prerequisites

Before you start, make sure you have:

☑️ A dbt project set up

🏅 What is the Medallion Architecture?

The Medallion Architecture is a layered approach to organizing data in a lakehouse or warehouse environment. It divides the transformation process into logical stages that promote modularity, maintainability, and scalability.

This approach uses the following layers:

🛬 Landing (Raw Drop Zone) (Optional but Common)
Landing is the initial zone where raw data first arrives, stored exactly as received—without any validation or transformation. It serves as the foundation for all downstream processing.
🥉 Bronze (Standardized)
The Bronze layer contains the raw data in a standardized format. Only minimal transformations are applied, such as adding metadata like load timestamps and source file names, so the core content remains close to its original state.
🥈 Silver (Cleaned)
The Silver layer holds data that has been filtered and cleaned. This stage addresses fundamental data quality issues, such as duplicates, nulls, and structural inconsistencies. The result is a more refined and trusted dataset.
🥇 Gold (Transformed / Business-Ready)
The Gold layer contains fully transformed and aggregated data that meets specific business requirements. At this stage, data quality is at its highest and the datasets are structured for usability, typically modeled in formats like star schemas to support efficient analytics and reporting. These models include calculated metrics, KPIs, and curated dimensional views that represent the “gold standard” for decision-making and downstream consumption.

🗂️1️⃣ Structure your dbt project

In your models/ directory, create the following folders:

01_landing
02_bronze
03_silver
04_gold

The structure of your dbt project then looks as follows:
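
A minimal sketch of the resulting layout, assuming a project folder named `my_dbt_project` (a hypothetical name) and the default `models/` path. Other standard dbt folders are left out for brevity:

```text
my_dbt_project/          <- hypothetical project folder name
├── dbt_project.yml
└── models/
    ├── 01_landing/
    ├── 02_bronze/
    ├── 03_silver/
    └── 04_gold/
```

The numeric prefixes keep the folders sorted in processing order in file explorers and editors.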

⚙️2️⃣ Set up configuration in `dbt_project.yml`

Update your dbt_project.yml to:

Map the directories to schemas
Assign materialization strategies (like view, table)
Add tags for easy selection
Optionally assign colors for visual clarity in documentation

Here’s what it should look like:
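
A minimal sketch of such a configuration, covering the four points above. The project name `my_dbt_project`, the profile name `my_profile`, the schema names, the choice of `view` versus `table` per layer, and the hex colors are all illustrative assumptions, so adapt them to your own project:

```yaml
# dbt_project.yml (sketch; names and values below are assumptions)
name: my_dbt_project          # hypothetical project name
version: "1.0.0"
config-version: 2
profile: my_profile           # hypothetical profile from profiles.yml

model-paths: ["models"]

models:
  my_dbt_project:             # must match the project name above
    01_landing:
      +schema: landing        # custom schema for this folder
      +materialized: view     # assumed: cheap pass-through over raw data
      +tags: ["landing"]
      +docs:
        node_color: "#8d6e63" # color of the nodes in the dbt docs graph
    02_bronze:
      +schema: bronze
      +materialized: view     # assumed: light standardization only
      +tags: ["bronze"]
      +docs:
        node_color: "#cd7f32"
    03_silver:
      +schema: silver
      +materialized: table    # assumed: persist cleaned data
      +tags: ["silver"]
      +docs:
        node_color: "#c0c0c0"
    04_gold:
      +schema: gold
      +materialized: table    # assumed: persist business-ready models
      +tags: ["gold"]
      +docs:
        node_color: "#ffd700"
```

With tags in place, a single layer can be selected on the command line, for example `dbt run --select tag:silver`. Note that, by default, dbt appends the custom schema to the target schema from your profile (for example `analytics_silver` for a target schema of `analytics`) unless you override the `generate_schema_name` macro. The `node_color` setting requires dbt 1.3 or later.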
