Introduction

Seeds in dbt are a simple and powerful way to load static data - such as reference tables, lookup data, or mapping tables - directly into your data warehouse. In this tutorial, you'll learn what seeds are and how to use them effectively.

Since this topic is relevant for the dbt Analytics Engineering Certification Exam, this guide will be a valuable resource on your way to passing the exam. 👨‍🎓

✅ Prerequisites

Before you begin, ensure you have set up a dbt project. The steps below assume it contains a dbt_project.yml file and a seeds/ directory.

If you haven’t created a dbt project yet, check out this step-by-step guide:

Set up a new dbt Project from Scratch: A Beginner’s Guide

💡What are Seeds?

In dbt, seeds are CSV files stored in your project’s seeds/ directory. When you run dbt seed, dbt reads these files and loads them into your warehouse as database tables.

💡When to use Seeds?

Seeds are great for managing version-controlled static data, for example:
- Mapping tables (e.g., country codes, department codes)
- Small lookup tables
- Test datasets

🛠️1️⃣ Specify schema for seeds in dbt_project.yml 

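To control where your seeds are loaded, set a schema in your dbt_project.yml. In this example, all seeds should be loaded into the schema landing, so add +schema: landing under your project name in the seeds: section. Note that with dbt's default schema naming, a custom schema is appended to the target schema from your profile (for example, a target schema of dev yields dev_landing) unless the project overrides the generate_schema_name macro.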

name: 'dlnerds_university'
version: '1.0.0'

profile: 'dlnerds_university'

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets:
  - "target"
  - "dbt_packages"


seeds:
  dlnerds_university:
    +schema: landing
    +docs:
      node_color: 'blue'

models:
  dlnerds_university:

📁2️⃣ Save CSV file in the seeds/ directory

Let’s take a look at an example CSV file named course_name.csv, which contains the following data:

Code,Name
DBA,Databases
DBA,Databases
PRG,Programing
PRG,Programing
STA,Statistics
STA,Statistics
DEN,Data Exploration
DCG,Data Cleansing
MLG,Machine Learning
DMN,Data Modeling
DMN,Data Modeling
DMN,Data Modeling
ETL,Extract Transform Load
DLG,Deep Learning
DLG,Deep Learning
DVN,Data Visualization
DVN,Data Visualization
DVN,Data Visualization
API,Application Programming Interface
CVN,Computer Vision
CVN,Computer Vision

Save the CSV file in the seeds/ directory.
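
Before loading, it can help to sanity-check the saved file. The following is a minimal sketch, not part of dbt: check_seed_file is a hypothetical helper name, and the script assumes a POSIX shell run from the project root with the file saved as seeds/course_name.csv.

```shell
#!/bin/sh
# Hypothetical helper (not a dbt command): print basic facts about a seed CSV.
check_seed_file() {
  seed_file="$1"
  if [ ! -f "$seed_file" ]; then
    echo "missing: $seed_file"
    return 1
  fi
  # First line of the file is the header row.
  echo "header: $(head -n 1 "$seed_file")"
  # Count non-empty data rows (everything after the header).
  echo "rows: $(tail -n +2 "$seed_file" | grep -c .)"
  # Count distinct data rows that occur more than once.
  echo "duplicated: $(tail -n +2 "$seed_file" | sort | uniq -d | grep -c .)"
}

# Assumed location, taken from the tutorial: seeds/course_name.csv
check_seed_file seeds/course_name.csv || true
```

For the listing above, this should report 21 data rows and 7 distinct rows that occur more than once, assuming the saved file matches the listing exactly.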

🚀3️⃣ Load Data into warehouse

Run the dbt seed command from your project directory to load the seed into your warehouse.
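
As a rough sketch of this step, the snippet below loads only the example seed. It assumes dbt is installed in the active environment, that you run it from the project root, and that the seed is named course_name after its CSV file. Running plain dbt seed instead would load every CSV in the seeds/ directory.

```shell
#!/bin/sh
# Seed name is the CSV file name without the .csv extension.
seed_name="course_name"

if command -v dbt >/dev/null 2>&1; then
  # --select limits the run to the named seed.
  dbt seed --select "$seed_name"
else
  # Assumption: dbt lives in a virtual environment that is not active yet.
  echo "dbt not found on PATH; activate the environment where dbt is installed"
fi
```

The guard around the command is only there so the script fails with a readable message when dbt is not available. In an activated environment, the single dbt seed line is all that is needed.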
