📘 Introduction

CSV files are everywhere in data work. You might receive exports from business tools, download small datasets, or prepare local files before moving data into a warehouse.

In this tutorial, you will learn how to load CSV data into DuckDB using dlt. DuckDB gives us a fast local analytics database, and dlt helps us turn a simple file load into a repeatable data pipeline.

We will create a small CSV file, build a Python pipeline, load the file into DuckDB, and check the result with SQL.

💡 What are we implementing?

We will build a local pipeline with this flow:

CSV file -> dlt -> DuckDB table -> SQL check

The goal is not just to import a file once, but to understand the basic pattern behind many data engineering workflows:

  • define where the data comes from
  • define where the data should go
  • run the pipeline
  • inspect the loaded table

For a beginner project, this is a great combination because everything runs locally. You do not need a cloud account, database server, or API key.

✅ Prerequisites

Before we start, you should have:

☑️ Python 3.9 or newer installed
☑️ Basic knowledge of running terminal commands
☑️ A text editor such as VS Code
☑️ No API key or cloud account needed

⚙️1️⃣ Create a project folder

First, create a new folder for the project:

mkdir dlt-csv-duckdb
cd dlt-csv-duckdb

Create and activate a virtual environment (macOS or Linux):

python -m venv .venv
source .venv/bin/activate

On Windows, activate it with:

.venv\Scripts\activate

📦2️⃣ Install the package

Now install dlt with DuckDB support:

pip install "dlt[duckdb]"

This installs dlt together with the DuckDB Python package and the dependencies needed for the DuckDB destination. You do not need to install DuckDB separately for this tutorial.
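If you want to confirm that both packages landed in the active virtual environment, a small check like the one below can help. It is only a sketch that uses Python's standard importlib.metadata module; the function name installed_versions is our own.

```python
from importlib import metadata


def installed_versions(packages=("dlt", "duckdb")):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If either package shows as not installed, the most likely cause is that the virtual environment was not active when you ran pip.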

📄3️⃣ Create a small CSV file

Create a folder named data:

mkdir data

Inside that folder, create a file named customers.csv:

customer_id,name,country,created_at
1,Ana Silva,Portugal,2026-01-10
2,John Miller,United States,2026-01-12
3,Mina Tanaka,Japan,2026-01-14
4,Sofia Garcia,Spain,2026-01-15

Your project should now look like this:

dlt-csv-duckdb/
├── .venv/
└── data/
    └── customers.csv
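Before loading the file anywhere, it can be worth checking that it parses as expected. The sketch below uses only Python's built-in csv module; the function name read_customers and the strict header check are our own assumptions for this tutorial's file, not a dlt requirement.

```python
import csv
from pathlib import Path

EXPECTED_COLUMNS = ["customer_id", "name", "country", "created_at"]


def read_customers(csv_path):
    """Read the CSV into a list of dicts, failing early on a wrong header."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != EXPECTED_COLUMNS:
            raise ValueError(f"Unexpected header: {reader.fieldnames}")
        return list(reader)


if __name__ == "__main__":
    path = Path("data/customers.csv")
    if path.exists():
        rows = read_customers(path)
        print(f"{len(rows)} rows read from {path}")
    else:
        print(f"{path} not found; run this from the project folder")
```

A stray space in the header line or a missing column name is a common reason for a surprising table later, so a check like this catches the problem while it is still easy to fix.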

We now have a clean project setup and a realistic CSV file. In the Academy section, we continue by building the dlt pipeline, loading the file into DuckDB, checking the table with SQL, and fixing common beginner mistakes.
