📘 Introduction

If you enjoy working with pandas but wish you could use clean, powerful SQL at any time, then DuckDB is the right tool for you. With DuckDB, you can query your DataFrames instantly without having to set up a database, run a server, or change your workflow.

✅ Prerequisites

Before you begin, make sure you have:

🐍☑️ Installed Python
🌐☑️ Created and activated a virtual environment (venv)
📦☑️ Installed DuckDB and pandas via pip (pip install duckdb pandas)

📥1️⃣ Import Libraries

Start by importing the required Python modules:

import duckdb
import pandas as pd

🐼2️⃣ Create Sample Pandas DataFrame

Let’s create a sample pandas DataFrame with ten students. Each row holds an ID, a name, a date of birth (in DD.MM.YY format), and a major.

data = [
    ("s1", "Lara Fitzgerald", "05.08.02", "Data Analytics"),
    ("s2", "Mike Meyer", "20.05.03", "Data Engineering"),
    ("s3", "Eliza Gomez", "01.01.02", "Data Science"),
    ("s4", "Travis Robinson", "19.12.01", "Data Analytics"),
    ("s5", "Jackie Brown", "23.05.03", "Data Engineering"),
    ("s6", "Caleb Pearson", "02.03.00", "Data Science"),
    ("s7", "Ava Johansson", "26.07.00", "Data Science"),
    ("s8", "Nathan Williams", "20.11.02", "Data Engineering"),
    ("s9", "Cooper Harris", "08.06.01", "Data Analytics"),
    ("s10", "Murphy Fraser", "01.12.03", "Data Science")
]

df = pd.DataFrame(
    data,
    columns=["id", "name", "date_of_birth", "major"]
)

print(df)

Output:

    id             name date_of_birth             major
0   s1  Lara Fitzgerald      05.08.02    Data Analytics
1   s2       Mike Meyer      20.05.03  Data Engineering
2   s3      Eliza Gomez      01.01.02      Data Science
3   s4  Travis Robinson      19.12.01    Data Analytics
4   s5     Jackie Brown      23.05.03  Data Engineering
5   s6    Caleb Pearson      02.03.00      Data Science
6   s7    Ava Johansson      26.07.00      Data Science
7   s8  Nathan Williams      20.11.02  Data Engineering
8   s9    Cooper Harris      08.06.01    Data Analytics
9  s10    Murphy Fraser      01.12.03      Data Science

🐤3️⃣ Query DataFrame via DuckDB

DuckDB allows you to run SQL queries directly on your Pandas DataFrames. For example, let’s select all students whose major is Data Engineering:
