📘 Introduction

In this tutorial, you’ll learn how to load data from a Parquet file into a DuckDB database using Python. DuckDB reads Parquet natively, so working with columnar data is fast and efficient, which suits analytics, ETL pipelines, and other Python data projects. You’ll create a table, insert the Parquet data, and verify the result in a few steps.

✅ Prerequisites

Before you begin, make sure you have:

🐍☑️ Installed Python
📦☑️ Installed DuckDB
🌐☑️ Created and activated a virtual environment (venv)

📁1️⃣ Sample Parquet file

We've already created a sample Parquet file called student.parquet:

project-folder/
├── student.parquet

💡 Fun Fact: If you’re curious how to convert a CSV into a Parquet file, check out this detailed guide: 

How to convert CSV to Parquet in Python using Pandas

🐍2️⃣ Create Python script

In the same folder as the Parquet file, create a Python file (.py) or a Jupyter notebook (.ipynb):

project-folder/
├── student.parquet
└── insert_data_from_parquet.ipynb

📥3️⃣ Import Libraries

Open your Python file or notebook and import duckdb, the only module this tutorial needs:

import duckdb

🔗4️⃣ Establish Connection to DuckDB Database file

Establish the connection to a DuckDB database file:

con = duckdb.connect("dlnerds_university.duckdb")
💡 If the file doesn’t exist, DuckDB will create it automatically.

🏗️5️⃣ Create Table Using SQL CREATE TABLE

Let’s create a simple student table:
