📘 Introduction

When working with data spread across multiple CSV files, combining them into one unified dataset can save time and simplify your workflow. In this guide, you’ll learn how to merge multiple CSV files into a single Pandas DataFrame using just a few lines of Python.

✅ Prerequisites

Before you begin, make sure you have:

🐍☑️ Installed Python
🌐☑️ Created and activated a virtual environment (venv)

📦1️⃣ Install Libraries

Install pandas using pip (glob is part of Python's standard library, so it needs no installation):

pip install pandas

📁2️⃣ Create Sample CSV files

First, create a folder called student_data/ that contains the following three CSV files:

project-folder/
└── student_data/
   ├── student_data_analytics.csv
   ├── student_data_engineering.csv
   └── student_data_science.csv

Add the following content to the .csv files.

student_data_analytics.csv:

id,name,major
s1,Lara Fitzgerald,Data Analytics
s4,Travis Robinson,Data Analytics
s9,Cooper Harris,Data Analytics

student_data_engineering.csv:

id,name,major
s2,Mike Meyer,Data Engineering
s5,Jackie Brown,Data Engineering
s8,Nathan Williams,Data Engineering

student_data_science.csv:

id,name,major
s3,Eliza Gomez,Data Science
s6,Caleb Pearson,Data Science
s7,Ava Johansson,Data Science
s10,Murphy Fraser,Data Science
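
If you would rather generate these files from Python than type them by hand, here is a minimal sketch. The helper name write_sample_files and the SAMPLE_FILES dictionary are hypothetical names introduced for illustration; the file contents are copied from the listings above.

```python
from pathlib import Path

# Contents copied from the three listings above, keyed by filename.
SAMPLE_FILES = {
    "student_data_analytics.csv": (
        "id,name,major\n"
        "s1,Lara Fitzgerald,Data Analytics\n"
        "s4,Travis Robinson,Data Analytics\n"
        "s9,Cooper Harris,Data Analytics\n"
    ),
    "student_data_engineering.csv": (
        "id,name,major\n"
        "s2,Mike Meyer,Data Engineering\n"
        "s5,Jackie Brown,Data Engineering\n"
        "s8,Nathan Williams,Data Engineering\n"
    ),
    "student_data_science.csv": (
        "id,name,major\n"
        "s3,Eliza Gomez,Data Science\n"
        "s6,Caleb Pearson,Data Science\n"
        "s7,Ava Johansson,Data Science\n"
        "s10,Murphy Fraser,Data Science\n"
    ),
}


def write_sample_files(folder="student_data"):
    """Write the sample CSV files into `folder` (hypothetical helper)."""
    target = Path(folder)
    # Create the folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, content in SAMPLE_FILES.items():
        path = target / filename
        path.write_text(content, encoding="utf-8")
        written.append(path)
    return written
```

Calling write_sample_files() from inside project-folder/ should create student_data/ with the three files shown above.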

🐍3️⃣ Create Python script

In the project folder, next to the student_data/ folder, create a Python file (.py) or Jupyter notebook (.ipynb):

project-folder/
├── student_data/
│   ├── student_data_analytics.csv
│   ├── student_data_engineering.csv
│   └── student_data_science.csv
└── merge_csv.ipynb

📥4️⃣ Import Libraries

Open your Python file and start by importing the required Python modules:

import pandas as pd
import glob

⬆️5️⃣ Read CSV Files

Collect the paths of all CSV files in the student_data/ folder (glob only lists the matching paths; it does not read the files yet):

csv_files = glob.glob("student_data/*.csv")

Let's have a look at the CSV files.
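
One common way to inspect the matched paths and then combine the files is sketched below. It is an illustration under stated assumptions, not necessarily the approach this guide goes on to use: the helper name merge_csv_files is hypothetical, and stacking the rows with pd.concat and ignore_index=True assumes all files share the same columns, as the sample files do.

```python
import glob

import pandas as pd


def merge_csv_files(pattern):
    """Read every CSV matching `pattern` and stack the rows (hypothetical helper)."""
    # glob.glob returns paths in arbitrary order; sorting makes the result reproducible.
    csv_files = sorted(glob.glob(pattern))
    if not csv_files:
        raise FileNotFoundError(f"No CSV files match {pattern!r}")

    # Have a look at which files were found.
    for path in csv_files:
        print(path)

    # Read each file into its own DataFrame.
    frames = [pd.read_csv(path) for path in csv_files]

    # Stack the rows; ignore_index=True replaces the per-file row labels
    # with one continuous index starting at 0.
    return pd.concat(frames, ignore_index=True)
```

With the sample files in place, merge_csv_files("student_data/*.csv") should return a single DataFrame with 10 rows and the columns id, name, and major.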
