Introduction

In this tutorial, we concatenate multiple Pandas DataFrames. To do this, we use the concat() function of Pandas.

Import Libraries

First, we import the following Python module:

import pandas as pd

Create Pandas DataFrames

We create two Pandas DataFrames with some example data from dictionaries.

First, we create the Pandas DataFrame "df1":

data = {
    "language": ["Python", "JavaScript"],
    "framework": ["FastAPI", "ReactJS"],
    "users": [9000, 7000]
}
df1 = pd.DataFrame(data)
df1

Next, we create the Pandas DataFrame "df2". It has exactly the same columns as the DataFrame "df1":

data = {
    "language": ["Python", "Python", "Java"],
    "framework": ["FastAPI", "Django", "Spring"],
    "users": [9000, 20000, 12000]
}
df2 = pd.DataFrame(data)
df2
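Since the concatenation relies on both DataFrames sharing the same columns, it can help to verify this first. The following is a minimal sketch, not part of the original walkthrough; the variable names same_columns and same_dtypes are chosen here for illustration only:

```python
import pandas as pd

# Rebuild the two example DataFrames from the tutorial
df1 = pd.DataFrame({
    "language": ["Python", "JavaScript"],
    "framework": ["FastAPI", "ReactJS"],
    "users": [9000, 7000],
})
df2 = pd.DataFrame({
    "language": ["Python", "Python", "Java"],
    "framework": ["FastAPI", "Django", "Spring"],
    "users": [9000, 20000, 12000],
})

# Compare column names (including their order) and column dtypes
same_columns = list(df1.columns) == list(df2.columns)
same_dtypes = df1.dtypes.equals(df2.dtypes)

print(same_columns, same_dtypes)
```

If the columns differ, concat() still works by default, but it fills the missing cells with NaN, so a check like this makes such mismatches visible early.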

Concatenate DataFrames

Now, we would like to concatenate the DataFrames "df1" and "df2".

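To do this, we use the concat() function of Pandas. Setting ignore_index=True discards the original row labels and numbers the rows of the result from 0: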

df_merged = pd.concat([df1, df2], ignore_index=True)
df_merged
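To make the effect of ignore_index more concrete, the sketch below compares the resulting index with and without the parameter. The names stacked_keep and stacked_reset are illustrative and do not appear in the tutorial:

```python
import pandas as pd

df1 = pd.DataFrame({
    "language": ["Python", "JavaScript"],
    "framework": ["FastAPI", "ReactJS"],
    "users": [9000, 7000],
})
df2 = pd.DataFrame({
    "language": ["Python", "Python", "Java"],
    "framework": ["FastAPI", "Django", "Spring"],
    "users": [9000, 20000, 12000],
})

# Default behaviour: each DataFrame keeps its own row labels,
# so the labels 0 and 1 appear twice in the result
stacked_keep = pd.concat([df1, df2])

# ignore_index=True: the result gets a fresh index from 0 to n - 1
stacked_reset = pd.concat([df1, df2], ignore_index=True)

print(stacked_keep.index.tolist())   # [0, 1, 0, 1, 2]
print(stacked_reset.index.tolist())  # [0, 1, 2, 3, 4]
```

Repeated labels are legal in Pandas, but a label-based lookup such as stacked_keep.loc[0] then returns two rows, which is why a fresh index is usually preferable here.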

Concatenate DataFrames without Duplicates

Next, we would like to concatenate the DataFrames "df1" and "df2" without duplicates.

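To do this, we use the concat() function in combination with the drop_duplicates() method of Pandas. By default, drop_duplicates() removes rows that are identical in all columns and keeps the first occurrence; reset_index(drop=True) then renumbers the remaining rows: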

df_merged = pd.concat([df1, df2]).drop_duplicates().reset_index(drop=True)
df_merged
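What counts as a duplicate depends on which columns are compared. As a hedged illustration that goes slightly beyond the example above, the sketch below contrasts the default full-row comparison with the subset parameter of drop_duplicates(); the variable names are hypothetical:

```python
import pandas as pd

df1 = pd.DataFrame({
    "language": ["Python", "JavaScript"],
    "framework": ["FastAPI", "ReactJS"],
    "users": [9000, 7000],
})
df2 = pd.DataFrame({
    "language": ["Python", "Python", "Java"],
    "framework": ["FastAPI", "Django", "Spring"],
    "users": [9000, 20000, 12000],
})

combined = pd.concat([df1, df2], ignore_index=True)

# Full-row comparison: only the repeated Python/FastAPI/9000 row is dropped
full_dedup = combined.drop_duplicates().reset_index(drop=True)

# Comparison on "language" only: the first row per language is kept
by_language = (
    combined.drop_duplicates(subset=["language"], keep="first")
    .reset_index(drop=True)
)

print(len(combined), len(full_dedup), len(by_language))  # 5 4 3
```

With subset=["language"], the Python/Django row is dropped as well, because an earlier row already has the language "Python". Whether that is desired depends on what a duplicate means for the data at hand.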

Conclusion

Congratulations! Now you are one step closer to becoming an AI Expert. You have seen that concatenating multiple Pandas DataFrames is straightforward: we simply use the concat() function of Pandas, optionally followed by drop_duplicates() to remove repeated rows. Try it yourself!