Introduction

Grouping data is one of the key tasks in data analysis: it lets you summarize a dataset by category and compare the groups. In this tutorial, we will show you how to group the rows of a Pandas DataFrame and apply different aggregations to the grouped data. To do this, we will use the groupby() method of Pandas in combination with various aggregation functions.

Import Libraries

First, we import the pandas library:

import pandas as pd

Create Pandas DataFrame

Next, we create a Pandas DataFrame with some example data from a dictionary:

data = {
    "language": ["Python", "Python", "Java", "JavaScript", "Python"],
    "framework": ["Django", "FastAPI", "Spring", "ReactJS", "FastAPI"],
    "users": [20000, 9000, 7000, 5000, 13000]
}
df = pd.DataFrame(data)
df

Group DataFrame and Apply Aggregations

The groupby() method of Pandas allows you to group data of a Pandas DataFrame based on one or more columns.

Once grouped, you can use various aggregation functions to summarize the grouped data. For example, you could use one of the following aggregation functions:

  • Count the non-missing values for each group: count() (use size() to count all rows, including missing values)
  • Calculate minimum of values for each group: min()
  • Calculate maximum of values for each group: max()
  • Calculate sum of values for each group: sum()
  • Calculate mean of values for each group: mean()
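
The functions listed above can also be applied together. The following is a minimal sketch that rebuilds the example data from this tutorial and passes the function names to agg(); the variable name summary is only chosen for illustration.

```python
import pandas as pd

# Same example data as in this tutorial
df = pd.DataFrame({
    "language": ["Python", "Python", "Java", "JavaScript", "Python"],
    "framework": ["Django", "FastAPI", "Spring", "ReactJS", "FastAPI"],
    "users": [20000, 9000, 7000, 5000, 13000],
})

# Apply several aggregation functions to the "users" column in one call.
# The result is a DataFrame with one row per language and one column per function.
summary = df.groupby("language")["users"].agg(["count", "min", "max", "sum", "mean"])
print(summary)
```

Each function can also be called on its own, for example df.groupby("language")["users"].max().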

Group DataFrame by Single Column

We want to group the rows of the Pandas DataFrame by the column "language" and calculate the mean of the column "users" for each group.

To do this, we use the groupby() method in combination with the mean() method of Pandas:

grouped_df = df.groupby("language")["users"].mean()
grouped_df
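
The result of this call is a Pandas Series whose index holds the group names. If you would rather work with a regular DataFrame, one possible approach is to call reset_index() on the result. The sketch below rebuilds the example data so that it runs on its own; the names mean_users and mean_users_df are only illustrative.

```python
import pandas as pd

# Same example data as in this tutorial
df = pd.DataFrame({
    "language": ["Python", "Python", "Java", "JavaScript", "Python"],
    "framework": ["Django", "FastAPI", "Spring", "ReactJS", "FastAPI"],
    "users": [20000, 9000, 7000, 5000, 13000],
})

# Mean of "users" per language: a Series indexed by "language"
mean_users = df.groupby("language")["users"].mean()
print(mean_users["Python"])  # mean of 20000, 9000 and 13000

# Turn the group names back into a regular column
mean_users_df = mean_users.reset_index()
print(mean_users_df)
```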

Group DataFrame by Multiple Columns

We want to group the rows of the Pandas DataFrame by the columns "language" and "framework" and calculate the sum of the column "users" for each group.

To do this, we use the groupby() method in combination with the sum() method of Pandas:

grouped_df = df.groupby(["language", "framework"])["users"].sum()
grouped_df
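
When you group by several columns, the result has a MultiIndex with one level per grouping column. As a small sketch under the same example data, a single group can be looked up with a tuple, and named aggregation can be used to compute several labeled results at once. The names totals, stats, total_users and entries are hypothetical and only chosen for this example.

```python
import pandas as pd

# Same example data as in this tutorial
df = pd.DataFrame({
    "language": ["Python", "Python", "Java", "JavaScript", "Python"],
    "framework": ["Django", "FastAPI", "Spring", "ReactJS", "FastAPI"],
    "users": [20000, 9000, 7000, 5000, 13000],
})

# Sum of "users" per (language, framework) pair: a Series with a MultiIndex
totals = df.groupby(["language", "framework"])["users"].sum()

# Look up one group with a tuple of (language, framework)
print(totals[("Python", "FastAPI")])  # 9000 + 13000

# Named aggregation: each keyword becomes a column in the result
stats = df.groupby(["language", "framework"]).agg(
    total_users=("users", "sum"),
    entries=("users", "count"),
)
print(stats)
```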

Conclusion

Congratulations! Now you are one step closer to becoming an AI expert. In this blog post, we've explored the basics of grouping data in Pandas DataFrames. This functionality is crucial for data analysis and for gaining insights into large datasets.

You have seen that grouping the data in a Pandas DataFrame and aggregating the groups takes only a single line of code. We simply use the groupby() method in combination with an aggregation method of Pandas such as count(), sum() or mean(). Try it yourself!
