Introduction
In this tutorial, we encode an ordinal categorical variable of a Pandas DataFrame as numbers. To do this, we use the replace() method of Pandas.
Import Libraries
First, we import the Pandas library:
import pandas as pd
Create Pandas DataFrame
Next, we create a Pandas DataFrame with some example data from a dictionary:
data = {
    "language": ["Python", "Python", "Java", "JavaScript"],
    "framework": ["Django", "FastAPI", "Spring", "ReactJS"],
    "users": [20000, 9000, 7000, 5000],
    "popularity": ["High", "High", "Low", "Medium"],
}
df = pd.DataFrame(data)
df
Encode Ordinal Categorical Variables
Now, we would like to convert the categorical values of the column "popularity" into numerical values. Because the variable is ordinal, the numbers must preserve the rank order of the categories:
High > Medium > Low
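As a side note, the rank order above can also be stated explicitly in Pandas with an ordered categorical. The following is a minimal sketch and not the method used in the rest of this tutorial. The variable names popularity and ordered are chosen only for illustration. Note that the integer codes start at 0 for the lowest category.

```python
import pandas as pd

# Illustrative data: the same values as the "popularity" column
popularity = pd.Series(["High", "High", "Low", "Medium"])

# Declare the categories from lowest to highest rank
ordered = pd.Categorical(
    popularity,
    categories=["Low", "Medium", "High"],
    ordered=True,
)

# Integer codes follow the declared order: Low=0, Medium=1, High=2
codes = ordered.codes
print(list(codes))

# With ordered=True, min() and max() respect the rank order
print(ordered.min(), ordered.max())
```

The explicit mapping used below gives full control over the numbers assigned to each category, which is why the tutorial continues with replace().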
To encode the categorical values, we use the replace() method of Pandas and pass a dictionary with the mapping between categorical and numerical values:
mapping = {
    "Low": 1,
    "Medium": 2,
    "High": 3,
}
df["popularity"] = df["popularity"].replace(mapping)
df
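One thing to keep in mind is that replace() leaves any value that is missing from the mapping unchanged, so a typo such as "medium" would silently stay a string. Depending on the Pandas version, replace() may also emit a warning when it changes a text column into numbers. As a hedged alternative, the sketch below uses the map() method of a Series, which turns unmapped values into NaN so that they can be detected. The column name popularity_encoded is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({"popularity": ["High", "High", "Low", "Medium"]})

mapping = {"Low": 1, "Medium": 2, "High": 3}

# map() returns NaN for every value that is not a key of the mapping
encoded = df["popularity"].map(mapping)

# Fail loudly if some category was not covered by the mapping
unmapped = df.loc[encoded.isna(), "popularity"].unique()
if len(unmapped) > 0:
    raise ValueError(f"Unmapped categories: {list(unmapped)}")

# Store the result in a new column (hypothetical name) and keep the original
df["popularity_encoded"] = encoded.astype("int64")
print(df)
```

Writing the result to a separate column keeps the original labels available for checking the encoding.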
Conclusion
Congratulations! Now you are one step closer to becoming an AI Expert. You have seen that it is easy to encode ordinal categorical variables of a Pandas DataFrame: we pass a dictionary that maps each category to a number to the replace() method of Pandas. Try it yourself!