Python

83 posts
PySpark - Group and Concatenate Strings in a DataFrame
Academy Membership, PySpark, Python

Introduction: This tutorial shows how to group and concatenate strings in a PySpark DataFrame, using the groupBy() method together with the PySpark functions concat_ws(), collect_list() and array_distinct(). Import Libraries: First, we import the following...

PySpark - How to use Pandas User Defined Function (UDF)
Academy Membership, PySpark, Python

Introduction: PySpark is a powerful tool for processing large-scale datasets, and its distributed computing framework handles massive volumes of data efficiently. Even so, certain data transformations in PySpark can be cumbersome and complex. That'...

Type Hints in Python: A Guide for Beginners
Academy Membership, Python

Introduction: As projects grow in size and complexity, keeping code understandable and easy to work with becomes increasingly important. Type hints are one powerful tool for this. This tutorial explains why and how to use type hints in Python...
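
As a rough illustration of what type hints look like in practice, here is a minimal sketch. The function names `mean` and `greet` are hypothetical and not taken from the tutorial. It assumes Python 3.9 or later for the built-in `list[float]` syntax.

```python
from typing import Optional


def mean(values: list[float]) -> Optional[float]:
    """Return the arithmetic mean, or None for an empty list."""
    if not values:
        return None
    return sum(values) / len(values)


def greet(name: str, times: int = 1) -> str:
    """Repeat a greeting `times` times, separated by spaces."""
    return " ".join([f"Hello, {name}!"] * times)


print(mean([1.0, 2.0, 3.0]))
print(greet("Ada", times=2))
```

The hints are not enforced at runtime. They document intent and let a static checker or an editor flag mismatched calls before the code runs.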

Pandas - Change Column Types of a DataFrame

Introduction: Data manipulation tasks often involve converting column data types to ensure consistency and accuracy in analysis. This tutorial shows how to change the column types of a Pandas DataFrame, using the astype() method, the map() method and the to_...
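
The two methods named in full above can be sketched as follows. The DataFrame, its column names, and the yes/no mapping are made-up sample data, not the tutorial's own example.

```python
import pandas as pd

# Hypothetical sample data: every column starts out as strings.
df = pd.DataFrame(
    {
        "id": ["1", "2", "3"],
        "price": ["9.99", "19.50", "5.00"],
        "active": ["yes", "no", "yes"],
    }
)

# astype() converts a column to the requested type.
df["id"] = df["id"].astype(int)
df["price"] = df["price"].astype(float)

# map() replaces each value via a lookup, here turning yes/no into booleans.
df["active"] = df["active"].map({"yes": True, "no": False})

print(df.dtypes)
```

Note that astype() raises an error if a value cannot be converted, and map() yields NaN for values missing from the lookup, so both are best applied to columns whose contents are known.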

How to use Environment Variables in Python
Academy Membership, Python

Introduction: Environment variables are commonly used to keep sensitive data out of the source code and to manage configuration across different environments. This tutorial explores how to work with environment variables in Python, using the Python libraries os and python-dotenv. What is an...
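
A minimal sketch of reading environment variables with the standard-library os module is shown below. The variable names (`DLN_DEMO_MODE`, `DLN_DEMO_API_KEY`, `DLN_DEMO_PORT`) are hypothetical. python-dotenv is left out so the sketch runs without extra installs; its `load_dotenv()` function would load such values from a `.env` file into `os.environ` before they are read.

```python
import os

# Hypothetical variable, set here only so the sketch is self-contained.
# In practice it would come from the shell or a .env file.
os.environ["DLN_DEMO_MODE"] = "development"

# Indexing os.environ raises KeyError if the variable is not set.
mode = os.environ["DLN_DEMO_MODE"]

# os.getenv returns None (or a given default) instead of raising.
api_key = os.getenv("DLN_DEMO_API_KEY")
port = int(os.getenv("DLN_DEMO_PORT", "8000"))

print(mode, port, api_key is None)
```

Environment variables are always strings, so numeric settings such as the port need an explicit conversion.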

PySpark - Change Column Types of a DataFrame

Introduction: Data manipulation tasks often involve converting column data types to ensure consistency and accuracy in analysis. This tutorial shows how to change the column types of a PySpark DataFrame, using the cast() function of PySpark. Import Libraries: First, we...

PySpark - Window Functions
Academy Membership, Python, PySpark

Introduction: Window functions in PySpark are a powerful feature for data manipulation and analysis. They let you perform complex calculations on subsets of the rows in a DataFrame without expensive joins or subqueries. This tutorial shows how to use window functions in PySpark...

Pandas - Add an ID Column to a DataFrame

Introduction: A common task when working with large datasets is generating a unique identifier for each record. This tutorial explores how to add an ID column to a Pandas DataFrame, using the index attribute of the DataFrame...
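
One way the index attribute can be used for this is sketched below. The sample data and the choice to start IDs at 1 are assumptions for illustration, not necessarily what the tutorial does.

```python
import pandas as pd

# Hypothetical sample data with the default 0-based RangeIndex.
df = pd.DataFrame({"name": ["Ada", "Grace", "Linus"]})

# Derive an ID from the index; adding 1 makes the IDs start at 1 (an assumption).
df.insert(0, "id", df.index + 1)

print(df)
```

This only yields unique IDs when the index itself is unique, as it is for a freshly created DataFrame. After filtering or concatenating, calling reset_index(drop=True) first restores a clean 0-based index.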
