## Introduction

Traditional **RNNs**, limited by their simple structure, struggle to retain information over long time spans, a consequence of the infamous vanishing gradient problem. **Long Short-Term Memory (LSTM) Networks**, by contrast, are able to capture and preserve long-term dependencies in sequential data. But how does an **LSTM** achieve this? What happens inside an **LSTM** cell? In this tutorial, we take a journey through the inside of an **LSTM** cell and investigate what exactly happens there. We take a step-by-step look at the math underlying an **LSTM** cell and unpack the equations that control its gates, memory cell, and output.

## Basics

In a previous post we already explained how an **LSTM** cell is structured. There we looked at its architecture and at the individual gates. Be sure to check out that post if you are not yet familiar with the basics of an **LSTM** cell.

## Mathematical View

Now, let's go one step further and analyze step by step what goes on inside an **LSTM** cell. We will explain what happens at the individual gates and take a detailed look at the mathematical functions.

The following illustration shows the inside of an **LSTM** cell.

### Forget Gate, Input Gate and Output Gate

Let's examine where to find the **Forget Gate**, the **Input Gate** and the **Output Gate** in the **LSTM** cell:
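As a compact reference while reading the diagram, the three gates in a standard LSTM cell are all computed the same way: a sigmoid applied to a linear function of the previous hidden state $h_{t-1}$ and the current input $x_t$. The notation below follows the common convention where $[h_{t-1}, x_t]$ denotes concatenation and each gate has its own weight matrix and bias:

```latex
f_t = \sigma\!\left(W_f \, [h_{t-1}, x_t] + b_f\right) \quad \text{(Forget Gate)}
i_t = \sigma\!\left(W_i \, [h_{t-1}, x_t] + b_i\right) \quad \text{(Input Gate)}
o_t = \sigma\!\left(W_o \, [h_{t-1}, x_t] + b_o\right) \quad \text{(Output Gate)}
```

Because the sigmoid $\sigma$ squashes its input into $(0, 1)$, each gate outputs a vector of values between 0 and 1 that acts as a soft, element-wise filter on the information flowing through the cell.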

To fully understand the process, we will go through the illustration step by step and explain what happens at each point from a mathematical perspective.

### Step 1: Determining which information should be forgotten

In the first step, the cell decides which information from the previous cell state should be discarded.
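This decision is made by the Forget Gate, $f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$: each component of $f_t$ lies between 0 (forget this entry of the cell state entirely) and 1 (keep it completely). As a minimal sketch, the computation can be written in NumPy; the dimensions and random weights below are hypothetical toy values, not part of any trained model:

```python
import numpy as np

def sigmoid(z):
    # Logistic function: squashes values into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(h_prev, x_t, W_f, b_f):
    """Forget gate: f_t = sigmoid(W_f @ [h_prev; x_t] + b_f)."""
    concat = np.concatenate([h_prev, x_t])  # [h_{t-1}, x_t]
    return sigmoid(W_f @ concat + b_f)

# Toy dimensions (hypothetical): hidden size 3, input size 2
rng = np.random.default_rng(0)
h_prev = rng.standard_normal(3)       # previous hidden state h_{t-1}
x_t = rng.standard_normal(2)          # current input x_t
W_f = rng.standard_normal((3, 5))     # weights for the concatenated vector
b_f = np.zeros(3)                     # bias

f_t = forget_gate(h_prev, x_t, W_f, b_f)
print(f_t)  # each entry in (0, 1): near 0 = forget, near 1 = keep
```

In the full cell, $f_t$ is then multiplied element-wise with the previous cell state $C_{t-1}$, which is what actually erases the selected information.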
