In 1997, Sepp Hochreiter and Jürgen Schmidhuber introduced Long Short-Term Memory (LSTM), a special type of Recurrent Neural Network (RNN), to overcome the problems traditional RNNs face when dealing with long sequences. The architecture of LSTMs gives them a kind of memory, which makes it possible to handle sequences containing long-term dependencies very well. LSTM was a groundbreaking achievement in the field of Deep Learning and has become a fundamental part of Natural Language Processing (NLP) applications. In this tutorial, we take a closer look at LSTMs, their unique architecture, and how exactly they work.

Challenges of RNNs

If you are not yet familiar with RNNs, check out our post about how RNNs work:

Deep Learning - How Recurrent Neural Networks (RNNs) work

Traditional RNNs are well suited for tasks where only short-term dependencies need to be taken into account. However, they struggle with long sequences that contain long-term dependencies: when the error is backpropagated through time, the gradient signal is multiplied by one factor per timestep, so it tends to vanish (or explode) before it reaches the early steps of the sequence.
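The shrinking of the gradient can be illustrated with a minimal sketch (not from this post, and deliberately simplified): we assume a scalar recurrent weight and ignore the activation function, so backpropagation through time reduces to multiplying one factor per timestep.

```python
# Minimal sketch (assumption: scalar recurrent weight, no activation) of why
# gradients vanish in a plain RNN. Backpropagation through time contributes
# one multiplicative factor per unrolled timestep; if that factor is below 1,
# the product collapses toward zero as the sequence grows.
def gradient_after(steps: int, recurrent_weight: float = 0.5) -> float:
    grad = 1.0
    for _ in range(steps):
        grad *= recurrent_weight  # one factor per timestep
    return grad

for steps in (1, 10, 50):
    print(f"after {steps:2d} steps: gradient factor = {gradient_after(steps):.2e}")
```

After 50 steps the factor is far below machine-readable significance, which is why a plain RNN effectively cannot learn from information that lies many timesteps in the past. The LSTM's cell state is designed to carry the signal across such distances without this repeated shrinking.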
