The Story of Long Short-Term Memory (LSTM)

2 min read · Oct 24, 2020

If the order of your data matters, then an LSTM layer is a good choice for your machine learning model.

The Purpose of a Recurrent Neural Network (RNN)

An RNN is useful when we want to predict sequential data, for example sentences, cryptocurrency prices, or stock prices. We can train a model on such datasets using an RNN.
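To make "predicting sequential data" concrete, here is a minimal sketch of how a series is usually turned into training examples: each run of consecutive values becomes the input, and the value right after it becomes the target. The function name `make_windows` and the price list are made up for illustration.

```python
def make_windows(series, window_size):
    """Split a series into (inputs, target) training pairs."""
    pairs = []
    for start in range(len(series) - window_size):
        inputs = series[start:start + window_size]   # what the model sees
        target = series[start + window_size]         # what it should predict
        pairs.append((inputs, target))
    return pairs


prices = [10, 12, 11, 13, 15]  # made-up closing prices
pairs = make_windows(prices, window_size=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

An RNN would then be trained on pairs like these, reading the inputs one value at a time in order.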

LSTM in an RNN

The layer we usually use when training an RNN is LSTM.

What Is LSTM?

Let’s take a look at sequential data. Because the data is sequential, the order of the items is very important, and each item influences the items that come after it.

It’s like the Fibonacci sequence, where every number depends on the numbers before it.

In the Fibonacci sequence, each number is the sum of the two numbers before it. We can’t have 8 without 3 and 5, and we can’t have 13 without 5 and 8. So the 8 has to carry the 5 along to get the 13.
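The Fibonacci rule can be sketched in a few lines. The two variables `previous` and `current` play the role of the memory that is carried from one step to the next. This is only an analogy for state in a sequence, not how an LSTM is implemented.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 1, 1."""
    numbers = []
    previous, current = 1, 1  # the values carried from step to step
    for _ in range(count):
        numbers.append(previous)
        # The next number needs both remembered values.
        previous, current = current, previous + current
    return numbers


print(fibonacci(7))  # [1, 1, 2, 3, 5, 8, 13]
```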

To carry previous numbers forward into the next one, we need more than just short-term memory; we need a longer short-term memory. Maybe that’s why it is called Long Short-Term Memory.

But Why LSTM for RNN?

A recurrent neural network is a class of artificial neural networks where connections between nodes form a directed graph along a temporal sequence. (source: Wikipedia)

LSTM can handle sequential data because it keeps a longer short-term memory than a plain recurrent layer :P
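Here is a minimal sketch of where that longer memory lives, assuming the standard LSTM gate equations and using single numbers instead of vectors. The weights in `weights` are hand-picked for illustration (a real layer learns them), and the names `lstm_step` and `weights` are made up for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell using the standard gate equations."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g   # cell state: the longer-lived memory
    h = o * math.tanh(c)     # hidden state: what the layer outputs
    return h, c


# Hand-picked weights: keep almost everything (large forget bias) and
# only write to the cell state when the input is large.
weights = {
    "wf": 0.0, "uf": 0.0, "bf": 5.0,
    "wi": 10.0, "ui": 0.0, "bi": -5.0,
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wg": 10.0, "ug": 0.0, "bg": 0.0,
}

h, c = 0.0, 0.0
for x in [1.0] + [0.0] * 10:  # one signal, then ten empty steps
    h, c = lstm_step(x, h, c, weights)
    print(f"x={x:.0f}  cell state={c:.3f}")
```

With these weights the cell state stays close to its value from the first step even after ten steps of zero input, because the forget gate stays near 1 and the input gate stays near 0.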

A simple example: let’s say we want to predict the word that comes after the sentence “I know you love me, so am I, I love you …”. We can simply put the word “too”, because the words that come before point to it. But how does a machine come up with the word “too”?

Let’s say we have a machine or program that simply puts “too” after “I know you love me, so am I, I love you …”. That’s because the machine has been trained on similar sentences before. During training, the machine learns that when it sees “so am I” followed by another clause, the word to put at the end is “too”.
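To see why the earlier words matter, here is a toy sketch that is not an LSTM at all: it only counts which word followed a given context in the training sentences. The training sentences and the names `train` and `predict` are made up for illustration. With a short context the toy model forgets “so am I” and guesses wrong; with a longer context it gets “too”.

```python
from collections import Counter, defaultdict


def train(sentences, context_size):
    """Count which word follows each context of `context_size` words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for end in range(context_size, len(words)):
            context = tuple(words[end - context_size:end])
            counts[context][words[end]] += 1
    return counts


def predict(counts, text, context_size):
    """Return the most frequent next word for the last words of `text`."""
    context = tuple(text.lower().split()[-context_size:])
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]


training_sentences = [
    "i know you love me so am i i love you too",
    "i love you so much",
    "i love you so much it hurts",
]
prompt = "I know you love me so am I I love you"

short_memory = train(training_sentences, context_size=3)
long_memory = train(training_sentences, context_size=6)

print(predict(short_memory, prompt, 3))  # so
print(predict(long_memory, prompt, 6))   # too
```

An LSTM does not store a lookup table like this, but the sketch shows the same point: the right answer depends on words that appeared several steps earlier.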

Yeah, the order of the words is very important for that. And this kind of training needs a longer short-term memory, which is exactly what LSTM has.

Speed Up LSTM Using CuDNNLSTM

TensorFlow provides CuDNNLSTM, an LSTM layer backed by NVIDIA’s cuDNN library. If you use a GPU for training, it’s better to use CuDNNLSTM than LSTM. Using CuDNNLSTM speeds up training, as well as prediction and evaluation.

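Since TensorFlow 2.0, the LSTM layer automatically uses the cuDNN implementation when a GPU is available and you keep the default activation functions (tanh for the activation and sigmoid for the recurrent activation), along with the other default arguments such as no recurrent dropout.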
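The Keras documentation for the LSTM layer lists the constructor arguments that must keep their default values for the cuDNN kernel to be used. As a rough sketch, those requirements can be written down as a small check. The helper `cudnn_blockers` is hypothetical and not part of TensorFlow. It also ignores the other documented conditions (a GPU must be available, masked inputs must be strictly right-padded, and eager execution must be enabled in the outermost context).

```python
# Hypothetical helper, not part of TensorFlow: it only mirrors the
# constructor-argument requirements listed in the Keras LSTM documentation.
CUDNN_REQUIRED_ARGS = {
    "activation": "tanh",
    "recurrent_activation": "sigmoid",
    "recurrent_dropout": 0,
    "unroll": False,
    "use_bias": True,
}


def cudnn_blockers(**layer_args):
    """Return the argument names that would prevent the cuDNN kernel."""
    blockers = []
    for name, required in CUDNN_REQUIRED_ARGS.items():
        value = layer_args.get(name, required)  # unspecified means default
        if value != required:
            blockers.append(name)
    return blockers


print(cudnn_blockers(units=64))                     # []
print(cudnn_blockers(units=64, activation="relu"))  # ['activation']
```

In other words, something like `LSTM(64)` with nothing else changed qualifies for the fast path, while changing the activation or adding recurrent dropout falls back to the slower generic implementation.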