LSTM Networks in Detail
Today I’d like to dig deeper into the details of the LSTM algorithm. Below is a flow chart of an LSTM network. LSTMs were designed specifically to overcome the long-term dependency problem that recurrent neural networks (RNNs) face due to the vanishing gradient problem. Unlike more traditional feedforward neural networks, LSTMs have feedback connections. This property lets an LSTM process an entire sequence of data (e.g. a time series) without treating each point in the sequence independently; instead, it retains useful information about earlier data in the sequence to help with the processing of new data points. As a result, LSTMs are particularly good at processing sequential data such as text, speech, and general time series.

LSTMs use a series of ‘gates’ to control how the information in a sequence of data comes into, is stored in, and leaves the network. A typical LSTM has three gates: a forget gate, an input gate, and an output gate.
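To make the three gates concrete, here is a minimal NumPy sketch of a single LSTM time step, following the standard formulation. The parameter names (W_f, b_f, and so on) and the toy sizes are my own illustrative assumptions, not something from a particular library:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: the gates decide what to forget, store, and output."""
    # Concatenate the previous hidden state and the current input,
    # as in the standard LSTM formulation.
    z = np.concatenate([h_prev, x_t])
    # Forget gate: how much of the old cell state to keep.
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])
    # Input gate and candidate values: what new information to store.
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])
    # Update the cell state: keep part of the old, add part of the new.
    c_t = f_t * c_prev + i_t * c_tilde
    # Output gate: which part of the cell state to expose as the hidden state.
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Toy usage with random parameters (hidden size 4, input size 3).
rng = np.random.default_rng(0)
n_h, n_x = 4, 3
params = {name: rng.standard_normal((n_h, n_h + n_x)) * 0.1
          for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: np.zeros(n_h) for name in ("b_f", "b_i", "b_c", "b_o")})
h, c = np.zeros(n_h), np.zeros(n_h)
for x_t in rng.standard_normal((5, n_x)):  # a sequence of 5 input vectors
    h, c = lstm_step(x_t, h, c, params)
```

Note how the cell state c_t is the piece that carries information across many time steps; the gates only modulate it multiplicatively, which is what lets gradients flow over long sequences.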