An Introduction to Recurrent Neural Networks
Last time I explained the statistical arbitrage strategies applied by quantitative hedge funds. I mentioned machine learning algorithms such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM), which are used as a foundation for predicting stock price action. This time I will dig deeper and explore what an RNN (mainly) and an LSTM are, and how they work in practice.
Normally when learning about machine learning, we start with
linear and multiple linear regression. These are the fundamental concepts for building
a feed-forward neural network, which allows information to flow only in the
forward direction: from the input layer, through the hidden layers, and to the output layer.
There are no cycles or loops in the network. Below is the flow chart of a
feed-forward neural network:
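To make that flow concrete, here is a minimal sketch of a forward pass in Python. The layer sizes, the relu activation, and the random weights are arbitrary placeholder choices for illustration:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def feed_forward(x, W1, b1, W2, b2):
    """One forward pass: input layer -> hidden layer -> output layer.
    Information moves in one direction only; nothing loops back."""
    hidden = relu(W1 @ x + b1)    # hidden-layer activations
    return W2 @ hidden + b2       # output layer

rng = np.random.default_rng(0)
x = rng.normal(size=3)                         # one input with 3 features
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # 3 inputs -> 4 hidden units
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # 4 hidden units -> 1 output
print(feed_forward(x, W1, b1, W2, b2))
```

Notice that each input is handled completely independently: the function has no state that survives from one call to the next.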
The problem with a feed-forward network is that its decisions are based only on the
current input. It does not memorize past data, so it has no way to anticipate what comes next.
That's why it is used for general regression and classification problems and
certainly can't be applied to predicting stock prices.
Don't worry, this is where the RNN comes into play. The first question
is: why do we use an RNN or LSTM (a derivative of the RNN) to predict stock prices? If
we treat the stock price at every second as sequential or time-series
data, and traditional feed-forward networks cannot be used for learning and
prediction, then we need a mechanism that can retain past or historical
information to forecast the future price. That's why we use the recurrent
neural network algorithm to deal with time-series data.
Let's focus on the intuition behind RNNs first. For example, suppose we took
a snapshot of a ball moving in time, like the picture below, and we want to predict
where the ball is moving.
Well, we can make guesses, but they would just be random
guesses and would not mean much to a data scientist. Without knowing where
the ball has been, we have no data about where it is going.
Let’s try this: take multiple snapshots of the ball moving.
Now it seems we have enough information to make a better
prediction, because the snapshots at least look like sequential data now: imagine
every ball stores its information at that split second. In the
sequential series, all the data about the ball at a specific time is connected to
its last state and provides the input for the next state. This is the basic logic behind why
RNNs are good at processing sequence data for predictions.
One question remains: how does it work in practice?
Let's bring in the concept of sequential memory, which is how the
human brain works. Try reciting the alphabet. Pretty easy, right? But what if
we do it in reverse order in our heads? It's doable, but a little harder. This
is sequential memory at work: we learned the alphabet as a sequence, and
sequential memory is a mechanism that makes it easier for our brains to recognize sequence
patterns.
If you look at the chart above, the left-hand side is a
traditional feed-forward network, whereas the right-hand side is a recurrent
neural network. The loop in the RNN represents saving previous
information: at any given time t, the RNN combines the current input, input(t),
with the hidden state it computed at time t-1, which is its memory of everything it has seen so far.
Here is a code snippet to show the general workflow of an RNN.
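This is a minimal sketch, assuming a plain tanh RNN cell; the layer sizes and random weights are arbitrary placeholders:

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Process a sequence one step at a time.
    The hidden state h carries information from all previous steps."""
    h = np.zeros(W_h.shape[0])                # initial hidden state: no memory yet
    for x_t in inputs:                        # the "loop" in the RNN diagram
        h = np.tanh(W_x @ x_t + W_h @ h + b)  # combine new input with old state
    return h                                  # a summary of the whole sequence

rng = np.random.default_rng(1)
sequence = [rng.normal(size=2) for _ in range(5)]  # 5 time steps, 2 features each
W_x = rng.normal(size=(4, 2))                      # input-to-hidden weights
W_h = rng.normal(size=(4, 4))                      # hidden-to-hidden (the loop)
b = np.zeros(4)
print(rnn_forward(sequence, W_x, W_h, b))
```

Note that h is the only thing carried between steps: it plays the role of the loop in the diagram above.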
Problem with RNN
Suppose we let the RNN algorithm read in a sentence, as the picture
shows. We notice that the distribution of color (representing the memory at each
loop) shifts toward the most recent token, which is the question mark,
while the memory of the word "what" has become negligible in the most recent state.
Therefore, as the RNN processes more steps, it has trouble retaining information from earlier steps, so it does not have enough information to learn long-term dependencies. In other words, it suffers from short-term memory. (Technically, this is the vanishing gradient problem: the error signal shrinks each time it is propagated back through a step, so early inputs barely influence training.)
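To see roughly why, here is a toy calculation. Each backward step through time multiplies the gradient by some factor, and whenever that factor is below one the signal from early words decays geometrically; the 0.5 per-step factor below is a made-up illustrative value, not a measured one:

```python
# Toy illustration of the vanishing-gradient effect behind short-term memory.
# Assume each backward step through time scales the gradient by ~0.5
# (an assumed value, typical when tanh saturates and weights are small).
step_factor = 0.5
gradient = 1.0
for t in range(1, 21):
    gradient *= step_factor
    if t in (1, 5, 10, 20):
        print(f"after {t:2d} steps back in time: gradient ~ {gradient:.6f}")
# After 20 steps the signal is about 0.000001 of its original size,
# which is why the early words contribute almost nothing to learning.
```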
LSTM
Theoretically speaking, we can utilize an LSTM to store a large amount
of past historical data to predict future price action. It sounds perfect, right?
However, in practice LSTMs are mainly used in high-frequency trading, which means they have
an acceptable success rate when predicting the stock price a few nanoseconds ahead, let alone
minutes. In other words, that is the limit of current computing capabilities
to foresee the "future" in the stock market.
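For intuition about how an LSTM holds information longer than a plain RNN, here is a minimal sketch of a single LSTM step using the standard gate equations. The hidden size, input size, and random weights are placeholder assumptions, not anyone's production model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step with the standard gate equations.
    W, U, b hold the parameters of all four gates stacked together."""
    n = h_prev.size
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0*n:1*n])    # forget gate: what to erase from memory
    i = sigmoid(z[1*n:2*n])    # input gate: what new info to store
    o = sigmoid(z[2*n:3*n])    # output gate: what to reveal
    g = np.tanh(z[3*n:4*n])    # candidate values for the cell state
    c = f * c_prev + i * g     # updated long-term cell state
    h = o * np.tanh(c)         # new hidden state (short-term output)
    return h, c

rng = np.random.default_rng(2)
n, d = 4, 2                                  # hidden size 4, input size 2
W = rng.normal(size=(4*n, d))
U = rng.normal(size=(4*n, n))
b = np.zeros(4*n)
h, c = np.zeros(n), np.zeros(n)
for x_t in [rng.normal(size=d) for _ in range(5)]:
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h)
```

The key design difference from the plain RNN above is the separate cell state c: because it is updated additively through the forget and input gates, information can survive many more steps before fading.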
Conclusion
I am very fortunate to have discussed these topics with some of the
best professionals in the industry and gained their insights into the applications
of machine learning and AI in this complicated, never-ending project. However, I
am always optimistic about the future. If one day the prediction horizon
improves from milliseconds to minutes (mid-frequency), we as humans will have moved
forward again in the course of history. Finally, special thanks to the
online contributors who helped me understand the basics of this cool
stuff!
References:
https://www.simplilearn.com/tutorials/deep-learning-tutorial/rnn
https://machinelearningmastery.com/calculus-in-action-neural-networks/
https://www.youtube.com/watch?v=LHXXI4-IEns