VARUNA JAYASIRI

@vpj

Vanilla LSTM with numpy

October 8, 2017

This blog post was updated in December 2017 based on feedback from @AlexSherstinsky; thanks!

This is a simple implementation of a long short-term memory (LSTM) module in numpy, written from scratch for learning purposes. The network is trained with stochastic gradient descent with a batch size of 1, using the AdaGrad algorithm (with momentum).
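To make the training step a little more concrete, here is a minimal sketch of a plain AdaGrad update applied to a single parameter array. The function name `adagrad_update`, the `cache` accumulator, and the default `learning_rate` and `eps` values are assumptions made for illustration; they are not taken from the notebook, and the momentum term mentioned above is left out because this section does not describe it.

```python
import numpy as np

def adagrad_update(param, grad, cache, learning_rate=0.1, eps=1e-8):
    """Apply one in-place AdaGrad step (hypothetical helper, not the notebook's code)."""
    # Accumulate the squared gradient for every element.
    cache += grad * grad
    # Scale the step by the root of the accumulated squared gradients.
    param -= learning_rate * grad / np.sqrt(cache + eps)

# Toy example: one update on a two-element parameter vector.
w = np.array([1.0, -2.0])
cache = np.zeros_like(w)
grad = np.array([0.5, -0.5])
adagrad_update(w, grad, cache)
print(w)
```

With a batch size of 1, an update like this would run once per training sequence, so elements with large accumulated gradients take progressively smaller steps.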

http://blog.varunajayasiri.com/ml/lstm.svg
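As a rough companion to the diagram, the following sketch shows one forward step of a standard LSTM cell in numpy. It assumes the common formulation in which the previous hidden state and the current input are stacked into a single vector before each gate; the names `lstm_step`, `W`, `b`, and the toy sizes are hypothetical, and the notebook may organize its parameters differently.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell (illustrative sketch)."""
    z = np.vstack((h_prev, x))              # stacked input, shape (H + X, 1)
    f = sigmoid(W["f"] @ z + b["f"])        # forget gate
    i = sigmoid(W["i"] @ z + b["i"])        # input gate
    c_bar = np.tanh(W["c"] @ z + b["c"])    # candidate cell state
    o = sigmoid(W["o"] @ z + b["o"])        # output gate
    c = f * c_prev + i * c_bar              # new cell state
    h = o * np.tanh(c)                      # new hidden state
    return h, c

# Toy sizes (assumed): hidden size 4, one-hot input of size 3.
rng = np.random.default_rng(0)
H, X = 4, 3
W = {k: rng.standard_normal((H, H + X)) * 0.1 for k in "fico"}
b = {k: np.zeros((H, 1)) for k in "fico"}

x = np.zeros((X, 1))
x[1] = 1.0  # one-hot vector for the character with index 1
h, c = lstm_step(x, np.zeros((H, 1)), np.zeros((H, 1)), W, b)
print(h.shape, c.shape)
```

For a character-level model, `h` would then be passed through an output layer and a softmax to predict the next character; that part is omitted here.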

You can download the Jupyter notebook from http://blog.varunajayasiri.com/ml/numpy_lstm.ipynb

The model usually reaches an error of about 45 after 5000 iterations when tested with a 100,000-character sample from Shakespeare (http://cs.stanford.edu/people/karpathy/char-rnn/shakespear.txt). However, it sometimes gets stuck in a local minimum; reinitialize the weights if this happens.

You need to place the input text file as `input.txt` in the same folder as the Python code.
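For a character-level model, the text in `input.txt` is typically turned into integer indices before training. The sketch below shows one way this might look; `load_text` and `build_vocab` are hypothetical helper names, and the UTF-8 encoding is an assumption. The demo uses a short literal string so that it runs without the file present.

```python
def load_text(path="input.txt"):
    """Read the whole training file into one string (UTF-8 is assumed)."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

def build_vocab(text):
    """Map each distinct character to an index and back."""
    chars = sorted(set(text))
    char_to_idx = {ch: i for i, ch in enumerate(chars)}
    idx_to_char = {i: ch for i, ch in enumerate(chars)}
    return char_to_idx, idx_to_char

# Demo on a literal string; in practice, use: text = load_text()
sample = "to be or not to be"
char_to_idx, idx_to_char = build_vocab(sample)
encoded = [char_to_idx[ch] for ch in sample]
decoded = "".join(idx_to_char[i] for i in encoded)
print(len(char_to_idx), decoded == sample)
```

Each index can then be expanded into a one-hot vector whose length equals the vocabulary size and fed to the network one character at a time.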

This is inspired by Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy (https://gist.github.com/karpathy/d4dee566867f8291f086) by Andrej Karpathy (https://github.com/karpathy).