
Vanilla LSTM with numpy

October 8, 2017

This blog post was updated in December 2017 based on feedback from @AlexSherstinsky; thanks!

This is a simple from-scratch implementation of a Long Short-Term Memory (LSTM) module in NumPy, written for learning purposes. The network is trained with stochastic gradient descent with a batch size of 1, using the AdaGrad algorithm (with momentum).
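To make the two ingredients named above concrete, here is a minimal sketch of a single LSTM forward step and an AdaGrad parameter update. It is not the notebook's code: the names `init_params`, `lstm_step`, and `adagrad_update`, the weight scale of 0.1, and the default learning rate are assumptions chosen for illustration, and the update shown is the standard AdaGrad rule without any extra momentum term.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(hidden_size, input_size, seed=0):
    """Hypothetical helper: one weight matrix and one bias per gate."""
    rng = np.random.default_rng(seed)
    z_size = hidden_size + input_size  # gates see [h_prev; x] stacked
    params = {}
    for gate in ("f", "i", "c", "o"):
        # Small random weights; the 0.1 scale is an assumption.
        params["W_" + gate] = rng.standard_normal((hidden_size, z_size)) * 0.1
        params["b_" + gate] = np.zeros((hidden_size, 1))
    return params


def lstm_step(x, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell on column vectors."""
    z = np.vstack((h_prev, x))                                   # concatenated input
    f = sigmoid(params["W_f"] @ z + params["b_f"])               # forget gate
    i = sigmoid(params["W_i"] @ z + params["b_i"])               # input gate
    c_bar = np.tanh(params["W_c"] @ z + params["b_c"])           # candidate cell state
    c = f * c_prev + i * c_bar                                   # new cell state
    o = sigmoid(params["W_o"] @ z + params["b_o"])               # output gate
    h = o * np.tanh(c)                                           # new hidden state
    return h, c


def adagrad_update(param, grad, cache, learning_rate=0.1, eps=1e-8):
    """Standard AdaGrad: scale each step by the root of the summed squared gradients.

    `param` and `cache` are NumPy arrays and are modified in place.
    """
    cache += grad * grad
    param -= learning_rate * grad / np.sqrt(cache + eps)
```

With a batch size of 1, `x` is a single one-hot column vector per time step, and `adagrad_update` would be called once per parameter array after each backward pass over a sequence.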

You can download the jupyter notebook from

The model usually reaches an error of about 45 after 5000 iterations when tested with a 100,000-character sample from Shakespeare. However, it sometimes gets stuck in a local minimum; reinitialize the weights if this happens.

You need to place the input text file as `input.txt` in the same folder as the Python code.
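As a rough sketch of how a character-level model might consume that file, the snippet below reads the text, builds character-to-index maps, and one-hot encodes a character. The helper names (`load_text`, `build_vocab`, `one_hot`) and the UTF-8 encoding are assumptions for illustration, not taken from the notebook.

```python
import numpy as np


def load_text(path="input.txt"):
    """Read the whole training text; UTF-8 encoding is an assumption."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def build_vocab(text):
    """Map each distinct character to an integer index and back."""
    chars = sorted(set(text))
    char_to_idx = {ch: i for i, ch in enumerate(chars)}
    idx_to_char = {i: ch for i, ch in enumerate(chars)}
    return char_to_idx, idx_to_char


def one_hot(index, vocab_size):
    """Column vector with a 1.0 at `index`, as fed to the network one character at a time."""
    x = np.zeros((vocab_size, 1))
    x[index] = 1.0
    return x


# Typical use, assuming input.txt sits next to this script:
#   data = load_text()
#   char_to_idx, idx_to_char = build_vocab(data)
#   x0 = one_hot(char_to_idx[data[0]], len(char_to_idx))
```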

This is inspired by the character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy, by Karpathy.