LSTM
Sepp Hochreiter & Jürgen Schmidhuber, 1997
O(T·d²), for T time steps and hidden size d
Proposed by Hochreiter and Schmidhuber in 1997, the Long Short-Term Memory network was designed to address the vanishing gradient problem that plagued earlier recurrent networks. The visualization shows a single LSTM cell unrolled across time steps, with the cell state flowing as a highway across the top. At each step, three gates control information flow: the forget gate (what to erase from the cell state), the input gate (what to write to it), and the output gate (what to read from it into the hidden state). Gate activation levels are shown as fill bars, and a character sequence flows through the cell one step at a time.
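The gate mechanics described above can be sketched in a few lines of plain Python. This is a minimal illustration of the standard LSTM update, not the code behind the visualization: the function names (`lstm_step`, `run_sequence`), the one-hot character encoding, the hidden size of 4, and the random untrained weights are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(input_size, hidden_size, seed=0):
    """Random, untrained weights for the forget (f), input (i), output (o)
    gates and the candidate values (g). Hypothetical initialisation."""
    rng = random.Random(seed)
    params = {}
    for name in ("f", "i", "o", "g"):
        params["W_" + name] = [
            [rng.uniform(-0.1, 0.1) for _ in range(hidden_size + input_size)]
            for _ in range(hidden_size)
        ]
        params["b_" + name] = [0.0] * hidden_size
    return params


def lstm_step(params, x, h_prev, c_prev):
    """One time step: returns the new hidden state, cell state, and gate levels."""
    z = h_prev + x  # concatenate previous hidden state and current input

    def layer(name, activation):
        pre = matvec(params["W_" + name], z)
        return [activation(p + b) for p, b in zip(pre, params["b_" + name])]

    f = layer("f", sigmoid)    # forget gate: what to erase from the cell state
    i = layer("i", sigmoid)    # input gate: what to write
    o = layer("o", sigmoid)    # output gate: what to read
    g = layer("g", math.tanh)  # candidate values to write

    # Cell state "highway": keep part of the old state, add gated new content.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Hidden state: a gated read of the squashed cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c, {"forget": f, "input": i, "output": o}


def run_sequence(text, hidden_size=4, seed=0):
    """Feed a character sequence through the cell one step at a time."""
    vocab = sorted(set(text))
    params = init_params(len(vocab), hidden_size, seed)
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    history = []
    for ch in text:
        x = [1.0 if ch == v else 0.0 for v in vocab]  # one-hot encoding
        h, c, gates = lstm_step(params, x, h, c)
        history.append((ch, gates))
    return h, c, history


if __name__ == "__main__":
    _, _, history = run_sequence("hello")
    for ch, gates in history:
        levels = {k: [round(v, 2) for v in vals] for k, vals in gates.items()}
        print(ch, levels)
```

Each gate value lies strictly between 0 and 1, which is what a fill bar would display. The four matrix-vector products per step are where the O(d²) per-step cost comes from (assuming the input size is comparable to d), and repeating the step T times gives O(T·d²).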