
Recurrent neural network

A recurrent neural network (RNN) is one of the two broad types of artificial neural network, characterized by the direction of the flow of information between its layers. In contrast to the uni-directional feedforward neural network, an RNN contains cycles, allowing the output from some nodes to affect subsequent input to the same nodes. This ability to use internal state (memory) to process arbitrary sequences of inputs[1][2][3] makes RNNs applicable to tasks such as unsegmented, connected handwriting recognition[4] and speech recognition.[5][6] The term "recurrent neural network" is used to refer to the class of networks with an infinite impulse response, whereas "convolutional neural network" refers to the class with a finite impulse response. Both classes of networks exhibit temporal dynamic behavior.[7] A finite impulse recurrent network is a directed acyclic graph that can be unrolled and replaced with a strictly feedforward neural network, while an infinite impulse recurrent network is a directed cyclic graph that cannot be unrolled.
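To make the unrolling idea concrete, here is a minimal sketch (an illustration, not taken from the article; the tanh nonlinearity and all sizes are assumptions): a recurrence applied for a fixed number of steps is equivalent to a chain of feedforward layers that share the same weights.

    import numpy as np

    def step(h, x, U, W):
        # One recurrent step; the same U and W are reused at every time step.
        return np.tanh(U @ h + W @ x)

    def recurrent(xs, h0, U, W):
        # Loop form of the recurrence.
        h = h0
        for x in xs:
            h = step(h, x, U, W)
        return h

    def unrolled_3_steps(xs, h0, U, W):
        # The same computation written as a strictly feedforward chain,
        # one copy of the layer per time step.
        h1 = step(h0, xs[0], U, W)
        h2 = step(h1, xs[1], U, W)
        return step(h2, xs[2], U, W)

    rng = np.random.default_rng(1)
    U, W = rng.normal(size=(4, 4)), rng.normal(size=(4, 3))
    xs, h0 = rng.normal(size=(3, 3)), np.zeros(4)
    assert np.allclose(recurrent(xs, h0, U, W), unrolled_3_steps(xs, h0, U, W))

An infinite impulse network has no such fixed-depth feedforward equivalent, because the loop form has no a priori bound on the number of steps.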

Not to be confused with recursive neural network.

Additional stored states, and storage under the direct control of the network, can be added to both infinite-impulse and finite-impulse networks. The storage can also be replaced by another network or graph if it incorporates time delays or has feedback loops. Such controlled states are referred to as gated states or gated memory, and are part of long short-term memory networks (LSTMs) and gated recurrent units (GRUs). An RNN of this kind is also called a feedback neural network (FNN). Recurrent neural networks are theoretically Turing complete and can run arbitrary programs to process arbitrary sequences of inputs.[8]
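As an illustration of gated memory, here is a minimal sketch of a gated recurrent unit update; this is the standard GRU formulation, not code from any of the cited works, and the parameter names are assumptions for the example (bias terms omitted for brevity).

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
        z = sigmoid(Wz @ x + Uz @ h)             # update gate: how much state to rewrite
        r = sigmoid(Wr @ x + Ur @ h)             # reset gate: how much history to expose
        h_cand = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state from gated history
        return (1.0 - z) * h + z * h_cand        # gated blend of old and candidate state

    # Example with assumed sizes: 3-dimensional inputs, 4 hidden units.
    rng = np.random.default_rng(0)
    mats = [rng.normal(size=(4, 3)) if i % 2 == 0 else rng.normal(size=(4, 4)) for i in range(6)]
    h = gru_cell(rng.normal(size=3), np.zeros(4), *mats)

The gates are the "controlled states": the network itself computes how much of the stored state to keep, expose, or overwrite at each step.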

For a standard (Elman-style) recurrent network, the hidden state and output at time step $t$ are

$h_t = \sigma_h(W_h x_t + U_h h_{t-1} + b_h)$

$y_t = \sigma_y(W_y h_t + b_y)$

where

$x_t$: input vector

$h_t$: hidden layer vector

$y_t$: output vector

$W$, $U$ and $b$: parameter matrices and vector

$\sigma_h$ and $\sigma_y$: activation functions
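A minimal NumPy sketch of this forward recurrence (the layer sizes and the tanh / identity choices for $\sigma_h$ and $\sigma_y$ are assumptions for the example):

    import numpy as np

    def rnn_forward(xs, W_h, U_h, b_h, W_y, b_y):
        """Run an Elman-style RNN over a sequence of input vectors xs."""
        h = np.zeros(U_h.shape[0])  # initial hidden state h_0
        ys = []
        for x in xs:
            # h_t = sigma_h(W_h x_t + U_h h_{t-1} + b_h), with sigma_h = tanh
            h = np.tanh(W_h @ x + U_h @ h + b_h)
            # y_t = sigma_y(W_y h_t + b_y), with sigma_y = identity
            ys.append(W_y @ h + b_y)
        return np.stack(ys), h

    # Example with assumed sizes: 3-dimensional inputs, 5 hidden units, 2 outputs.
    rng = np.random.default_rng(0)
    W_h, U_h, b_h = rng.normal(size=(5, 3)), rng.normal(size=(5, 5)), np.zeros(5)
    W_y, b_y = rng.normal(size=(2, 5)), np.zeros(2)
    ys, h_T = rnn_forward(rng.normal(size=(7, 3)), W_h, U_h, b_h, W_y, b_y)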

The weights of an RNN can also be trained with a genetic algorithm, in which each chromosome encodes a candidate set of weights (a minimal sketch follows this list):

1. Each weight encoded in the chromosome is assigned to the respective weight link of the network.

2. The training set is presented to the network, which propagates the input signals forward.

3. The mean-squared error is returned to the fitness function.

4. This function drives the genetic selection process.
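Here is the promised sketch of that loop, assuming a tiny Elman-style network whose weights are flattened into a 28-gene chromosome; the sizes, truncation selection, and Gaussian mutation are all illustrative choices, not a prescribed scheme.

    import numpy as np

    rng = np.random.default_rng(0)
    xs = rng.normal(size=(20, 8, 3))  # toy training set: 20 sequences of 8 steps
    ts = rng.normal(size=(20, 4))     # toy targets, one per sequence

    def forward(chromosome, seq):
        # Step 1: decode the chromosome into the network's weight matrices.
        U = chromosome[:16].reshape(4, 4)
        W = chromosome[16:28].reshape(4, 3)
        h = np.zeros(4)
        for x in seq:                 # Step 2: propagate the input signals forward.
            h = np.tanh(U @ h + W @ x)
        return h

    def fitness(chromosome):
        # Step 3: mean-squared error over the training set (lower is better).
        preds = np.array([forward(chromosome, s) for s in xs])
        return np.mean((preds - ts) ** 2)

    pop = rng.normal(size=(50, 28))   # 50 chromosomes, one weight per gene
    for generation in range(100):
        # Step 4: the fitness function drives selection.
        order = np.argsort([fitness(c) for c in pop])
        parents = pop[order[:10]]     # truncation selection: keep the 10 fittest
        children = parents[rng.integers(0, 10, size=40)] + 0.1 * rng.normal(size=(40, 28))
        pop = np.vstack([parents, children])  # next generation: elites + mutated offspring

    best = pop[np.argsort([fitness(c) for c in pop])[0]]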

Related fields and models

RNNs may behave chaotically. In such cases, dynamical systems theory may be used for analysis.
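A toy illustration of this dynamical-systems view (arbitrary weights, not an analysis technique from the article): iterate the autonomous state map from two nearby initial states and observe whether the gap shrinks or grows.

    import numpy as np

    rng = np.random.default_rng(2)
    U = 2.5 * rng.normal(size=(8, 8)) / np.sqrt(8)  # large gain can push the map toward chaos

    def iterate(h, steps=50):
        for _ in range(steps):
            h = np.tanh(U @ h)        # autonomous RNN state map h -> tanh(Uh)
        return h

    h0 = rng.normal(size=8)
    h1 = h0 + 1e-8                    # nearby initial condition
    gap = np.linalg.norm(iterate(h0) - iterate(h1))
    # With small (contractive) weights the gap decays toward 0; with large gains
    # it can grow by orders of magnitude, the hallmark of sensitive dependence
    # on initial conditions that dynamical systems theory studies.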


RNNs are in fact recursive neural networks with a particular structure: that of a linear chain. Whereas recursive neural networks operate on any hierarchical structure, combining child representations into parent representations, recurrent neural networks operate on the linear progression of time, combining the previous time step and a hidden representation into the representation for the current time step.
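The correspondence can be made concrete with a small sketch (illustrative, with an assumed composition function): applying a recursive network's child-to-parent composition along a linear chain reproduces the recurrent update.

    import numpy as np

    def combine(parent_in, child_rep, U, W):
        # Recursive-network composition: merge a child representation into a parent.
        return np.tanh(U @ child_rep + W @ parent_in)

    # On a linear chain the only "child" is the previous time step's hidden state,
    # so repeated composition is exactly the recurrent update h_t = f(h_{t-1}, x_t).
    rng = np.random.default_rng(3)
    U, W = rng.normal(size=(4, 4)), rng.normal(size=(4, 3))
    h = np.zeros(4)
    for x in rng.normal(size=(5, 3)):
        h = combine(x, h, U, W)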


In particular, RNNs can be viewed as nonlinear versions of finite impulse response and infinite impulse response filters, and also as a nonlinear autoregressive exogenous model (NARX).[92]
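For reference, a NARX model predicts the next output from past outputs and past exogenous inputs; in its usual form (notation assumed here, not taken from the article):

$y_t = F\bigl(y_{t-1}, y_{t-2}, \ldots, y_{t-n_y},\; u_t, u_{t-1}, \ldots, u_{t-n_u}\bigr) + \varepsilon_t$

where $F$ is the learned nonlinear map (the neural network), $u$ the exogenous input, and $\varepsilon_t$ a noise term.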


The effect of memory-based learning for the recognition of sequences can also be implemented by a more biologically grounded model that uses the silencing mechanism exhibited in neurons with relatively high-frequency spiking activity.[93]

Libraries

Apache Singa

Caffe: Created by the Berkeley Vision and Learning Center (BVLC). It supports both CPU and GPU. Developed in C++, and has Python and MATLAB wrappers.

Chainer: Fully in Python, production support for CPU, GPU, distributed training.

Deeplearning4j: Deep learning in Java and Scala on multi-GPU-enabled Spark.

Flux: includes interfaces for RNNs, including GRUs and LSTMs, written in Julia.

Keras: High-level API, providing a wrapper to many other deep learning libraries.

Microsoft Cognitive Toolkit

MXNet: an open-source deep learning framework used to train and deploy deep neural networks.

PyTorch: Tensors and dynamic neural networks in Python with GPU acceleration.

TensorFlow: Apache 2.0-licensed Theano-like library with support for CPU, GPU and Google's proprietary TPU,[94] and mobile.

Theano: A deep-learning library for Python with an API largely compatible with the NumPy library.

Torch: A scientific computing framework with support for machine learning algorithms, written in C and Lua.
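To give a flavor of these interfaces, a minimal PyTorch example (sizes chosen arbitrarily for illustration):

    import torch

    rnn = torch.nn.RNN(input_size=3, hidden_size=5, batch_first=True)
    x = torch.randn(2, 7, 3)  # batch of 2 sequences, 7 time steps, 3 features each
    output, h_n = rnn(x)      # output: per-step hidden states (2, 7, 5); h_n: final hidden state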

Applications of recurrent neural networks include:

Machine translation[25]

Robot control[95]

Time series prediction[96][97][98]

Speech recognition[99][17][100]

Speech synthesis[101]

Brain–computer interfaces[102]

Time series anomaly detection[103]

Text-to-Video model[104]

Rhythm learning[105]

Music composition[106]

Grammar learning[107][52][108]

Handwriting recognition[109][110]

Human action recognition[111]

Protein homology detection[112]

Predicting subcellular localization of proteins[58]

Several prediction tasks in the area of business process management[113]

Prediction in medical care pathways[114]

Predictions of fusion plasma disruptions in reactors (Fusion Recurrent Neural Network (FRNN) code)[115]

Further reading

Mandic, Danilo P.; Chambers, Jonathon A. (2001). Recurrent Neural Networks for Prediction: Learning Algorithms, Architectures and Stability. Wiley. ISBN 978-0-471-49517-8.

External links

Recurrent Neural Networks with over 60 RNN papers by Jürgen Schmidhuber's group at Dalle Molle Institute for Artificial Intelligence Research

Elman Neural Network implementation for WEKA