A primer on Artificial Neural Nets

Neural Networks (NNs), also known as Artificial Neural Networks (ANNs), are a specific subset of Machine Learning (ML). Deep Learning algorithms employ deep ANNs to exploit huge amounts of data and perform impressive tasks, often related to image, speech, and text data, e.g. object detection and machine translation. Neural Networks were initially inspired by the human brain: their building blocks are called neurons and are a crude approximation of the billions of neurons that constitute our nervous system.
An artificial neuron computes a weighted sum of its inputs and passes the result through an activation function, which adds nonlinearity, before forwarding the output to the neurons of the next layer.
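A minimal sketch of a single artificial neuron, with illustrative weights and a sigmoid chosen as the activation function (other activations, such as ReLU or tanh, work the same way):

```python
import numpy as np

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of the inputs plus a bias,
    passed through a nonlinear activation (here a sigmoid)."""
    z = np.dot(weights, inputs) + bias   # weighted sum
    return 1.0 / (1.0 + np.exp(-z))     # sigmoid squashes z into (0, 1)

# Illustrative values: three inputs and their weights
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, 0.3])
out = neuron(x, w, bias=0.0)            # a single number in (0, 1)
```

Stacking many such neurons side by side gives a layer; feeding one layer's outputs into the next gives a network.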
Neurons are organized in layers: a Neural Network with many layers is called deep, while a Net with few layers is called shallow.
In fully connected NNs, the neurons of each layer are connected to every neuron of the next layer. Training data is passed from the first layer to the last to generate an output; the output is then used to compute a loss function, which measures how good the prediction is compared to the ground truth (the lower the value, the better).

The error is then backpropagated in the opposite direction and the weights are updated using gradient descent, moving toward a minimum of the loss function. After many iterations the training is complete and the weights are tuned: the Neural Network is ready to make predictions on new data.
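The whole loop (forward pass, loss, gradients, weight update) can be sketched for the simplest possible case: one linear neuron fitting a line. The toy data, learning rate, and iteration count are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: the network should learn y = 2x + 1
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X + 1

w = np.zeros((1, 1))   # learnable weight
b = 0.0                # learnable bias
lr = 0.5               # learning rate (illustrative value)

for _ in range(200):
    pred = X @ w + b                  # forward pass
    loss = np.mean((pred - y) ** 2)   # mean squared error loss
    grad = 2 * (pred - y) / len(X)    # gradient of the loss w.r.t. predictions
    w -= lr * (X.T @ grad)            # backpropagate to the weight...
    b -= lr * grad.sum()              # ...and to the bias, then step downhill
```

After training, `w` and `b` have converged close to the true values 2 and 1; real networks repeat exactly this loop, only with many layers and the chain rule carrying the gradient backwards through each of them.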

A deep enough fully connected Neural Network can, in theory, approximate any function, no matter its complexity. Unfortunately there is a tradeoff: increasing the number of neurons and layers increases the number of learnable weights, which leads to overfitting (the model learns the training data perfectly but performs poorly on new data) and raises the computational cost. Different techniques have been developed to increase the efficiency of neural networks: different architectures (how the neurons are arranged in the net), different activation functions, etc.

Convolutional Neural Networks (CNNs) are a specific family of ANNs inspired by the visual cortex that reduce the number of weights needed to pick up patterns by using partially connected neurons and shared weights. Convolutional layers are composed of a series of "filters" with shared weights, and each filter learns a specific pattern. Each filter is applied to the whole input (the convolution operation), so a small set of weights suffices to detect a specific pattern over the whole input space; the result of applying a filter to the input is a feature map. Each neuron in a feature map is connected only to a small subset of neurons in the previous layer. An input image gets smaller and smaller along the network, but it also gets deeper and deeper (more feature maps).
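The convolution operation itself is simple to sketch: a small kernel of shared weights slides over the image, and each position produces one value of the feature map. The vertical-edge kernel below is a classic illustrative example, not a learned filter:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a small filter (kernel) over the image; each output pixel is
    the weighted sum of the pixels currently under the filter. The same
    kernel weights are reused at every position (weight sharing)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1   # output shrinks by the kernel size
    ow = image.shape[1] - kw + 1
    feature_map = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# A 6x6 image: dark left half, bright right half (one vertical edge)
image = np.zeros((6, 6))
image[:, 3:] = 1.0
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])   # responds to vertical edges
fmap = conv2d(image, kernel)         # 4x4 feature map
```

The nine kernel weights detect the edge wherever it appears in the image; a fully connected layer would need a separate weight per pixel per output neuron to do the same. Note also that each output value depends only on a 3x3 patch of the input: its receptive field.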

Neurons in the first convolutional layer are not connected to every single pixel in the input image (like in a fully connected NN layer), but only to pixels in their receptive fields. In turn, each neuron in the second convolutional layer is connected only to neurons located within a small rectangle in the first layer. This architecture allows the network to focus on small low-level features in the first hidden layer, then assemble them into larger higher-level features in the next hidden layer, and so on.

In CRIMSON we are experimenting with both Convolutional Neural Networks and Recurrent Neural Networks (another family of ANNs, usually used for sequential data) in order to remove the non-resonant background from CARS spectra.