Neural Network

Neural Network

A Neural Network emulates the structure of a human brain as a network of neurons that are interconnected to each other. Each neuron is technically equivalent to a logistic regression unit.

Neural network

In this setting, neurons are organized in multiple layers where every neuron at layer i connects to every neuron at layer i+1 and nothing else. The tuning parameters in a neural network include the number of hidden layers (commonly set to 1), the number of neurons in each layer (which should be same for all hidden layers and usually at 1 to 3 times the input variables), and the learning rate. On the other hand, the number of neurons at the output layer depends on how many binary outputs need to be learned. In a classification problem, this is typically the number of possible values at the output category.

The learning happens via an iterative feedback mechanism where the error of training data output is used to adjust the corresponding weights of input. This adjustment propagates to previous layers and the learning algorithm is known as "backpropagation." Here is an example:

> library(neuralnet)
    > nnet_iristrain <-iristrain
    > #Binarize the categorical output
    > nnet_iristrain <- cbind(nnet_iristrain, iristrain$Species == 'setosa')
    > nnet_iristrain <- cbind(nnet_iristrain, iristrain$Species == 'versicolor')
    > nnet_iristrain <- cbind(nnet_iristrain, iristrain$Species == 'virginica')
    > names(nnet_iristrain)[6] <- 'setosa'
    > names(nnet_iristrain)[7] <- 'versicolor'
    > names(nnet_iristrain)[8] <- 'virginica'
    > nn <- neuralnet(setosa+versicolor+virginica ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data=nnet_iristrain, hidden=c(3))
    > plot(nn)
    > mypredict <- compute(nn, iristest[-5])$net.result
    > # Consolidate multiple binary output back to categorical output
    > maxidx <- function(arr) {
    +     return(which(arr == max(arr)))
    + }
    > idx <- apply(mypredict, c(1), maxidx)
    > prediction <- c('setosa', 'versicolor', 'virginica')[idx]
    > table(prediction, iristest$Species)

    prediction   setosa versicolor virginica
    setosa         10          0         0
    versicolor      0         10         3
    virginica       0          0         7

Neural plot

Neural networks are very good at learning non-linear functions. They can even learn multiple outputs simultaneously, though the training time is relatively long, which makes the network susceptible to local minimum traps. This can be mitigated by doing multiple rounds and picking the best-learned model.