Understanding Fully Connected Neural Networks
Fully connected neural networks are a type of artificial neural network that consists of multiple layers of interconnected neurons. In machine learning lingo, we also call them dense neural networks.
This type of neural network has proven useful in a wide variety of applications, including image and text classification, speech recognition, natural language processing and more.
Structure of Fully Connected Neural Networks
In a dense neural network, each neuron in one layer is connected to every neuron in the following layer. However, neurons within the same layer are not connected to each other.
The first layer is the input layer, where the number of neurons must match the shape of our input data. The last layer is the output layer, whose shape depends on the purpose of the neural network.
The layers in between are hidden layers, which are largely responsible for mapping the input data to the output.
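To make this structure more concrete, here is a minimal NumPy sketch of a forward pass through a small dense network. The layer sizes (4 inputs, 8 hidden neurons, 2 outputs) and the use of ReLU in the hidden layer are arbitrary choices for illustration, not a prescription.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary example sizes: 4 input features, one hidden layer of 8 neurons, 2 outputs.
n_in, n_hidden, n_out = 4, 8, 2

# Every neuron in one layer connects to every neuron in the next,
# so each set of connections is simply a weight matrix (plus a bias vector).
W1, b1 = rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_out)), np.zeros(n_out)

def forward(x):
    """Forward pass: input layer -> hidden layer -> output layer."""
    h = np.maximum(0, x @ W1 + b1)   # hidden layer with ReLU activation
    return h @ W2 + b2               # output layer (raw scores)

x = rng.normal(size=n_in)            # one example with 4 input features
print(forward(x))                    # 2 output values
```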
In order for a neural network to learn, each connection between neurons holds a weight value. These values are continually adjusted during the training process to minimize the difference between the predicted value and the actual output.
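As a rough illustration of how a weight value changes during training, here is a toy gradient-descent update for a single linear neuron with a squared-error loss. The data and learning rate are made up for the example.

```python
import numpy as np

# Toy data: one input feature and its target output (made up for illustration).
x, y_true = np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 6.0])

w = 0.0                # a single weight, initially wrong
learning_rate = 0.1

for step in range(20):
    y_pred = w * x                     # prediction with the current weight
    error = y_pred - y_true            # difference between prediction and target
    grad = 2 * np.mean(error * x)      # gradient of the mean squared error w.r.t. w
    w -= learning_rate * grad          # nudge the weight to reduce the error

print(w)  # approaches 2.0, the weight that minimizes the loss on this data
```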
When data passes through a neuron, it goes through an activation function. This function determines the output of the neuron by mapping its input, keeping the values in a controlled range so they don't grow out of hand and become unusable.
The most common activation functions in use today are the sigmoid, hyperbolic tangent and rectified linear unit (ReLU) functions.
The sigmoid function is a smooth curve that maps any input to a value between 0 and 1.
The hyperbolic tangent function is similar to the sigmoid function, the only difference being that it maps input values to a range between -1 and 1.
The ReLU function is a piecewise linear function that returns the input if it is positive and 0 for all negative values.
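All three functions are easy to write down directly; a minimal NumPy version could look like this.

```python
import numpy as np

def sigmoid(x):
    """Maps any input to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Like the sigmoid, but maps inputs to values between -1 and 1."""
    return np.tanh(x)

def relu(x):
    """Returns the input if it is positive and 0 for all negative values."""
    return np.maximum(0, x)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))
```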
Limitations
Fully connected neural networks can suffer from the problem of overfitting. In other words, the model becomes too specialized to the training data, which in turn causes it to perform poorly on new, unseen data.
In order to mitigate this problem, we can use regularization techniques such as L1 and L2 regularization. These techniques add a penalty term to the loss function, which encourages the network to learn smaller, simpler weights.
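As a sketch, assuming a mean-squared-error base loss, the penalty terms could be added like this; the regularization strengths are arbitrary example values.

```python
import numpy as np

def regularized_loss(y_pred, y_true, weights, l1=0.0, l2=0.01):
    """Base loss plus L1/L2 penalty terms on the weights.

    l1 and l2 are arbitrary example strengths; larger values push
    the network toward smaller, simpler weights.
    """
    mse = np.mean((y_pred - y_true) ** 2)                       # base loss
    l1_penalty = l1 * sum(np.sum(np.abs(W)) for W in weights)   # L1 term
    l2_penalty = l2 * sum(np.sum(W ** 2) for W in weights)      # L2 term
    return mse + l1_penalty + l2_penalty
```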
Another method we can use to mitigate this problem is the dropout technique. As its name suggests, it randomly drops out some of the neurons in the network during training.
This forces the model to spread its learned weights more evenly throughout its whole structure, rather than relying on a few specific neurons.
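Here is a minimal sketch of dropout, assuming the common "inverted dropout" formulation in which the surviving activations are rescaled at training time so nothing needs to change at test time.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, drop_prob=0.5, training=True):
    """Randomly zero out neurons during training (inverted dropout).

    drop_prob is the fraction of neurons dropped; the survivors are
    scaled up so the expected activation stays the same. At test time
    the layer is left untouched.
    """
    if not training or drop_prob == 0.0:
        return activations
    keep_prob = 1.0 - drop_prob
    mask = rng.random(activations.shape) < keep_prob   # which neurons survive
    return activations * mask / keep_prob

h = np.ones(8)        # example hidden-layer activations
print(dropout(h))     # roughly half are zeroed, the rest scaled up to 2.0
```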
Conclusion
To conclude, fully connected neural networks are a powerful tool in machine learning that we can find in a wide range of applications. Despite their shortcomings, they remain one of the essential building blocks of state-of-the-art systems today.
I hope this article helped you gain a better understanding of neural networks and perhaps even inspired you to learn more.