Understanding Loss Functions in Machine Learning
Loss functions in machine learning are mathematical functions that measure the difference between what a model predicts and what we want it to predict.
By minimizing this difference during training, we improve the model's accuracy.
A loss function's role also depends on the type of machine learning we're dealing with.
In supervised learning, we use the loss function to choose a particular predictor from a predefined family of predictors.
In unsupervised learning, by contrast, the loss function characterizes what the data looks like. In other words, it is our data model and is itself our primary goal.
Types of loss functions
There are various types of loss functions in machine learning, and which one we choose depends on the type of problem we want to solve.
Here are a few of the most common ones used in machine learning today.
Mean squared error
Mean squared error (MSE) is one of the most common loss functions for regression problems. It measures the average squared difference between the values your model predicts and the actual ground-truth values.
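To make this concrete, here is a minimal NumPy sketch; the target and prediction values are made up purely for illustration:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between predictions and targets."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

# Toy regression targets and predictions (made-up numbers)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(mean_squared_error(y_true, y_pred))  # 0.375
```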
Cross-entropy loss functions
We use cross-entropy loss in classification problems. It measures the dissimilarity between the model's predicted probability distribution and the actual probability distribution.
In other words, our model assigns a probability to each class, and these probabilities sum to 1. Cross-entropy loss pushes the model toward confident predictions in which the large majority of that probability mass is assigned to the correct class.
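As a rough illustration, here is a small NumPy sketch of cross-entropy for a single example, assuming the model already outputs probabilities that sum to 1 (the numbers are made up):

```python
import numpy as np

def cross_entropy(p_true, p_pred, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted class probabilities."""
    p_pred = np.clip(p_pred, eps, 1.0)  # avoid log(0)
    return -np.sum(p_true * np.log(p_pred))

# One-hot target: the true class is class 1
p_true = np.array([0.0, 1.0, 0.0])

# A confident, correct prediction gives a small loss...
print(cross_entropy(p_true, np.array([0.05, 0.90, 0.05])))  # ~0.105
# ...while an uncertain prediction gives a larger one.
print(cross_entropy(p_true, np.array([0.40, 0.35, 0.25])))  # ~1.050
```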
Hinge loss functions
We also use the hinge loss function for classification problems. Its goal is to maximize the margin between the decision boundary and the data points, which is why it is common in support vector machine models.
In other words, hinge loss penalizes the model when it predicts the wrong class, with the penalty growing with the magnitude of the error. When the model's prediction is on the correct side of the boundary with a margin greater than 1, the loss is 0, which indicates that the model is confidently correct.
However, if the model predicts values closer to 0, it is uncertain, so the penalty increases linearly as the predicted values approach 0, and keeps growing for predictions on the wrong side of the boundary.
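Here is a minimal sketch of the binary hinge loss, assuming labels in {-1, +1} and a raw (unsquashed) model score; the scores below are made up:

```python
import numpy as np

def hinge_loss(y_true, score):
    """Binary hinge loss: zero once y_true * score reaches the margin of 1."""
    return np.maximum(0.0, 1.0 - y_true * score)

# Labels in {-1, +1} and raw model scores
print(hinge_loss(+1,  2.3))   # 0.0 -> correct with a comfortable margin
print(hinge_loss(+1,  0.4))   # 0.6 -> correct but inside the margin
print(hinge_loss(+1, -1.5))   # 2.5 -> wrong side: penalty grows linearly
```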
Kullback-Leibler divergence
Kullback-Leibler divergence, also known as relative entropy, measures the difference between two probability distributions. We usually find it in generative models and unsupervised learning.
In other words, it measures how much information is lost when we use the estimated distribution to approximate the true distribution.
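As a sketch, the KL divergence between two discrete distributions can be computed like this; the distributions are made up, and note that the measure is not symmetric:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q): information lost when q is used to approximate p.
    Assumes p has no zero entries (zero entries would contribute 0 by convention)."""
    p = np.asarray(p, dtype=float)
    q = np.clip(np.asarray(q, dtype=float), eps, 1.0)
    return np.sum(p * np.log(p / q))

# Made-up true distribution p and approximation q over three outcomes
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
print(kl_divergence(p, q))  # ~0.085
print(kl_divergence(q, p))  # ~0.092 -> not symmetric
```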
Conclusion
To conclude, loss functions are essential in machine learning because they give us a measure of how well our models are doing, and there is a wide range of functions to choose from.
There are more than the ones we mentioned above, and which one we choose depends on the problem we're trying to solve.
I hope this article helped you gain a better understanding of loss functions and their role in machine learning.