# How to Use Lasso Regression: Example With Python

Lasso regression is a regularization technique we can use to improve linear predictive models.

In this article, we’ll look at its benefits and at when to use it over other techniques.

## What is Lasso Regression?

Lasso, or **Least Absolute Shrinkage and Selection Operator**, is a linear regression model with an L1 regularization term.

To clarify, this term imposes a penalty on the absolute values of the coefficients, leading to a sparse model.

In other words, it tends to reduce the coefficients of irrelevant features to zero, effectively performing **feature selection**.
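To make the sparsity effect concrete, here is a minimal sketch on synthetic data. The data, the true coefficients, and the `alpha=0.2` value are assumptions chosen for illustration and are not part of the housing example later in the article.

```python
import numpy as np
from sklearn.linear_model import Lasso

# synthetic data (an assumption for illustration): 10 features, only the first 3 affect y
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0, 0, 0])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# scikit-learn's Lasso minimizes (1 / (2 * n_samples)) * ||y - Xw||^2 + alpha * ||w||_1
model = Lasso(alpha=0.2)
model.fit(X, y)

print(np.round(model.coef_, 2))
print('non-zero coefficients:', np.count_nonzero(model.coef_))
```

With this setup, the coefficients of the seven irrelevant features should come out as exactly zero, while the three informative ones stay non-zero but are pulled toward zero.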

## When to Use Lasso Regression?

Choose lasso regression when:

- Your dataset has a large number of features.
- You suspect some features may be irrelevant.
- You want to prevent overfitting.
- You desire a model with better interpretability.

## Pros and Cons

### Pros:

- Performs feature selection, leading to a simpler and more interpretable model.
- Reduces overfitting by shrinking coefficients.
- Helps handle multicollinearity by tending to keep only one feature from a group of correlated features, although which one it keeps can be arbitrary.

### Cons:

- Can be sensitive to outliers.
- May underestimate the coefficients of important features.
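The underestimation of important coefficients can be seen in a small sketch that compares lasso with ordinary least squares on the same data. The synthetic data and the `alpha=0.5` value are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# synthetic data (an assumption): all 3 features are genuinely important
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
true_coef = np.array([4.0, -3.0, 2.0])
y = X @ true_coef + rng.normal(scale=0.5, size=300)

# fit an unpenalized model and a lasso model on the same data
ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print('true :', true_coef)
print('OLS  :', np.round(ols.coef_, 2))
print('lasso:', np.round(lasso.coef_, 2))
```

In this sketch the lasso coefficients are smaller in magnitude than both the true values and the least squares estimates, which is the price paid for the L1 penalty.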

## Normalizing Data

Before applying lasso regression, it’s crucial to normalize your data.

This ensures that all features are on the same scale, so that the regularization term penalizes all coefficients uniformly.
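The following sketch illustrates why scale matters. It uses two synthetic features that contribute equally to the target but are measured in very different units. The data and the `alpha=0.5` value are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)               # feature on a unit scale
x2 = rng.normal(scale=0.01, size=n)   # feature measured in much smaller units
X = np.column_stack([x1, x2])
# both features contribute equally to y once their scale is accounted for
y = 1.0 * x1 + 100.0 * x2 + rng.normal(scale=0.1, size=n)

# the same penalty strength, with and without standardization
raw = Lasso(alpha=0.5).fit(X, y)
scaled = make_pipeline(StandardScaler(), Lasso(alpha=0.5)).fit(X, y)

print('without scaling:', raw.coef_)
print('with scaling   :', scaled.named_steps['lasso'].coef_)
```

Without scaling, the small-unit feature needs a large coefficient, so the penalty removes it entirely. After standardization, both features are kept and shrunk by a similar amount.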

## Example With Python

The following code snippet shows how to apply lasso regression using the scikit-learn library in Python. We’ll demonstrate it on a housing dataset, which we’ll download from Kaggle.

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from kaggle.api.kaggle_api_extended import KaggleApi
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

# authenticate the API connection with Kaggle
api = KaggleApi()
api.authenticate()

# download the housing dataset from https://www.kaggle.com/datasets/yasserh/housing-prices-dataset
api.dataset_download_file(
    'yasserh/housing-prices-dataset',
    file_name='Housing.csv',
    path='datasets'
)

# import the dataset and remove rows that have missing values, if there are any
# dropna() returns a new DataFrame, so the result has to be assigned back
df = pd.read_csv('datasets/Housing.csv')
df = df.dropna()
print(df.head())

# split the dataset into independent (feature) and dependent (target) values
# .copy() gives us a separate DataFrame, so adding columns doesn't trigger pandas' chained-assignment warning
independent_df = df.iloc[:, 1:5].copy()
bool_categories = ['mainroad', 'guestroom', 'basement', 'prefarea', 'hotwaterheating', 'airconditioning']
# encode the boolean columns as integer category codes
for cat in bool_categories:
    independent_df[cat] = df[cat].astype('category').cat.codes
print(independent_df)

dependent_df = df[['price']].copy()
# log-transformed price, kept as an optional alternative target; the model below is trained on the raw price
dependent_df['price_log'] = np.log1p(dependent_df['price'])
print(dependent_df)

X = independent_df
y = dependent_df['price']

# split the dataset into training and testing partitions
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# standardize the features; the scaler is fitted on the training data only, then reused on the test data
pipeline = Pipeline([
    ('std_scalar', StandardScaler())
])
X_train = pipeline.fit_transform(X_train)
X_test = pipeline.transform(X_test)
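
# create the model and train it
# alpha sets the strength of the L1 penalty; positive=True restricts all coefficients to be non-negative
model = Lasso(
    alpha=0.1,
    precompute=True,
    warm_start=True,
    positive=True,
    selection='random',
    random_state=42
)
model.fit(X_train, y_train)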

# make predictions on the test data
y_pred = model.predict(X_test)

# evaluate the results using MSE and R2
# lower MSE indicates better performance
# higher R2 indicates better performance (1 is a perfect fit; it can be negative for a very poor model)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f'Mean Squared Error: {mse:.2f}')
print(f'R2 Score: {r2:.2f}')
```
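The example above uses a fixed `alpha=0.1`. One common way to choose this value is cross-validation, which scikit-learn provides through `LassoCV`. The sketch below runs on synthetic stand-in data (an assumption, so it works without Kaggle credentials) and is not a tuned setup for the housing dataset.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-in data (an assumption): 10 features, only the first 3 are informative
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 10))
y = X @ np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0, 0, 0]) + rng.normal(scale=0.5, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# LassoCV tries a grid of alpha values and keeps the one with the best cross-validated error
search = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=42))
search.fit(X_train, y_train)

lasso_cv = search.named_steps['lassocv']
r2 = r2_score(y_test, search.predict(X_test))
print(f'chosen alpha: {lasso_cv.alpha_:.4f}')
print(f'R2 Score: {r2:.2f}')
```

Putting the scaler and the model in one pipeline keeps the scaling step inside the same `fit` and `predict` calls, so the test data is always transformed with statistics learned from the training data.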

## Comparing Lasso and Ridge Regression

Lasso and ridge regression are both regularization techniques that improve linear regression models. While ridge regression uses L2 regularization to shrink coefficients toward zero, lasso regression employs L1 regularization, which can both shrink coefficients and set them exactly to zero.

The choice between the two depends on your specific problem and whether feature selection is an essential requirement.
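The difference between the two penalties shows up when we count coefficients that are exactly zero. The sketch below uses synthetic data, and the `alpha` values are assumptions for illustration. Note that `alpha` is scaled differently in `Ridge` and `Lasso`, so the two values are not directly comparable.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# synthetic data (an assumption): 10 features, only the first 3 affect y
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 10))
y = X @ np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0, 0, 0]) + rng.normal(scale=0.5, size=200)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=0.2).fit(X, y)

# ridge shrinks coefficients but leaves them non-zero; lasso can remove them entirely
print('ridge coefficients set to zero:', int(np.sum(ridge.coef_ == 0)))
print('lasso coefficients set to zero:', int(np.sum(lasso.coef_ == 0)))
```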

## Conclusion

In conclusion, lasso regression is a powerful technique that helps improve model performance by addressing overfitting and performing feature selection.

By understanding its advantages and limitations, you can decide when to use this method in your machine learning projects.