Building Machine Learning Systems with Python

Visualizing the Lasso path

Using scikit-learn, we can easily visualize what happens as the value of the regularization parameter (alpha) changes. We will again use the Boston data, but now with the Lasso regression object:

las = Lasso()
alphas = np.logspace(-5, 2, 1000)
alphas, coefs, _ = las.path(x, y, alphas=alphas)

For each value in alphas, the path method on the Lasso object returns the coefficients that solve the Lasso problem with that parameter value. Because the result changes smoothly with alpha, this can be computed very efficiently.
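As a quick sanity check of the shapes involved, the path can be computed on synthetic data. Here `make_regression` stands in for the Boston data (recent scikit-learn releases removed that dataset); the synthetic setup and its parameters are illustrative assumptions, not the book's data:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import lasso_path

# Synthetic stand-in for the Boston data: 13 features, like Boston
x, y = make_regression(n_samples=100, n_features=13,
                       noise=10.0, random_state=0)

alphas = np.logspace(-5, 2, 1000)
alphas_out, coefs, _ = lasso_path(x, y, alphas=alphas)

# coefs holds one column of coefficients per alpha value
print(coefs.shape)  # (13, 1000)
```

The module-level `lasso_path` function is equivalent to calling `path` on a `Lasso` object; both return the alphas used, the coefficient matrix, and the dual gaps.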

A typical way to visualize this path is to plot the value of the coefficients as alpha decreases. You can do so as follows:

fig, ax = plt.subplots()
ax.plot(alphas, coefs.T)
# Set log scale
ax.set_xscale('log')
# Make alpha decrease from left to right
ax.set_xlim(alphas.max(), alphas.min())

This results in the following plot (we left out the trivial code that adds the axis labels and the title):

In this plot, the x-axis shows decreasing amounts of regularization from left to right (alpha decreases). Each line shows how one coefficient varies as alpha changes. The plot shows that under very strong regularization (left side, very high alpha), the best solution is to set all coefficients exactly to zero. As the regularization weakens, the coefficients shoot up one by one and then stabilize. At some point they all plateau, as we are by then probably close to the unpenalized solution.
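This sparsity behavior can also be checked numerically by counting the nonzero coefficients along the path. The sketch below again uses synthetic stand-in data (`make_regression` and its parameters are assumptions, not the book's Boston setup):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import lasso_path

# Synthetic stand-in data with 13 features
x, y = make_regression(n_samples=100, n_features=13,
                       noise=10.0, random_state=0)

alphas = np.logspace(-5, 2, 1000)
alphas_out, coefs, _ = lasso_path(x, y, alphas=alphas)

# lasso_path returns alphas in decreasing order, so column 0 holds
# the most heavily regularized solution and the last column the
# least regularized one.
n_nonzero = (coefs != 0).sum(axis=0)

# Fewer active coefficients under strong regularization, more as
# alpha shrinks toward the unpenalized solution.
print(n_nonzero[0], n_nonzero[-1])
```

The count of active coefficients grows as alpha shrinks, which is exactly the one-by-one entry of coefficients visible in the plot.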