Machine Learning Solutions

Testing the baseline model

In this section, we implement our testing approach so that we can evaluate the accuracy of the baseline model. We first generate the output predictions and then test them. We will carry out the following steps:

  1. Generating and interpreting the output
  2. Generating the accuracy score
  3. Visualizing the output

Generating and interpreting the output

To generate the predictions, we use the treeinterpreter library. We predict the price for each record in our testing dataset using the following code:

Figure 2.26: Code snippet for generating the prediction

Here, prediction is an array whose elements are the predicted adj close prices, one for each record of the testing dataset. Next, we compare this predicted output with the actual adj close prices of the testing dataset, which tells us how accurately our first model predicts the adj close price. To evaluate the model further, we generate the accuracy score.

Generating the accuracy score

In this section, we generate the accuracy score using the equations provided in the default testing matrix section. The code for this is as follows:

Figure 2.27: Code snippet for generating the score for the test dataset
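
The equations referred to above are not part of this section. As a minimal sketch, assuming the score is the coefficient of determination R² (the value returned by the score method of scikit-learn regressors), it can be computed as 1 - u/v, where u is the residual sum of squares and v is the total sum of squares. The function name r_squared and the sample values are hypothetical and serve only as an illustration.

```python
def r_squared(actual, predicted):
    """Return the coefficient of determination, R^2 = 1 - u / v."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    mean_actual = sum(actual) / len(actual)
    # u: residual sum of squares (errors of the model).
    u = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # v: total sum of squares (errors of always predicting the mean).
    v = sum((a - mean_actual) ** 2 for a in actual)
    if v == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - u / v


# Made-up values for illustration only.
actual = [10.0, 12.0, 14.0, 16.0]
close_predictions = [11.0, 12.0, 13.0, 17.0]
poor_predictions = [1.0, 2.0, 1.0, 2.0]

print(r_squared(actual, close_predictions))  # close to 1: good fit
print(r_squared(actual, poor_predictions))   # negative: worse than the mean
```

A score of 1 means a perfect fit, 0 means the model is no better than always predicting the mean of the actual values, and a negative score means it is worse than that.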

As the output of the preceding code snippet shows, our model is not doing too well. At this point, we don't know what mistakes we have made or what went wrong. This situation is common when you are building an ML model. Visualization techniques can help us understand the problem better.

Visualizing the output

In this section, we use a graph to identify the kind of error we have made so that we can fix it in the next iteration. We plot a graph in which the y-axis represents the adj close prices and the x-axis represents the dates. Plotting both the actual prices and the predicted prices on the same graph gives us a quick idea of how our algorithm is performing. We use the following code snippet to generate the graph:

Figure 2.28: Code snippet for generating graph for predicted prices vs actual prices.
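
The code in Figure 2.28 is not reproduced in the text, so the following is a minimal sketch of such a plot using matplotlib, with dates on the x-axis and adj close prices on the y-axis. The function name plot_actual_vs_predicted and the sample values are hypothetical. The predicted series is drawn first so that, with matplotlib's default color cycle, it appears in blue and the actual series in orange.

```python
from datetime import date, timedelta

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt


def plot_actual_vs_predicted(dates, actual, predicted):
    """Plot actual and predicted prices against dates on one set of axes."""
    fig, ax = plt.subplots(figsize=(10, 5))
    ax.plot(dates, predicted, label="Predicted adj close price")
    ax.plot(dates, actual, label="Actual adj close price")
    ax.set_xlabel("Date")
    ax.set_ylabel("Adj close price")
    ax.set_title("Predicted vs actual adj close price")
    ax.legend()
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    return fig, ax


# Made-up values for illustration only.
dates = [date(2017, 1, 2) + timedelta(days=i) for i in range(6)]
actual = [100.0, 101.5, 102.0, 101.0, 103.5, 104.0]
predicted = [60.0, 75.0, 58.0, 80.0, 62.0, 77.0]

fig, ax = plot_actual_vs_predicted(dates, actual, predicted)
```

Calling fig.savefig with a file name writes the graph to disk. In an interactive session, plt.show() displays it instead.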

As you can see from the preceding graph, the single line at the top (orange) represents the actual prices, and the messy spikes below it (blue) represent the predicted prices. From this plot, we can conclude that our model does not predict the prices properly: the actual prices and the predicted prices are not aligned with each other. We need to fix this issue. There are some techniques that we can try, such as alignment, smoothing, and trying a different algorithm. So, let's cover the problems of this approach in the next section.

Note

You can access the entire code on this topic from the GitHub link at https://github.com/jalajthanaki/stock_price_prediction/blob/master/Stock_Price_Prediction.ipynb.