Testing the baseline model
In this section, we implement our testing approach so that we can evaluate the model's accuracy. We first generate the output predictions and then test them. We will follow these steps:
- Generating and interpreting the output
- Generating the accuracy score
- Visualizing the output
Generating and interpreting the output
To generate the prediction, we are using the treeinterpreter library. We predict the price value for each record of our testing dataset using the following code:
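A minimal sketch of this prediction step is given here; it is not the original notebook code. It assumes a scikit-learn RandomForestRegressor as the baseline model (treeinterpreter works with scikit-learn tree and forest models), and it uses synthetic stand-in data. The names rf, X_train, X_test, y_train, and y_test are hypothetical. If treeinterpreter is not installed, the sketch falls back to the plain rf.predict call, which yields the predictions without the bias and contribution breakdown.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

try:
    from treeinterpreter import treeinterpreter as ti
except ImportError:  # treeinterpreter is treated as optional in this sketch
    ti = None

# Synthetic stand-in data (assumption): three features and a noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 100 + 5 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Keep the row order when splitting, as one would for time-ordered prices.
X_train, X_test = X[:160], X[160:]
y_train, y_test = y[:160], y[160:]

# Hypothetical baseline model; the real hyperparameters are not given here.
rf = RandomForestRegressor(n_estimators=50, random_state=0)
rf.fit(X_train, y_train)

if ti is not None:
    # ti.predict returns the predictions plus a decomposition of each one into
    # a bias term and per-feature contributions.
    prediction, bias, contributions = ti.predict(rf, X_test)
    prediction = np.ravel(prediction)  # flatten to one value per test record
else:
    prediction = rf.predict(X_test)

print(prediction[:5])
```

The bias and contributions values are what make treeinterpreter useful for interpretation: each prediction equals the bias plus the sum of its feature contributions.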
Here, prediction is an array whose elements are the predicted adj close prices for the records of the testing dataset. Now, we will compare this predicted output with the actual adj close prices of the testing dataset. This tells us how accurately our first model predicts the adj close price. To evaluate further, we will generate the accuracy score.
Generating the accuracy score
In this section, we will generate the accuracy score as per the equations provided in the default testing matrix section. The code for this is as follows:
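As a hedged illustration, the sketch below computes two common regression scores with NumPy: the coefficient of determination (R², which is what scikit-learn regressors report from their score method) and the mean squared error. Treating these two as the scores meant here is an assumption, since the referenced equations are not repeated in this section. The function name r2_and_mse and the toy values are hypothetical.

```python
import numpy as np

def r2_and_mse(actual, predicted):
    """Return (R^2, mean squared error) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = actual - predicted
    ss_res = np.sum(residual ** 2)                  # residual sum of squares
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # total sum of squares
    # R^2 is undefined when every actual value is identical (ss_tot == 0).
    r2 = 1.0 - ss_res / ss_tot
    mse = np.mean(residual ** 2)
    return r2, mse

# Toy values chosen only to make the arithmetic easy to check by hand.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
r2, mse = r2_and_mse(actual, predicted)
print(r2, mse)
```

The toy values only verify the arithmetic and say nothing about the model's real performance. An R² close to 1 indicates a good fit, while a value near or below 0 means the model does no better than always predicting the mean of the actual prices.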
As you can see from the preceding code snippet, our model is not doing too well. At this point, we don't know what mistakes we've made or what went wrong. This situation is common when you are building an ML model. We can grasp the problem better using visualization techniques.
Visualizing the output
In this section, we will visualize the output. Using a graph, we will identify the kind of error we have made so that we can fix it in the next iteration. We will plot a graph where the y-axis represents the adj close prices and the x-axis represents the dates. We plot both the actual prices and the predicted prices so that we get a quick idea of how our algorithm is performing. We will use the following code snippet to generate the graph:
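One way such a plot could be produced with matplotlib is sketched below. This is an illustration under stated assumptions and not the original notebook code: the helper plot_actual_vs_predicted is hypothetical, and the dates and both price series are synthetic, so the result does not reproduce the book's figure.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_actual_vs_predicted(dates, actual, predicted):
    """Plot actual and predicted adj close prices against dates."""
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(dates, predicted, color="tab:blue", label="Predicted price")
    ax.plot(dates, actual, color="tab:orange", label="Actual price")
    ax.set_xlabel("Date")
    ax.set_ylabel("Adj close price")
    ax.legend()
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    return fig, ax

# Synthetic stand-in series (assumption): a smooth "actual" curve and a
# noisy, offset "predicted" curve.
dates = np.arange("2016-01-01", "2016-03-01", dtype="datetime64[D]")
rng = np.random.default_rng(0)
actual = 100 + np.linspace(0.0, 5.0, len(dates))
predicted = actual - 20 + rng.normal(scale=3.0, size=len(dates))

fig, ax = plot_actual_vs_predicted(dates, actual, predicted)
# plt.show()  # uncomment when running interactively
```

Plotting both series on the same axes makes a constant offset or excessive noise in the predictions visible at a glance.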
As you can see from the preceding graph, the single line at the top (orange) represents the actual prices, and the messy spikes (blue) below it represent the predicted prices. From this plot, we can conclude that our model does not predict the prices accurately: the actual prices and the predicted prices are not aligned with each other. We need to fix this issue. There are some techniques that we can try, such as alignment, smoothing, and trying a different algorithm. So, let's cover the problems of this approach in the next section.
Note
You can access the entire code on this topic from the GitHub link at https://github.com/jalajthanaki/stock_price_prediction/blob/master/Stock_Price_Prediction.ipynb.