Introduction to R for Quantitative Finance
上QQ阅读APP看书,第一时间看更新

Working with time series data

The native R classes suitable for storing time series data include vector, matrix, data.frame, and ts objects. But the types of data that can be stored in these objects are narrow; furthermore, the methods provided by these representations are limited in scope. Luckily, there exist specialized objects that deal with more general representation of time series data: zoo, xts, or timeSeries objects, available from packages of the same name.

It is not necessary to create time series objects for every time series analysis problem, but more sophisticated analyses require time series objects. You could calculate the mean or variance of time series data represented as a vector in R, but if you want to perform a seasonal decomposition using decompose, you need to have the data stored in a time series object.

In the following examples, we assume you are working with zoo objects because we think it is one of the most widely used packages. Before we can use zoo objects, we need to install and load the zoo package (if you have already installed it, you only need to load it) using the following command:

> install.packages("zoo")
> library("zoo")

In order to familiarize ourselves with the available methods, we create a zoo object called aapl from the daily closing prices of Apple's stock, which are stored in the CSV file aapl.csv. Each line on the sheet contains a date and a closing price separated by a comma. The first line contains the column headings (Date and Close). The date is formatted according to the recommended primary standard notation of ISO 8601 (YYYY-MM-DD). The closing price is adjusted for stock splits, dividends, and related changes.

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

We load the data from our current working directory using the following command:

> aapl<-read.zoo("aapl.csv",+ sep=",", header = TRUE, format = "%Y-%m-%d")

To get a first impression of the data, we plot the stock price chart and specify a title for the overall plot (using the main argument) and labels for the x and y axis (using xlab and ylab respectively).

> plot(aapl, main = "APPLE Closing Prices on NASDAQ",+ ylab = "Price (USD)", xlab = "Date")

We can extract the first or last part of the time series using the following commands:

> head(aapl)
2000-01-03 2000-01-04 2000-01-05 2000-01-06 2000-01-07 2000-01-10
 27.58 25.25 25.62 23.40 24.51 24.08
> tail(aapl)
2013-04-17 2013-04-18 2013-04-19 2013-04-22 2013-04-23 2013-04-24
 402.80 392.05 390.53 398.67 406.13 405.46

Apple's all-time high and the day on which it occurred can be found using the following command:

> aapl[which.max(aapl)]
2012-09-19
 694.86

When dealing with time series, one is normally more interested in returns instead of prices. This is because returns are usually stationary. So we will calculate simple returns or continuously compounded returns (in percentage terms).

> ret_simple <- diff(aapl) / lag(aapl, k = -1) * 100
> ret_cont <- diff(log(aapl)) * 100

Summary statistics about simple returns can also be obtained. We use the coredata method here to indicate that we are only interested in the stock prices and not the index (dates).

> summary(coredata(ret_simple))
 Min. 1st Qu. Median Mean 3rd Qu. Max.
-51.86000 -1.32500 0.07901 0.12530 1.55300 13.91000

The biggest single-day loss is -51.86%. The date on which that loss occurred can be obtained using the following command:

> ret_simple[which.min(ret_simple)]
2000-09-29
 -51.85888

A quick search on the Internet reveals that the large movement occurred due to the issuance of a profit warning. To get a better understanding of the relative frequency of daily returns, we can plot the histogram. The number of cells used to group the return data can be specified using the break argument.

> hist(ret_simple, breaks=100, main = "Histogram of Simple Returns",+ xlab="%")

We can restrict our analysis to a subset (a window) of the time series. The highest stock price of Apple in 2013 can be found using the following command lines:

> aapl_2013 <- window(aapl, start = '2013-01-01', end = '2013-12-31')
> aapl_2013[which.max(aapl_2013)]
2013-01-02 
 545.85

The quantiles of the return distribution are of interest from a risk-management perspective. We can, for example, easily determine the 1 day 99% Value-at-Risk using a naive historical approach.

> quantile(ret_simple, probs = 0.01)
 1% 
-7.042678

Hence, the probability that the return is below 7% on any given day is only 1%. But if this day occurs (and it will occur approximately 2.5 times per year), 7% is the minimum amount you will lose.