Term Structure Analysis
In this example, we will use the fredr package to retrieve one of the most popular series from the US Federal Reserve Economic Data (FRED) database: the spread between the yields on 10-year and 2-year Treasury constant maturity bonds.[1] The fredr package also provides access to the basic series documentation. It requires that the user open an account once to obtain an API key. The new cansim package from Mountain Math will be used to retrieve the yields on the 10-year and 2-year benchmark bonds for Canada.
Packages Used
The key package used in this vignette is fredr, which accesses the FRED API. The packages used include:
- dplyr – for processing the tibble prior to conversion to xts
- ggplot2 – part of tidyverse for plotting the results
- cansim – package to retrieve metadata and series from Statistics Canada’s NDM
- broom, tbl2xts – utility packages for working with tibbles
- fredr – the retrieval package to be used.[2]
Retrieving the Data
The first step in the analysis is to go to the FRED site at https://fred.stlouisfed.org to open an account and obtain an API key. This is done only once, and the key can be reused in multiple R scripts. In this case, the key is saved in a CSV file in the folder being used for this example. It must be read from this CSV file on each run to initialize the calls to the API.
#Retrieving from FRED and CANSIM
library(tidyverse)
library(fredr)
library(ggplot2)
library(xts)
library(tbl2xts)
setwd("D:\\OneDrive\\fred_test")
#set up the API key
user_api_key<-read.csv("fred_api_key.csv",stringsAsFactors=FALSE)
fredr_set_key(user_api_key$fredapi)
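The read above depends on the layout of the key file. As a minimal sketch, assuming the file has a single column named fredapi holding one key (the key value and the temporary file below are hypothetical), the round trip looks like this:

```r
# Hypothetical key value written to a temporary file, for illustration only.
key_file <- tempfile(fileext = ".csv")
write.csv(data.frame(fredapi = "0123456789abcdef0123456789abcdef"),
          key_file, row.names = FALSE)

# Same read as in the script: one column named fredapi, one row.
user_api_key <- read.csv(key_file, stringsAsFactors = FALSE)
api_key <- user_api_key$fredapi

# The key should come back as a character string.
is.character(api_key)
nchar(api_key)
```

If a key happened to consist only of digits, read.csv would convert it to a number; passing colClasses="character" would guard against that.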
The next stage is to define the names of the series to be retrieved. The fredr package has search options that facilitate the lookup of series IDs. For simplicity, the lookup in this example was done on the St. Louis Fed website. The code below first retrieves the documentation for the series. Each call to the routine fredr_series returns a tibble describing one series. The routine fredr_series_observations returns the actual data as a tibble. The function tbl_xts from the package tbl2xts is used to convert the tibble to the xts time series construct.
us_series<-"t10y2y"
us_series_info<-fredr_series(us_series)
print(as.data.frame(us_series_info))
starting_point<-"2004-01-01"
us_data1<-fredr_series_observations(us_series,observation_start=as.Date(starting_point))
#start building up an xts data frame
us_xts1<-tbl_xts(us_data1)
colnames(us_xts1)<-c("us10y2y")
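To see what the conversion amounts to, the sketch below builds a small synthetic table shaped like the observations tibble and indexes its numeric column by date. The values are made up, the column names date, series_id and value are an assumption about the fredr output, and xts() is called directly in place of tbl_xts. It illustrates the idea only and does not describe how tbl2xts works internally.

```r
library(xts)

# Synthetic stand-in for the table returned by fredr_series_observations();
# values and column names are assumptions for illustration.
us_data_demo <- data.frame(
  date = as.Date(c("2004-01-02", "2004-01-05", "2004-01-06")),
  series_id = "T10Y2Y",
  value = c(2.44, 2.45, 2.43),
  stringsAsFactors = FALSE
)

# Keep only the numeric column and index it by the date column.
us_xts_demo <- xts(us_data_demo$value, order.by = us_data_demo$date)
colnames(us_xts_demo) <- "us10y2y"
print(us_xts_demo)
```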
The next stage is to use the cansim package to retrieve two interest rate series and calculate the difference. The metadata for both series is retrieved to document the series in the log file. A Date column is constructed using a mutate command because the date, as retrieved from NDM, is not recognized as a regular date by R.
#now get Canadian data-we start in xts to get dates straight
cdn_irate_vbls<-c("v39055","v39051")
library(cansim)
meta1<-get_cansim_vector_info(cdn_irate_vbls)
print(data.frame(meta1))
cdn_irate_xts <- get_cansim_vector(cdn_irate_vbls,start_time=as.Date(starting_point)) %>%
mutate(Date=as.Date(REF_DATE)) %>%
tbl_xts(spread_by="VECTOR",cols_to_xts="VALUE")
tail(cdn_irate_xts)
#need to clean out the zero observations
cdn_irate_xts[rowSums(cdn_irate_xts)==0,]<-NA
#otherwise, we get an area plot
#calculate differences
cdn_yield<-cdn_irate_xts$v39055 - cdn_irate_xts$v39051
colnames(cdn_yield)<-c("cdn10y2y")
In the tbl_xts invocation, the spread_by argument identifies the variable that contains the series names. A number of data edits are required. Any zeros should be replaced with NA to avoid problems in the plot, because zero is otherwise treated as a legitimate data point. The resulting series is given the label cdn10y2y to match the renaming of the US series above.
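The zero-cleaning and differencing steps can be checked on a small synthetic example. This is a sketch under stated assumptions: the rates are made up, and a row of zeros is assumed to stand for a day without published rates. The which() wrapper is an addition not in the script above; it keeps the row selection free of NA values.

```r
library(xts)

# Made-up rates for three days; the middle day is all zeros.
dates <- as.Date(c("2019-01-02", "2019-01-03", "2019-01-04"))
rates_demo <- xts(
  cbind(v39055 = c(1.90, 0, 1.95), v39051 = c(1.85, 0, 1.80)),
  order.by = dates
)

# Flag all-zero rows as missing so they are not plotted as real values.
zero_rows <- which(rowSums(rates_demo) == 0)
rates_demo[zero_rows, ] <- NA

# 10-year minus 2-year; NA rows stay NA in the difference.
spread_demo <- rates_demo$v39055 - rates_demo$v39051
colnames(spread_demo) <- "cdn10y2y"
print(spread_demo)
```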
The next stage removes the NAs, particularly the trailing ones, so that a legitimate last date can be calculated for use in graph labelling. The US and Canadian series are combined with merge, and the merged xts object is then converted to a long tibble for plotting using the tidy procedure from the broom package.
#calculate last date
#squeeze out nas
cdn_spread<-na.omit(cdn_yield$cdn10y2y)
last_date<-as.Date(time(tail(cdn_spread,1)))
print(last_date)
us_cdn_xts<-merge(us_xts1,cdn_yield)
#use broom and tidy to make a plot tibble
library(broom)
plot_data<-tidy(us_cdn_xts)
print(plot_data)
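The effect of na.omit on the last date, and of merge on series with different dates, can be seen in a small synthetic case. The values and dates below are made up for illustration, and the objects carry a _demo suffix to keep them separate from the ones in the script.

```r
library(xts)

# Canadian series with a trailing NA; US series missing the middle date.
cdn_demo <- xts(c(0.05, 0.07, NA),
                order.by = as.Date(c("2019-01-02", "2019-01-03", "2019-01-04")))
colnames(cdn_demo) <- "cdn10y2y"
us_demo <- xts(c(0.16, 0.18),
               order.by = as.Date(c("2019-01-02", "2019-01-04")))
colnames(us_demo) <- "us10y2y"

# Dropping NAs makes the last row the last date that has data.
last_date_demo <- as.Date(time(tail(na.omit(cdn_demo), 1)))
print(last_date_demo)

# merge() on xts objects does an outer join on the date index by default,
# so a date present in only one series gets NA in the other.
merged_demo <- merge(us_demo, cdn_demo)
print(merged_demo)
```

Because the merge keeps every date from both series, the merged object can still contain NAs even though the last date was computed from the cleaned Canadian series.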
The final stage is to prepare the plot using the tibble as input. The theme_economist_white theme from ggthemes is used as a convenience. That package has other interesting themes, and numerous other examples are available around the internet.
library(ggthemes)
term_plot<-ggplot(plot_data,aes(x=index,y=value,colour=series))+
theme_economist_white()+
theme(legend.position="bottom",legend.title=element_blank())+
labs(y="Difference in Percent",x=NULL,
title="Difference between 10-Year and 2-Year Bonds",
subtitle="Canada and US",
caption=paste("FRED: t10y2y","CANSIM:",paste(cdn_irate_vbls,collapse=" - "),"End:",last_date))+
scale_x_date(date_breaks="2 years",date_labels="%Y")+
geom_line()
ggsave("term_plot.png",plot=term_plot)
The caption is noteworthy because the collapse option in paste is used to join the two Canadian series names with " - ". The last date is also inserted dynamically. The scale_x_date function is used because the X data are defined in R date format. If date_labels is not specified, the days will be shown because dates in R always include a day.
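The caption string and the year-only label can be reproduced in base R without drawing the plot. In this minimal sketch the end date is a hypothetical value standing in for the computed last_date.

```r
cdn_irate_vbls <- c("v39055", "v39051")
last_date <- as.Date("2019-01-31")  # hypothetical end date for illustration

# The inner paste() collapses the vector into one string;
# the outer paste() joins its arguments with single spaces.
caption_text <- paste("FRED: t10y2y", "CANSIM:",
                      paste(cdn_irate_vbls, collapse = " - "),
                      "End:", last_date)
print(caption_text)

# date_labels takes a strftime-style format; "%Y" keeps only the year,
# while the default formatting of a Date includes month and day.
year_label <- format(last_date, "%Y")
full_label <- format(last_date)
print(c(year_label, full_label))
```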
The resulting chart is shown below.
[1] https://fredblog.stlouisfed.org/2018/08/whats-up-or-down-with-the-yield-curve/
[2] https://www.rdocumentation.org/packages/fredr/versions/1.0.0