Term Structure Analysis

In this example, we use the fredr package to retrieve one of the most popular series from the US Federal Reserve database (FRED): the spread between the yields on 10-year and 2-year constant maturity bonds.[1] The fredr package also provides access to the basic series documentation. It requires the user to open an account once to obtain an API key. The cansim package from Mountain Math is used to retrieve the 10-year and 2-year benchmark bond yields for Canada.

Packages USED

The key package used in this vignette is fredr, which accesses the FRED API. The packages used include:

dplyr – for processing the tibble prior to conversion to xts
ggplot2 – part of tidyverse for plotting the results
cansim – package to retrieve metadata and series from Statistics Canada’s NDM
broom, tbl2xts – utility packages for working with tibbles
fredr – the retrieval package to be used.[2]

Retrieving the DATA

The first step in the analysis is to go to the FRED site at https://fred.stlouisfed.org to open an account and obtain an API key. This is done only once, and the key can be reused in multiple R scripts. In this case, the key is saved in a CSV file in the folder used for this example. It must be read from this CSV file on each run to initialize the calls to the API.

# Retrieving from FRED and CANSIM
library(tidyverse)
library(fredr)
library(ggplot2)
library(xts)
library(tbl2xts)
setwd("D:\\OneDrive\\fred_test")
# set up the API key
user_api_key <- read.csv("fred_api_key.csv", stringsAsFactors = FALSE)
fredr_set_key(user_api_key$fredapi)
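The key file itself is not shown here. As a minimal sketch, assuming a one-row CSV with a column named fredapi (the name the code above reads) and a made-up placeholder key, the round trip might look like this. The temporary file and the key value are purely illustrative.

```r
# Hypothetical placeholder key; a real key comes from your FRED account.
fake_key <- "abcdefghijklmnopqrstuvwxyz123456"

# Write a one-row CSV with the column name the script expects (fredapi).
key_file <- tempfile(fileext = ".csv")
write.csv(data.frame(fredapi = fake_key), key_file, row.names = FALSE)

# Read it back the same way the script does.
user_api_key <- read.csv(key_file, stringsAsFactors = FALSE)
stopifnot(identical(user_api_key$fredapi, fake_key))

# Alternative (assumption: fredr picks the key up from this variable):
# Sys.setenv(FRED_API_KEY = user_api_key$fredapi)
```

Keeping the key outside the script, whether in a CSV or an environment variable, means the script can be shared without exposing the key.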

The next stage is to define the names of the series to be retrieved. The fredr package has search options which facilitate the lookup of series ids; for simplicity, the lookup for this example was done on the St. Louis web site. The next set of code first retrieves the information about the series. Each call to the routine fredr_series returns a tibble describing one series. The routine fredr_series_observations returns the actual data as a tibble. The function tbl_xts from the package tbl2xts translates the tibble to the xts time series construct.

us_series <- "t10y2y"
us_series_info <- fredr_series(us_series)
print(as.data.frame(us_series_info))
starting_point <- "2004-01-01"
us_data1 <- fredr_series_observations(us_series, observation_start = as.Date(starting_point))
# start building up an xts data frame; name the value column explicitly
# so that only the observations are converted
us_xts1 <- tbl_xts(us_data1, cols_to_xts = "value")
colnames(us_xts1) <- c("us10y2y")
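The series id above was looked up on the web site. The search route mentioned earlier might look like the sketch below. The helper find_series is hypothetical, the search text is just an example, and the column names id, title and frequency are what I understand the FRED search to return; check names() on the result before relying on them. The call only runs when a key is available in the FRED_API_KEY environment variable.

```r
# Hypothetical helper: wraps fredr's text search and keeps a few columns.
find_series <- function(text, n = 5) {
  hits <- fredr::fredr_series_search_text(search_text = text, limit = n)
  # Assumed column names; keep only those actually present.
  hits[, intersect(c("id", "title", "frequency"), names(hits))]
}

# Only call the API when fredr is installed and a key has been set.
if (nzchar(Sys.getenv("FRED_API_KEY")) && requireNamespace("fredr", quietly = TRUE)) {
  print(find_series("10-year treasury minus 2-year treasury"))
}
```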

The next stage is to use the cansim package to retrieve two interest rate series and calculate the difference. The metadata for both series are retrieved to document the series in the log file. A Date column is constructed using a mutate command because the date, as retrieved from NDM, is not recognized as a regular date by R.

# now get Canadian data; we start in xts to get dates straight
cdn_irate_vbls <- c("v39055", "v39051")
library(cansim)
meta1 <- get_cansim_vector_info(cdn_irate_vbls)
print(data.frame(meta1))
cdn_irate_xts <- get_cansim_vector(cdn_irate_vbls, start_time = as.Date(starting_point)) %>%
  mutate(Date = as.Date(REF_DATE)) %>%
  tbl_xts(spread_by = "VECTOR", cols_to_xts = "VALUE")
tail(cdn_irate_xts)
# need to clean out the zero observations; otherwise, we get an area plot
cdn_irate_xts[rowSums(cdn_irate_xts) == 0, ] <- NA
# calculate differences
cdn_yield <- cdn_irate_xts$v39055 - cdn_irate_xts$v39051
colnames(cdn_yield) <- c("cdn10y2y")

In the tbl_xts invocation, spread_by identifies the variable which contains the series names. A number of data edits are required. Any zeros should be replaced with NA to avoid problems in the plot, because zero would otherwise be treated as a legitimate data point. The resulting series is given the label cdn10y2y to match the renaming of the US series above.
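The date conversion, the zero-to-NA edit and the spread calculation described above can be tried on toy data without any API access. This sketch swaps in base R matrices for xts so that it runs without extra packages. The values are invented, and the column names REF_DATE, VECTOR and VALUE are assumed to mirror what the retrieval returns.

```r
# Toy long-format data; all values are made up for illustration.
toy <- data.frame(
  REF_DATE = rep(c("2020-01-02", "2020-01-03", "2020-01-06"), times = 2),
  VECTOR   = rep(c("v39055", "v39051"), each = 3),
  VALUE    = c(1.80, 0, 1.75, 1.60, 0, 1.58),
  stringsAsFactors = FALSE
)

# Convert the character dates to R Date objects.
toy$Date <- as.Date(toy$REF_DATE)

# Spread to one column per series (rows are dates, columns are vectors).
wide <- tapply(toy$VALUE, list(as.character(toy$Date), toy$VECTOR), sum)

# Rows where every rate is zero are treated as missing, not as data.
wide[rowSums(wide) == 0, ] <- NA

# Spread of the first series over the second, as in the main script.
spread <- wide[, "v39055"] - wide[, "v39051"]
print(spread)
```

The rowSums test would also flag a row whose values happen to cancel to zero; a stricter check such as rowSums(wide == 0) == ncol(wide) avoids that when the data contain no NAs.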

The next stage removes the NAs, particularly the trailing ones, so that a legitimate last date can be calculated for use in graph labelling. The US and Canadian series are then merged by date with merge, and the tidy procedure from the broom package converts the merged xts object into a long tibble for plotting.

# calculate last date
# squeeze out NAs
cdn_spread <- na.omit(cdn_yield$cdn10y2y)
last_date <- as.Date(time(tail(cdn_spread, 1)))
print(last_date)
us_cdn_xts <- merge(us_xts1, cdn_yield)
# use broom and tidy to make a plot tibble
library(broom)
plot_data <- tidy(us_cdn_xts)
print(plot_data)
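To see what the merge and the reshaping do, here is a small sketch on invented data. It assumes only the xts package. The long table is built by hand rather than with tidy, so it illustrates the shape the plotting code expects (index, series, value) and is not a description of how broom works internally.

```r
library(xts)

# Two toy series on slightly different calendars (values are made up).
us <- xts(c(0.27, 0.25, 0.24),
          order.by = as.Date(c("2020-01-02", "2020-01-03", "2020-01-06")))
colnames(us) <- "us10y2y"
ca <- xts(c(0.20, 0.17),
          order.by = as.Date(c("2020-01-02", "2020-01-06")))
colnames(ca) <- "cdn10y2y"

# merge() aligns by date; dates missing from one series become NA.
both <- merge(us, ca)
print(both)

# Last date with a real Canadian observation, after dropping NAs.
last_date <- index(tail(na.omit(both$cdn10y2y), 1))

# Long layout, one row per date and series, built by hand.
long <- data.frame(
  index  = rep(index(both), times = ncol(both)),
  series = rep(colnames(both), each = nrow(both)),
  value  = as.vector(coredata(both))
)
print(long)
```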

The final stage is to prepare the plot using the tibble as input. The economist_white theme from ggthemes is used as a convenience. That package has other interesting themes, and numerous other examples are available around the internet.

library(ggthemes)
term_plot <- ggplot(plot_data, aes(x = index, y = value, colour = series)) +
  theme_economist_white() +
  theme(legend.position = "bottom", legend.title = element_blank()) +
  labs(y = "Difference in Percent", x = NULL,
       title = "Difference between 10-Year and 2-Year Bonds",
       subtitle = "Canada and US",
       caption = paste("FRED: t10y2y", "CANSIM:", paste(cdn_irate_vbls, collapse = " - "), "End:", last_date)) +
  scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
  geom_line()
# save the plot; the file name comes first, the plot is passed by name
ggsave("term_plot.png", plot = term_plot)

The caption is noteworthy because the collapse option in paste is used to join the two Canadian series ids with " - ". The last date is also inserted dynamically. The scale_x_date function is used because the x data are defined in R date format. If date_labels is not used, the days will be shown because dates in R always include a day.
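The caption logic can be checked on its own in base R. In this sketch the last date is a made-up value standing in for the one computed from the data.

```r
cdn_irate_vbls <- c("v39055", "v39051")
last_date <- as.Date("2020-01-06")  # hypothetical stand-in for the computed date

# collapse joins the elements of one vector into a single string.
vector_label <- paste(cdn_irate_vbls, collapse = " - ")

# paste converts the Date to text, so it can be appended directly.
caption <- paste("FRED: t10y2y", "CANSIM:", vector_label, "End:", last_date)
print(caption)

# The "%Y" code used in date_labels keeps only the year of a date.
print(format(as.Date("2004-01-01"), "%Y"))
```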

The resulting chart is shown below.

[Figure: term plot]

[1] https://fredblog.stlouisfed.org/2018/08/whats-up-or-down-with-the-yield-curve/

[2] https://www.rdocumentation.org/packages/fredr/versions/1.0.0
