We seek to answer these questions:

How do you measure the accuracy of an interest rate risk simulation technique?

Given that measure of accuracy, how many risk factors are necessary?

How does accuracy change as the number of factors increases?
Conclusion: Using daily data from January 1962 onward, provided by the Board of Governors of the Federal Reserve and the U.S. Department of the Treasury, we reach three conclusions. First, at least nine factors are necessary to accurately model U.S. Treasury yield curve movements when simulating at quarterly intervals like the Federal Reserve’s 2014 Comprehensive Capital Analysis and Review projections. Second, this is consistent with the Bank for International Settlements December 2010 revision of its market risk framework, which calls for at least six factors for modeling interest rate risk. Third, the one factor models in common use at U.S. financial institutions rank very low from an accuracy perspective and dramatic enhancements in systems capability are essential for safety and soundness in the financial services business.
Introduction
The Board of Governors of the Federal Reserve makes daily U.S. Treasury yield data available on its website beginning with data from 1962. The perils of the current fixed income environment in U.S. Treasuries are obvious in this graph of movements in the 1 year Treasury bill yield and movements in 10 and 30 year U.S. Treasury bond yields:
Some investors are strangely calm, regarding the current environment as one that will persist for a long time “because the Federal Reserve has no choice.” Other investors are fleeing bond funds in record numbers. Federal Reserve governor Daniel Tarullo recently expressed his concern about interest rate risk at major bank holding companies and their “reach for yield.” At a moment of great danger, the interest rate risk modeling infrastructure at many large U.S. banks is both old and grossly inadequate for an accurate measure of the risk facing all fixed income investors today. Most of the models employed assume that only one factor drives interest rate risk, and (as commonly employed) this implies that all rates will move either up or down together. Banks are implicitly assuming that yield curve twists will never happen, even though U.S. data shows clearly that yield curve twists are much more common than uniform shifts up or down (see Chapter 3 of van Deventer, Imai, and Mesler Advanced Financial Risk Management 2nd edition for a count of the number of days for each type of shift). To use the equity analytics analogy, banks are using the interest rate equivalent of the 1-factor 1965 capital asset pricing model in the 21st century, where the data clearly shows that 20 to 40 factors drive equity returns. How many factors drive interest rates? We turn to that question now.
How Many Interest Rate Risk Factors are Necessary for Accuracy?
For many bankers, the accuracy of the modeling effort for interest rate risk is not a concern. Many bankers, parodied by this quote from John Maynard Keynes, just want to use the same interest rate risk techniques as their peers:
“A sound banker, alas, is not one who foresees danger and avoids it, but one who, when he is ruined, is ruined in a conventional way along with his fellows, so that no one can really blame him.” Quoted by Michael Pomerleano and Andrew Sheng, “A Failure of Public Financial Sector Governance,” Financial Times Economists Forum, January 26, 2010. Original quotation from “The Consequences to the Banks of the Collapse of Money Values”, 1931.
Another set of bankers apply a slightly more practical objective. They seek to invest the minimum in interest rate risk management that the regulators allow. Until recently, this was a very low standard. When asked why regulators did not fail banks in the examination process for inaccurate interest rate risk management, as recently as 2008 a U.S. bank regulator asked the author, “Do you really think we could do that?”
The period of regulatory “benign neglect” of inaccurate interest rate modeling efforts is coming to a rapid end. The Federal Reserve added a third interest rate factor to the U.S. Treasury yield curve in its 2014 Comprehensive Capital Analysis and Review program, effectively employing a three factor interest rate model. As mentioned above, the Bank for International Settlements mandates at least 6 interest rate risk factors in its December 2010 market risk framework (see paragraph (b) on page 12). For bankers who care about best practice and accuracy in risk modeling, there is great concern that their institutions are burdened with legacy interest rate risk systems that use one factor term structure models that are more than 20 years old. Two factor models are extremely rare among legacy interest rate risk systems. We now provide a roadmap to solve this problem.
Questions we seek to answer
The first step in moving forward in interest rate modeling as banks, insurance firms and regulators have done in credit risk modeling is to ask a set of related questions:

How do you measure the accuracy of an interest rate risk simulation technique?

Given that measure of accuracy, how many risk factors are necessary?

How does accuracy change as the number of factors increases?
Recent academic work in term structure modeling provides guidance in answering all three of these questions. We highly recommend this survey paper by Robert Jarrow, “The Term Structure of Interest Rates,” Annual Review of Financial Economics, 1, 2009, pp. 69-96. In the paper, Prof. Jarrow asks and answers our second question:
“How many factors are needed in the term structure evolution? One or two factors are commonly used, but the evidence suggests three or four are needed to accurately price exotic interest rate derivatives.”
Professor Jarrow, in his classes on interest rate modeling, emphasizes that the most important test of the accuracy of an interest rate simulation technique is the ability to accurately simulate the actual history of the interest rate market being simulated. A consistent and accurate reproduction of history is a necessary condition for interest rate model accuracy, along with the accurate pricing of all relevant observable interest-rate-related securities.
Another important survey paper is by Darrell Duffie and Rui Kan, “Multifactor Term Structure Models” (from Philosophical Transactions: Physical Sciences and Engineering, Volume 347, Number 1684, Mathematical Models in Finance, June 1994, pp. 577-586). The authors note on pages 580-581 that
“Although single-factor models offer tractability, there is compelling reason to believe that a single state variable, such as the short rate r_{t}, is insufficient to capture reasonably well the distribution of future yield curve changes. The econometric evidence in favour of this view includes the work of Litterman & Scheinkman (1988), Stambaugh (1988), Pearson & Sun (1990), and Chen & Scott (1992 b, 1993).”
In order to keep multifactor models of the term structure from being too complex from a mathematical point of view, it is very common for researchers to follow this term structure model derivation process:

Assume that a given number k of risk factors drive interest rates

Assume a stochastic process by which those factors evolve

Derive the shape of the yield curve and its movements
A recent example of such an approach is by Federal Reserve researchers Don H. Kim and Athanasios Orphanides: “Term Structure Estimation with Survey Data on Interest Rate Forecasts,” Journal of Financial and Quantitative Analysis (2012). An earlier version of the paper by Kim and Orphanides is also available from the Board of Governors of the Federal Reserve System (October 2005).
The Heath, Jarrow, and Morton Approach
In a series of papers originally written in the late 1980s, Heath, Jarrow, and Morton describe a much different analytical process for interest rate modeling that is designed to achieve both of the necessary conditions for interest rate modeling accuracy that we noted above: a consistent and accurate simulation of history, and the accurate pricing of observable interest-rate-related securities.
The Heath, Jarrow and Morton approach is perfectly suited to this task because the authors take the current shape of the yield curve as a given, instead of deriving what the shape of the yield curve must be given a set of mathematical assumptions. The Heath, Jarrow and Morton process for interest rate modeling is as follows:

Take the current yield curve as given

Assume a volatility structure for interest rates, including the number of factors and the nature of those factors’ movements

Constrain the drift in the yield curve so that no arbitrage is possible

Given those constraints, derive the movements in the yield curve
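For readers who prefer symbols, the third step is usually written in textbook continuous-time form as shown below. This is a sketch of the standard statement under the risk-neutral measure, not the discrete-time quarterly implementation used later in this note. The notation is introduced here as an assumption: f(t,T) is the instantaneous forward rate observed at time t for maturity T, k is the number of factors, sigma_i(t,T) is the volatility loading of factor i, and W_i are independent Brownian motions.

```latex
df(t,T) = \alpha(t,T)\,dt + \sum_{i=1}^{k} \sigma_i(t,T)\,dW_i(t),
\qquad
\alpha(t,T) = \sum_{i=1}^{k} \sigma_i(t,T) \int_t^T \sigma_i(t,s)\,ds .
```

Once the volatility functions are chosen, the drift is no longer a free parameter, which is why the process starts from the observed curve and the volatility assumption.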
For an excellent introduction to the Heath, Jarrow, and Morton approach, see Jarrow and Chatterjea (2013) and the worked 1, 2, and 3 factor examples in chapters 6 through 9 of van Deventer, Imai and Mesler. For the mathematically confident, the original papers will someday earn the authors the Nobel Prize in Economic Science:
Heath, David, Robert A. Jarrow and Andrew Morton, "Bond Pricing and the Term Structure of Interest Rates: A Discrete Time Approach," Journal of Financial and Quantitative Analysis, 1990, pp. 419-440.
Heath, David, Robert A. Jarrow and Andrew Morton, "Contingent Claims Valuation with a Random Evolution of Interest Rates," The Review of Futures Markets, 9 (1), 1990, pp. 54-76.
Heath, David, Robert A. Jarrow and Andrew Morton, "Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claim Valuation," Econometrica, 60(1), 1992, pp. 77-105.
Heath, David, Robert A. Jarrow and Andrew Morton, "Easier Done than Said", RISK Magazine, October, 1992.
Using the Heath, Jarrow and Morton framework, we now turn to practical application using the example of the U.S. Treasury market. The same approach could be used in any market around the world with approach modifications that have more to do with data availability than geography or market practices.
How Many Interest Rate Risk Factors are Necessary?
The Example of U.S. Treasury Data
One of the key uses of risk management simulation is a value-at-risk quantification for the mark to market of the firm’s balance sheet (the “economic value of equity” or “EVE”) from an interest rate risk perspective. What percentile of tail risk is common among U.S. financial institutions? The range one sees in practice falls between the 99.0th and 99.9th percentiles. The accuracy with which one can measure value at risk at these percentile levels depends on two key analytical decisions: the interest rate risk model’s accuracy and the number of scenarios generated. Clearly, an interest rate risk model that is only 90% accurate will generate the appropriate 99th percentile value-at-risk only by accident. That is not a good basis for measuring the safety and soundness of a financial institution, nor is it a good foundation for a long career in risk management. In the rest of this note, we focus only on two related issues:
How does the accuracy of interest rate modeling change as the number of factors increases, and how many factors should we use?
In subsequent notes, we add accuracy measurements for the assumptions about how the risk factors change once we have determined which factors are important.
In this section, the source of historical data is the Board of Governors of the Federal Reserve H15 Statistical Release, which reports data collected by the U.S. Department of the Treasury. In the introduction, we opened with a pictorial history of 52 years of daily U.S. Treasury yield movements. We reproduce the history of the 10 year Treasury yield in this graph to again emphasize that the current low and stable rates in the U.S. Treasury markets are not typical and not likely to persist for long:
Kamakura Corporation provides videos of 50 years of daily U.S. Treasury yield movements that may be of interest:
Our objectives for the rest of this note are simple to state:

For 1, 2, 3, 6, and 9 factors, how accurate is our interest rate risk simulation?

Which factors should be used?

How many factors should be used?
Naïve Model 1: Principal Components Analysis of All 11 Current Yields Reported by the Federal Reserve
We start with what should be every analyst’s first step in determining how many risk factors drive interest rates. This first step is fraught with danger and potential pitfalls, as we illustrate with a few examples using principal components analysis. Principal components analysis takes either the variance-covariance matrix (preferred) for n random variables or their correlation matrix (second choice) and produces a list of 1 through n uncorrelated risk factors that produce both the observed volatility and correlation of the n random variables.
Important: Like any mathematical problem with n equations and n unknowns, n factors will in general be necessary for a solution that explains 100% of volatility. Take the current U.S. Treasury yields as an example. The Federal Reserve reports 11 different maturities of U.S. Treasury yields. If one wants to explain the quarterly changes in yields over k quarters, the answer will in general be this: we need 11 risk factors, and each risk factor will take on a different value in each of the k quarters. We know before we start that our answer is likely to be this: “We need 11 factors to explain movements in the yield curve because there are 11 observable points on the yield curve.” Only if we are very lucky or if the world is much simpler than it appears will we find that the necessary number of factors is less than the number of observable points on the curve.
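A minimal sketch of this point in Python, assuming NumPy and purely hypothetical simulated data rather than the Federal Reserve series: the cumulative share of variance explained by the principal components of n series reaches 100% only when all n components are used. The function name pca_cumulative_explained and the data-generating assumptions are ours.

```python
import numpy as np

def pca_cumulative_explained(data):
    """Cumulative share of total variance explained by 1..n principal components.

    data has shape (observations, n_variables).
    """
    cov = np.cov(data, rowvar=False)               # n x n variance-covariance matrix
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]    # eigenvalues, largest first
    eigenvalues = np.clip(eigenvalues, 0.0, None)  # guard against tiny negative round-off
    return np.cumsum(eigenvalues) / eigenvalues.sum()

rng = np.random.default_rng(0)
n_obs, n_maturities = 500, 11
# Hypothetical data: one common driver plus independent noise at each maturity
level = rng.normal(size=(n_obs, 1))
yields = level + 0.5 * rng.normal(size=(n_obs, n_maturities))

cumulative = pca_cumulative_explained(yields)
for k, share in enumerate(cumulative, start=1):
    print(f"{k:2d} factors: {share:7.2%}")
```

With 11 input series the last line always shows 100%, by construction, which is why a "perfect fit" at n factors carries no information by itself.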
Principal components analysis (“PCA”) is a very good way to start this process, but it’s not enough. The strengths and weaknesses of principal components analysis are easy to summarize:
Strengths

PCA is widely available in inexpensive statistical packages that are easy to use.

For a problem with n random variables, it will produce 1, 2, 3, and n factor measures of accuracy

It will identify the risk factors in order from most important to least important, identified in the charts below as “component 1,” “component 2” and so on.

The components are derived to be uncorrelated, which makes for highly efficient simulation
Weaknesses

PCA doesn’t tell you what the factors are. They will not be directly associable with any of the n variables used as input to the PCA calculation.

PCA doesn’t warn the user of this, but it will only use the historical periods for which all n variables have data. The fact that n-1 variables may have a 52 year history and the nth variable has a 1 year history will result in a PCA analysis based on a 1 year variance-covariance calculation.

If you split the data in half, and then estimate a PCA model independently on both halves, the factors will in general be different. This implies that going forward in time (the past is the first half, and the future is the second half) may require different factors than history indicated.
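The second weakness is easy to reproduce. The sketch below, in Python with NumPy, uses hypothetical simulated series (not Federal Reserve data) in which one of three maturities has a much shorter history, and shows how a complete-case covariance calculation silently discards most of the sample.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days = 1000
# Hypothetical yields for three maturities; the third series only starts on day 900
yields = rng.normal(size=(n_days, 3)).cumsum(axis=0)
yields[:900, 2] = np.nan

# Complete-case (listwise) deletion: keep only rows where every series has data
complete_rows = ~np.isnan(yields).any(axis=1)
rows_per_series = (~np.isnan(yields)).sum(axis=0)
print("rows available per series:", rows_per_series)
print("rows used by a complete-case PCA:", complete_rows.sum())

# The covariance matrix that would feed the PCA is built from the short window only
cov_complete = np.cov(yields[complete_rows], rowvar=False)
```

Checking the count of complete rows before running the PCA is a cheap way to confirm which historical period is actually being modeled.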
The fact that the components are “unknown,” surprisingly, isn’t a problem for hedging of rate risk, for valuation, or for value at risk. Instead, it’s largely a political problem for management and regulators who expect an answer different from “I don’t know” when asked what factors drive interest rates. Sadly, if a thoughtful and correct answer takes 60 seconds, one often isn’t given time for such an answer. We ignore the politics of PCA and plunge into the simple examples.
Here are the principal components analysis results that come from inputting all 11 maturities available on the Federal Reserve website into a well-known statistical package. The U.S. Treasury maturities loaded are 1 month, 3 months, 6 months, 1, 2, 3, 5, 7, 10, 20 and 30 years, a total of 11 maturities.
The analysis shows that the first factor, “component 1,” explains 90.23% of total variation among the 11 Treasury maturities. The second factor brings the explanatory power to 99.39%. A few impatient readers may be thinking “that’s enough for me,” but we’ll explain why that conclusion is wrong in the rest of this note. The third factor brings total explanatory power to 99.80%, or so it seems. 9, 10, and 11 factors all provide 100.00% explanatory power, at least after rounding to two decimal places. Why is this not a high quality answer? We use the next example to show why not.
Naïve Model 2: Principal Components of Daily Changes in All 11 Current Yields Reported by the Federal Reserve
First, we note that it is almost always much harder in econometrics to model changes in economic variables than it is to model variation in their absolute levels directly. We now repeat the analysis but we analyze the daily changes in variables instead of their absolute levels. When we do this, we find that the explanatory power of the first component or factor drops to 63.65%, not the 90.23% found in the first naïve example. By the time we add a third factor, the explanatory power will be 91.52%, not the 99.80% in the first example. For an institution that wants a high degree of accuracy in its value at risk at the 99.5th percentile level for its “economic value of equity,” clearly 1, 2 and 3 factors leave the analyst in a very tenuous position from a job security (and accuracy) point of view.
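The gap between levels and changes can be illustrated with a small simulation. This is a sketch in Python with NumPy under our own assumptions, not a reproduction of the results above: yields are modeled as a persistent common level (a random walk) plus transient noise at each maturity, and all parameter values are hypothetical.

```python
import numpy as np

def first_k_share(data, k):
    """Share of total variance explained by the first k principal components."""
    cov = np.cov(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]   # largest first
    return eigenvalues[:k].sum() / eigenvalues.sum()

rng = np.random.default_rng(2)
n_days, n_maturities = 2000, 11
common_level = rng.normal(scale=0.05, size=(n_days, 1)).cumsum(axis=0)  # persistent
local_noise = rng.normal(scale=0.05, size=(n_days, n_maturities))       # transient
levels = 5.0 + common_level + local_noise
changes = np.diff(levels, axis=0)

share_levels = first_k_share(levels, 1)
share_changes = first_k_share(changes, 1)
print(f"first factor, levels:  {share_levels:.2%}")
print(f"first factor, changes: {share_changes:.2%}")
```

In this setup the persistent common level dominates the variance of the levels, so one factor looks nearly sufficient, while the same factor explains far less of the day-to-day changes that a risk simulation actually has to reproduce.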
Naïve Model 3: Principal Components of 4 Current Yields Reported by the Federal Reserve Which Have Longest History (1, 3, 5 and 10 years)
We now point out one of the subtle pitfalls of principal components analysis. In the first example, we had 2,152 historical observations on historical Treasury yields. In the second example, we had 1,674 observations. But the Federal Reserve data series starts in January 1962, so how could we have so few observations? The answer is that the PCA calculation is done only on the common data points. The shortest data series of the 11 time series determines the modeling period, not the user (unless one is very careful). It turns out that only the 1, 3, 5, and 10 year Treasury yields are available back to 1962. We repeat the analysis using their common 13,024 observations, using the absolute level of rates.
This again gives us a misleading result, even though we are using a better data set. First, we know that 4 factors will give us a “perfect” fit since we are inputting only 4 maturities. By our own modeling decision, we have made it impossible to view any evidence for 5 or more factors. Again, as we found in naïve example 1, the first factor apparently has a very high accuracy, here 98.29%. This could cause a naïve user to be comfortable with a 1 factor model, but that would be a horrible mistake.
Naïve Model 4: Principal Components of Daily Changes in 4 Current Yields Reported by the Federal Reserve Which Have Longest History (1, 3, 5 and 10 years)
We now analyze the daily changes of the 1, 3, 5, and 10 year Treasury yields. This brings the number of observations down to 10,111 because of weekends and holidays. The first factor has an explanatory power of 87.51%, and the second factor brings us to 96.00% cumulative accuracy. Three factors give us 98.81% accuracy, and (with mathematical certainty) the fourth factor produces a perfect fit. A conclusion, even with this long data series, that 3 or 4 factors are enough is a false conclusion that results from a simple analyst’s mistake: we chose only 4 points on the yield curve without realizing how that would bias the answer.
Before we move to a best practice approach, let’s review the good news and the bad news from our naïve use of principal components analysis:
Good news from principal components analysis

We are now convinced that more than three factors are necessary to meet the typical target levels of accuracy

Principal components analysis derives uncorrelated risk factors, which are easiest to simulate and which are consistent with the Heath, Jarrow and Morton assumptions

Principal components analysis has provided quick and inexpensive guidance on the number of factors necessary
Bad news from principal components analysis

The analyst who uses principal components analysis has inadvertently destroyed potentially useful data by not using the full yield curve, only the points supplied by the Federal Reserve (or a subset of those points)

The default calculation in principal components analysis destroys accuracy by using only the fully overlapping data set, dropping all observations outside of the range of the variable with the smallest number of observations

Principal components analysis does not identify what the risk factors are. This is more of a political liability than an analytical problem, but it is a political liability of potentially fatal consequences for the analyst.
We now turn to a best practice approach.
Best Practice Approach
We start with one more principal components analysis exercise and then move to a better statistical procedure. We perform the following analytical steps:
 We smooth the yield curve using all observable data points for each date on which data is available.
 We extract forward returns with the same periodicity as the desired modeling periodicity. A forward return is one plus the uncompounded forward rate. If the modeling period is quarterly, we extract the quarterly forward rates and add one to get the forward returns.
 We drop the first observation after the Federal Reserve has made a change in the maturities reported on the H15 statistical release. We do this because the availability or lack of availability of a maturity point will change the shape of the smoothed yield curve.
 We extract orthogonal and identifiable risk factors and their histories using various points on the yield curve, in order from most important to least important.
 We then impose Heath, Jarrow and Morton no arbitrage constraints on the links between changes in forward returns and the risk factors
 For a number of risk factors from 1 to N, we derive the coefficients linking the risk factors to forward returns and measure accuracy.
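The forward return extraction in the second step can be sketched as follows. This is a minimal Python illustration with NumPy, assuming the smoothed curve is available as continuously compounded zero-coupon yields at quarterly maturities and reading the uncompounded forward rate as the simple rate for the quarter, so that the forward return is the ratio of adjacent zero-coupon bond prices. The function name and the flat 4% curve are hypothetical.

```python
import numpy as np

def quarterly_forward_returns(zero_yields, quarter_length=0.25):
    """One-quarter spot return followed by the quarterly forward returns.

    zero_yields[k] is the continuously compounded zero-coupon yield for a
    maturity of (k + 1) quarters, read off a smoothed yield curve.
    """
    zero_yields = np.asarray(zero_yields, dtype=float)
    maturities = quarter_length * np.arange(1, len(zero_yields) + 1)
    prices = np.exp(-zero_yields * maturities)   # zero-coupon bond prices
    prices = np.concatenate(([1.0], prices))     # a dollar today is worth one dollar
    return prices[:-1] / prices[1:]              # one plus the simple quarterly rate

# Hypothetical flat 4% curve out to 30 years (120 quarters)
flat_curve_returns = quarterly_forward_returns(np.full(120, 0.04))
print(len(flat_curve_returns), flat_curve_returns[:3])
```

For a 30 year curve this yields 120 values, the first being the 3 month spot return and the remaining 119 being quarterly forward returns.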
From a mathematical perspective, this procedure dates back to Augustin-Louis Cauchy (born in 1789). From a present day perspective, the process of orthogonalizing correlated variables into uncorrelated risk factors is known as the Gram-Schmidt procedure. We use a recent modified version of the Gram-Schmidt procedure in what follows.
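To make the orthogonalization concrete, here is a minimal Python sketch of a modified Gram-Schmidt pass over demeaned series, using NumPy. It is a generic textbook version and not necessarily the specific modified procedure used for the results in this note. The function name orthogonalize and the simulated inputs are hypothetical.

```python
import numpy as np

def orthogonalize(series):
    """Modified Gram-Schmidt on demeaned columns, taken in the given order.

    Each output column is the corresponding input column minus its projections
    on the earlier output columns, i.e. a regression residual.
    """
    x = np.asarray(series, dtype=float)
    factors = x - x.mean(axis=0)
    for j in range(factors.shape[1]):
        for i in range(j):
            beta = factors[:, i] @ factors[:, j] / (factors[:, i] @ factors[:, i])
            factors[:, j] -= beta * factors[:, i]   # remove what factor i already explains
    return factors

rng = np.random.default_rng(7)
common = rng.normal(size=(300, 1))
correlated_inputs = common + 0.5 * rng.normal(size=(300, 4))  # hypothetical correlated series
factors = orthogonalize(correlated_inputs)
print(np.round(np.corrcoef(factors, rowvar=False), 6))
```

The order of the columns matters: the first factor is the first series itself (demeaned), and every later factor is only the part of its series that the earlier factors do not explain.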
Before reporting on this analysis, as a benchmark, we do a principal components analysis on quarterly forward returns from 1962 to September 30, 2013. We use quarterly intervals for consistency with the Federal Reserve CCAR 2014 intervals. Since Treasury yields are available to maturities of 30 years, we have one spot 3 month rate and 119 quarterly forward rates and forward returns. The results of the principal components analysis on the continuously compounded changes in these forward returns are shown here:
Contrary to the results of our naïve analysis, the analysis of each of the quarterly segments of the yield curve reveals the complexity of yield curve movements that is well-known to traders of caps, floors and swaptions on the now-discredited Libor-swap curve. The first factor explains only 49.10% of the movement of the 119 forward returns. This is a dramatic indictment of any analysis that purports to show the adequacy of a one factor model. The second factor brings accuracy to 87.79%. The third factor brings cumulative accuracy to 93.28%, still well short of the VAR tail percentiles that most American bankers have as a modeling objective. It takes 8 factors to jump over the 99.5% accuracy level that matches the median percentile target of the typical banking VAR exercise. We remind the reader that, even with a perfect interest rate model, there is sampling error in any value-at-risk calculation and that the number of scenarios to be run is a critical part of VAR design. No number of scenarios, however, will save a VAR calculation if the interest rate model has an accuracy of only 49.10%, 87.79%, or 93.28%.
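The sampling error point can be illustrated separately from the factor question. The Python sketch below, using NumPy, assumes a perfectly known standard normal distribution of outcomes (a hypothetical stand-in for a perfect model) and measures how much the estimated 99.5th percentile varies across repeated simulations for different scenario counts.

```python
import numpy as np

rng = np.random.default_rng(3)

def percentile_sampling_error(n_scenarios, percentile=99.5, n_repeats=100):
    """Standard deviation of the estimated percentile across repeated simulations."""
    draws = rng.standard_normal(size=(n_repeats, n_scenarios))
    estimates = np.percentile(draws, percentile, axis=1)
    return estimates.std()

errors = {n: percentile_sampling_error(n) for n in (500, 5_000, 50_000)}
for n, err in errors.items():
    print(f"{n:6d} scenarios: std of 99.5th percentile estimate = {err:.4f}")
```

The spread shrinks as scenarios are added, but only the sampling error shrinks. Any error from using too few factors is untouched by running more scenarios.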
We now turn to a more direct implementation of the Heath, Jarrow, and Morton approach.
A Heath-Jarrow-Morton Implementation
In a modern Heath, Jarrow and Morton approach, the risk factors can be identified explicitly. They are then linked using standard econometric procedures to each quarterly segment (the forward returns) on the 30 year U.S. Treasury curve. We will discuss this in much more detail in later notes, but we can summarize the procedures here:
 Define the risk factors in order of importance
 Use a number of risk factors from 1 to N
 Fit the links between risk factors and each of the forward returns using the no-arbitrage restrictions in Heath, Jarrow and Morton
 Derive the implied history of the risk factors
 Compile the coefficients from the econometric procedures for use in simulation
 Report the accuracy of the results
The order in which the risk factors are employed is important. We start by restricting ourselves to the maturities with the longest data history to ensure that we use all of the data available. Subject to those restrictions, we generally start with the shortest maturity unused risk factor, add the longest maturity unused risk factor, and then continue until all factors are employed. The last factors to be added are the 30 year and 20 year Treasury yields because they have the shortest data history. Factors were numbered as follows and defined as explained here; note that the forwards are assumed to have a term of one quarter:
Factor101, residuals of change in forward returns for forward maturing in month 6
Factor102, residuals of change in forward returns for forward maturing in year 10
Factor103, residuals of change in forward returns for forward maturing in year 3
Factor104, residuals of change in forward returns for forward maturing in year 7
Factor105, residuals of change in forward returns for forward maturing in year 5
Factor106, residuals of change in forward returns for forward maturing in year 1
Factor107, residuals of change in forward returns for forward maturing in year 2
Factor108, residuals of change in forward returns for forward maturing in year 30
Factor109, residuals of change in forward returns for forward maturing in year 20
When doing quarterly analysis, only 9 of the 11 maturities on the Federal Reserve H15 release are relevant. The 1 month T-bill rate is too short to use, and the change in the 3 month forward rate maturing in month 6 is calculated using the 3 month T-bill yield.
After fitting the 119 econometric relationships (calculation time is less than 30 seconds), we can derive the implied history of the 9 risk factors we are using. We confirm in this chart that the Gram-Schmidt procedure we followed produces uncorrelated risk factors (i.e. none of the pairwise correlations is statistically significant at the 5% significance level):
Correlation Among Risk Factors
Note: a * indicates a correlation statistically different from 0 at the 5% level
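A check of this kind can be sketched in a few lines of Python with NumPy. The version below is an illustration under our own assumptions: it uses the usual t-statistic for a sample correlation with a normal critical value of 1.96 as an approximation to the 5% two-sided test, and it runs on hypothetical simulated factors rather than the 9 factors above.

```python
import numpy as np

def significant_correlations(factors, critical_t=1.96):
    """Boolean matrix, True where a pairwise correlation differs from zero at
    roughly the 5% level, using t = r * sqrt((n - 2) / (1 - r^2))."""
    n = factors.shape[0]
    r = np.corrcoef(factors, rowvar=False)
    off_diag = ~np.eye(r.shape[0], dtype=bool)
    t = np.zeros_like(r)
    t[off_diag] = r[off_diag] * np.sqrt((n - 2) / (1.0 - r[off_diag] ** 2))
    return np.abs(t) > critical_t

rng = np.random.default_rng(4)
raw = rng.normal(size=(200, 3))
q, _ = np.linalg.qr(raw - raw.mean(axis=0))     # exactly uncorrelated columns
flags_uncorrelated = significant_correlations(q)

# A third column built from the first two is correlated with both by construction
mixed = np.column_stack([q[:, 0], q[:, 1], q[:, 0] + 0.5 * q[:, 1]])
flags_mixed = significant_correlations(mixed)
print(flags_uncorrelated)
print(flags_mixed)
```

A True entry plays the role of the * in the table: it marks a pair of factors whose correlation is statistically different from zero.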
Next, we review the adjusted R-squareds for each of the 119 statistical relationships. We report the accuracy here of a 1 factor model, a 2 factor model, a 3 factor model, a 6 factor model and a 9 factor model:
The improvement in accuracy from going from 1 factor to more factors is, frankly, astonishing. Equally astonishing is the high degree of risk that remains even after a one factor model has been employed. Looking at the adjusted R-squared graph above, the blue line is the ability of the 1 factor model to explain the variation of each of the 119 forward returns in the 30 year Treasury yield curve. By the 80th quarter (the 20 year point), the movement in factor 101 (the forward rate maturing in month 6) explains literally none of the movement in the long 10 years of the yield curve. This is not good news for major holders of mortgage servicing rights and 30 year mortgages.
The graph above shows that the remaining unexplained error is greatest in the middle of the gaps between the maturities of the risk factors used. The adjusted R-squared of the two factor model (the red line) is lowest between the short rate and the 10 year point (the 40th quarter), the two risk factors used in the model. The explanatory power of the 9 factor model is near 100% at all maturities, but it deviates the most at maturities between the intervals on which the Fed reports Treasury yields: 1 month, 3 months, 6 months, 1, 2, 3, 5, 7, 10, 20 and 30 years.
The next graph, reported on a decimal basis, gives the root mean squared errors (i.e. the standard error of the regression) for each of the 119 forward returns and for 1, 2, 3, 6, and 9 factor models. This graph is particularly powerful in explaining the risk that remains after a bank employs a one-factor interest rate risk model.
The blue line, the root mean squared error for a one factor model, is about 0.0035 (0.35%) in standard deviation in forecasting or simulating the 30 year forward rate just one quarter ahead. This is an error of major import. Even at the 5 quarter point (15 months), the standard deviation around the forecasted forward return is almost 0.15% for just a one quarter forecast. The lowest green line, the 9 factor model, by contrast, has standard deviations that are generally 0.03% or less. The exception is a standard error of about 0.06% between one and two years, a maturity at which the Fed reports no data. This makes the 1 ½ year point a logical point to be the next factor added in the model.
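The two accuracy measures used above, adjusted R-squared and the standard error of the regression, can be computed for nested factor sets as in the Python sketch below (NumPy only). It is plain ordinary least squares on hypothetical simulated data and does not impose the Heath, Jarrow and Morton no-arbitrage restrictions. The function name fit_quality and all coefficients are assumptions for illustration.

```python
import numpy as np

def fit_quality(y, factors):
    """OLS of y on a constant plus the given factor columns.

    Returns (adjusted R-squared, standard error of the regression).
    """
    n, k = factors.shape
    design = np.column_stack([np.ones(n), factors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sse = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    dof = n - k - 1                                   # residual degrees of freedom
    adj_r2 = 1.0 - (sse / dof) / (sst / (n - 1))
    return adj_r2, np.sqrt(sse / dof)

rng = np.random.default_rng(6)
factors = rng.normal(size=(200, 3))                   # hypothetical risk factor history
# Hypothetical change in one forward return, driven by all three factors
y = 0.5 * factors[:, 0] + 0.3 * factors[:, 1] + 0.2 * factors[:, 2] \
    + 0.01 * rng.normal(size=200)

results = [fit_quality(y, factors[:, :k]) for k in (1, 2, 3)]
for k, (adj_r2, rmse) in zip((1, 2, 3), results):
    print(f"{k} factor model: adjusted R-squared = {adj_r2:.4f}, standard error = {rmse:.4f}")
```

Repeating this fit for each of the 119 forward returns and each factor count would give the two families of curves described above.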
One of the most powerful insights of this methodology is that, for every one of the 119 maturities that is not a risk factor, all or almost all 9 of the risk factors are statistically significant.
What have we accomplished?
What have we learned from this analysis?

We can answer the question “How good is good enough?” We can reject any term structure models which do not have enough factors to meet the accuracy targets set either by regulators or by the institution itself (for example, via a value at risk tail definition). This analysis shows how to do that measurement.

We can test for internal inconsistency: if the institution does not impose the Heath, Jarrow, and Morton no arbitrage constraints, a Monte Carlo simulation of a bond’s value will not equal the closed form solution of a bond’s value. This violates one of the two necessary conditions for acceptance of an interest rate risk model: the ability to price related securities accurately and an ability to accurately simulate historical yield curve movements.

We have shown how to test critical assumptions for accuracy. The results show clearly that one, two and three factor models are not accurate enough to meet normal risk management standards.

We have shown how to avoid an inadvertent failure to efficiently employ all available data

We have shown how a bank can use a stress testing accuracy standard, not just for credit risk, but also in its standard asset and liability management simulations, liquidity risk simulations, and capital adequacy simulations
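The internal consistency test in the second point can be illustrated with the simplest possible case. The Python sketch below (NumPy only) is our own illustration and not the model used in this note: it assumes a one-factor continuous-time Heath, Jarrow and Morton model with constant volatility sigma and a flat initial forward curve f0 under the risk-neutral measure. In that case the no-arbitrage drift adds sigma^2 * t^2 / 2 to the short rate, and the integral of the driving Brownian motion over [0, T] is normal with variance T^3 / 3. All parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
f0, sigma, T = 0.03, 0.01, 10.0       # hypothetical flat forward curve, volatility, maturity
n_paths = 200_000

closed_form = np.exp(-f0 * T)          # zero-coupon bond price implied by today's curve

# Integral of the Brownian motion over [0, T], sampled exactly as N(0, T^3 / 3)
integral_w = rng.normal(scale=np.sqrt(T**3 / 3.0), size=n_paths)

# Integrated short rate with and without the no-arbitrage drift term
drift_integral = sigma**2 * T**3 / 6.0                 # integral of sigma^2 t^2 / 2
with_drift = f0 * T + drift_integral + sigma * integral_w
without_drift = f0 * T + sigma * integral_w

mc_with_drift = np.exp(-with_drift).mean()
mc_without_drift = np.exp(-without_drift).mean()
print(f"closed form:           {closed_form:.5f}")
print(f"Monte Carlo, HJM drift: {mc_with_drift:.5f}")
print(f"Monte Carlo, no drift:  {mc_without_drift:.5f}")
```

With the drift restriction the simulated price agrees with the closed form up to sampling error. Without it, the simulation overprices the bond, which is the inconsistency the test is designed to catch.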
Questions for Subsequent Analysis
We have intentionally left some important questions for later posts.
 Are parameters linking forward returns to risk factors constant? Yes or no?
 If not, how does the impact (the coefficient) of each of the risk factors change? With the level of rates? With time? With both?
The graph below plots the actual distribution of risk factor 9 (Factor 109) versus the normal distribution. The hypothesis of normality is not rejected for this factor by the standard tests. What about the other factors? We leave that for next time.
Author’s note: The author wishes to thank Prof. Robert A. Jarrow and Daniel Dickler for very helpful comments.
Donald R. van Deventer
Kamakura Corporation
Honolulu
March 5, 2014