At the heart of model validation is a two-pronged approach:
- a vetting of model assumptions in order to avoid, whenever possible, assumptions known to be false, and
- a full Monte Carlo simulation which should prove that the assumptions used for stress-testing, VAR, and credit VAR are “no arbitrage” in a very specific sense. The valuation of all assets and liabilities with an observable price in either the primary or secondary markets should match observable market prices at time zero.
Without these two-fold validation steps, no manipulation of stress tests, value at risk, or credit-adjusted value at risk can produce results that are both correct and defensible at the transaction level. In this note, we concentrate on the first of these two points with respect to the historical data released by the Federal Reserve in conjunction with the CCAR 2015 stress tests. The same procedures can be used for any set of historical data that is used to benchmark the parameters of any other stress test, VAR calculation, or CCAR calculation.
Background on CCAR 2015 Calculations
The Federal Reserve documents released on Friday include the Comprehensive Capital Analysis and Review 2015 Summary Instructions and Guidance and Amendments to the Capital Plan and Stress Test Rules. While the Fed’s announcements are critical to large bank holding companies like Bank of America (BAC), Citigroup (C), JPMorgan Chase & Co. (JPM), and Wells Fargo & Co. (WFC), the accuracy of bank risk models and the need for model validation are world-wide concerns. We now turn to a series of tests that can detect common errors in model assumptions and simulations with a minimum of incremental effort.
An Overview of the Model Validation Regime
As noted by Prof. Robert Jarrow in a recent article, there are no perfect models. The question for model validation is two-fold: is the model “good enough” to provide value, and is there an alternative modeling approach that is better?
There are four steps in the model validation process:
- A check of the accuracy of the assumptions of the model. Since there are no perfect models, we do not insist on perfection. Nonetheless, we have no tolerance for fatal inaccuracy either.
- A check of the implications of the model given its assumptions.
- A check of the econometric procedures used to parameterize the model.
- A check of the no arbitrage properties of the model. Does it accurately value traded securities with observable prices at time zero?
Many of the common errors in stress-testing, VAR and credit-adjusted VAR (“CVAR”) stem from very predictable human tendencies that can be considered issues of behavioral economics. We list a few tendencies here that raise red flags that call for intense scrutiny:
- To use the historical data set provided by regulators for modeling with no additions. For example, the Fed data set includes foreign exchange rates but no interest rate variables for the foreign countries. No arbitrage FX rates cannot be modeled without knowing local interest rates, as we discuss below. We need to supplement the Fed data set to ensure no arbitrage.
- To use the data field (say the 30 year fixed rate mortgage yield) provided by regulators without consideration for the proper no arbitrage specification for linking mortgage yields to macro factor movements. We discussed that issue in our October 20 note.
- To assume the historical variables are normally or lognormally distributed with a constant mean and variance.
- To assume that the historical variables are independently and identically distributed over time.
- To use the phrase “that’s not intuitive to me” to override a valid model. We address this point on November 14, 2014.
We now focus on Federal Reserve CCAR 2015 examples.
A Check on the Accuracy of Model Assumptions
In this and subsequent sections, we use historical data provided by the Federal Reserve as part of the 2014 and 2015 CCAR stress testing exercises for illustration purposes. The code names and descriptions of the 28 underlying CCAR 2015 macro factors are defined briefly here and in full in the Federal Reserve October, 2014 information at the links above:
We start with a simple question, applied to the full time series spanned by the historical data released by the Federal Reserve in its 2014 and 2015 CCAR documentation. Are any of the variables normally distributed, either in terms of absolute level, quarterly changes or quarterly percentage changes? We know that normality can be an elusive goal, especially for small data sets. How far from this goal are we? We use four statistical tests to answer this question: the Shapiro-Wilk test, the Shapiro-Francia test, the skewness and kurtosis test, and the Kolmogorov-Smirnov test.
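As a rough illustration of how such a screen can be run, the sketch below applies three of the four tests to the level, the quarterly change, and the quarterly percentage change of a single series. It assumes NumPy and SciPy are available. The function names (`normality_pvalues`, `transforms`, `count_rejections`) and the 5% significance level are hypothetical choices for this example, not the procedure used to produce the results reported in this note. SciPy does not ship a Shapiro-Francia test, so that test is omitted, and SciPy's `normaltest` (the D'Agostino-Pearson omnibus test) stands in for the skewness and kurtosis test.

```python
import numpy as np
from scipy import stats


def normality_pvalues(x):
    """Return p-values of three normality tests for a 1-D sample."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]
    return {
        # Shapiro-Wilk test
        "shapiro_wilk": stats.shapiro(x)[1],
        # D'Agostino-Pearson omnibus test built on skewness and kurtosis
        "skew_kurtosis": stats.normaltest(x)[1],
        # Kolmogorov-Smirnov test against a normal with the sample mean and
        # standard deviation. Because the parameters are estimated from the
        # same data, this p-value is conservative (it rejects less often).
        "kolmogorov_smirnov": stats.kstest(
            x, "norm", args=(x.mean(), x.std(ddof=1))
        )[1],
    }


def transforms(levels):
    """Level, quarterly change, and quarterly percentage change."""
    levels = np.asarray(levels, dtype=float)
    change = np.diff(levels)
    return {
        "level": levels,
        "change": change,
        "pct_change": change / levels[:-1],  # assumes no zero levels
    }


def count_rejections(levels, alpha=0.05):
    """Number of tests rejecting normality for each transform of a series."""
    return {
        name: sum(p < alpha for p in normality_pvalues(series).values())
        for name, series in transforms(levels).items()
    }


# Hypothetical data: 100 quarters of a strictly positive random-walk series.
rng = np.random.default_rng(0)
demo_levels = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.05, size=100)))
print(count_rejections(demo_levels))
```

Running `count_rejections` on each macro factor in turn yields a tally of the same shape as the "rejected by all tests / by some tests / by none" summaries in this section, although with three tests instead of four.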
When we apply these four tests to the absolute level of the 28 CCAR variables, we find the following:
Normality hypothesis rejected by all 4 statistical tests
The hypothesis of normality is rejected by all four statistical tests for these 11 CCAR 2015 variables:
- jpyusd
- usdgbp
- nomgdpann
- uscpiann
- dowjones
- VIX
- homepriceindex
- creindex
- jpncpiann
- ukrealgdpann
- ukcpiann
Normality hypothesis rejected by 3 of the 4 statistical tests
The hypothesis of normality is rejected by 3 of the 4 statistical tests for these 11 CCAR variables:
- asiafx
- ust3moann
- usbbbann
- usmortann
- primerate
- realgdpann
- realdisincann
- nomdisincann
- usunemp
- eurorealgdp
- jpnrealgdpann
Generally speaking, the Kolmogorov-Smirnov test is the one of the four that is most likely not to reject the normality assumption.
Normality hypothesis rejected by 1 or 2 of the 4 statistical tests
The hypothesis of normality is rejected by only 1 or 2 of the 4 statistical tests for these three variables:
- usdeuro
- ust5yrann
- asiarealgdpann
Normality hypothesis is not rejected by any of the 4 statistical tests
The hypothesis of normality was not rejected by any of the 4 statistical tests for only 2 of the 28 CCAR variables, and they happen to be variables with the shortest time series for testing:
The results of the tests for all 28 variables are summarized in this table:
The probability values (“p-values”) for each test are given here:
Next, we check to see whether it is correct to assume that the changes in the values of variables are normally distributed. We apply the same four tests with these results:
In this case, the hypothesis of normally distributed quarterly changes is rejected by the four statistical tests as follows:
- Rejected by all 4 tests: 14 variables
- Rejected by 3 of 4 tests: 9 variables
- Rejected by 1 or 2 of the 4 tests: 3 variables
- Not rejected by any of the 4 tests: 2 variables
Next, we examine the hypothesis that the quarterly percentage changes are normally distributed, that is, that the variable itself has a lognormal distribution. We present the results in this table:
In this third case, the hypothesis of normally distributed quarterly percentage changes is rejected by the four statistical tests as follows:
- Rejected by all 4 tests: 10 variables
- Rejected by 3 of 4 tests: 10 variables
- Rejected by 1 or 2 of the 4 tests: 3 variables
- Not rejected by any of the 4 tests: 5 variables
Tests for Independent and Identically Distributed Variables
Along with the hypothesis of some form of normality, there is often an overpowering tendency to assume that the variables are independently and identically distributed. This assumption implies that there should be no autocorrelation in the residuals from period to period and that the assumption of a constant variance (i.e. no heteroskedasticity) is valid.
We can test whether these features are true or not using standard tests for autocorrelation and heteroskedasticity. We present the results for autocorrelation in the levels of the variables using the Breusch-Godfrey test. We also present the test results for the Breusch-Pagan and Cook-Weisberg tests for heteroskedasticity in the levels of the 28 CCAR 2014 variables.
The results show that the hypothesis of constant variance (no heteroskedasticity) cannot be rejected in only 6 of the 28 cases. The hypothesis of no autocorrelation is rejected 27 of 28 times.
We now perform the same tests on the changes in the values of the variables. The results are shown here:
The results show that the hypothesis of constant variance (no heteroskedasticity) cannot be rejected in 11 of the 28 cases when analyzing the quarterly change in the variables. The hypothesis of no autocorrelation is rejected 22 of 28 times.
We now perform the same tests on the percentage changes in the values of the variables. The results are shown in this chart:
The results show that the hypothesis of constant variance (no heteroskedasticity) cannot be rejected in 15 of the 28 cases when analyzing the percentage change. The hypothesis of no autocorrelation is rejected 22 of 28 times.
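The two kinds of test used in this section can be sketched with nothing more than NumPy and SciPy. The example below is a minimal illustration under stated assumptions, not the specification behind the results above: it assumes each series is regressed on a constant and a linear time trend, uses the LM form of the Breusch-Godfrey test with four lags, and uses the studentized (Koenker) LM form of the Breusch-Pagan test rather than the Cook-Weisberg score test. All function names are hypothetical.

```python
import numpy as np
from scipy import stats


def _ols_resid(y, X):
    """Residuals from an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def _r_squared(y, X):
    """Centered R-squared of an OLS regression of y on X."""
    resid = _ols_resid(y, X)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)


def trend_design(n):
    """Assumed regressors: a constant and a linear time trend."""
    return np.column_stack([np.ones(n), np.arange(n, dtype=float)])


def breusch_godfrey_pvalue(y, nlags=4):
    """LM test of the null hypothesis of no autocorrelation up to nlags."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = trend_design(n)
    e = _ols_resid(y, X)
    # Lagged residuals, with zeros filling the pre-sample values.
    lags = np.column_stack(
        [np.concatenate([np.zeros(j), e[:-j]]) for j in range(1, nlags + 1)]
    )
    lm = n * _r_squared(e, np.column_stack([X, lags]))
    return stats.chi2.sf(lm, df=nlags)


def breusch_pagan_pvalue(y):
    """Studentized LM test of the null hypothesis of constant variance."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = trend_design(n)
    e2 = _ols_resid(y, X) ** 2
    lm = n * _r_squared(e2, X)
    return stats.chi2.sf(lm, df=X.shape[1] - 1)


# Hypothetical data: a random walk, which is strongly autocorrelated in levels.
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=200))
print(breusch_godfrey_pvalue(random_walk), breusch_pagan_pvalue(random_walk))
```

A small p-value from `breusch_godfrey_pvalue` rejects the hypothesis of no autocorrelation, and a small p-value from `breusch_pagan_pvalue` rejects the hypothesis of constant variance. Applying both functions to the level, the change, and the percentage change of each variable gives counts comparable in form to those reported above.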
Testing the Implications of Model Assumptions
The model validation exercises that have been done so far require no software other than a good stat package, and yet they have provided a long check list of false assumptions. We now turn to another area that yields a rich check list of model validation: the implications of model assumptions.
It is often the case that, when a user chooses a set of assumptions that produce unreasonable outcomes, the user blames the software that reveals the bad outcomes instead of himself. A proper model validation effort will reveal bad outcomes before bad assumptions are fed into a sophisticated risk management system. One important check on model implications is to alert the analyst when a set of assumptions produces values that are “unreasonable” by an objective, quantitative criterion. The easiest such cases are outcomes that are literally impossible (like a negative unemployment rate, which could happen if the unemployment rate is assumed to be normally distributed). Another useful criterion flags as “unreasonable” any set of assumptions under which simulated values fall outside the highest and lowest values ever recorded with a very large probability.
To illustrate this procedure, we test the probability that the user-selected probability distributions fall outside of historical minimums and maximums for time horizons from 1 to 30 years using quarterly time periods. We take advantage of the formulas in the appendix to calculate the means and standard deviations, given the user’s assumptions, of either the level, the change, or the percentage change of the macro-economic variables at each time horizon. Here are the results:
Consider the U.S. Dollar/Euro exchange rate, variable 4. The user has chosen to model this as a normally distributed variable. Unfortunately, given the parameters selected, a Monte Carlo simulation of this variable will produce values that fall outside of the historical maximum and minimum 23.70% of the time in a one year simulation and 65.24% of the time in a 30 year simulation. These probabilities are career-threatening. They are the fault of the user, not of the software that delivered the bad news about the user’s choices.
The formulas in the appendix make this model validation check nearly painless.
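A minimal sketch of this check follows. It assumes independent period-to-period changes, so that the k-period mean is k times the one-period mean and the k-period standard deviation is the one-period standard deviation times the square root of k. The function name `prob_outside_range` and every parameter value in the usage lines are hypothetical; they are not the parameters behind the probabilities quoted above.

```python
import math


def _norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_outside_range(x0, m, sigma, k, hist_min, hist_max, lognormal=False):
    """Probability that the value after k periods lies outside the
    historical range [hist_min, hist_max].

    x0        starting value of the variable
    m, sigma  per-period mean and standard deviation of the change
              (or of the continuously compounded return if lognormal)
    k         number of periods in the horizon
    """
    if lognormal:
        # Work in logs: ln X(k) is normal with mean ln(x0) + k*m.
        start, lo, hi = math.log(x0), math.log(hist_min), math.log(hist_max)
    else:
        start, lo, hi = x0, hist_min, hist_max
    mean_k = start + k * m           # mean after k periods
    sd_k = sigma * math.sqrt(k)      # standard deviation after k periods
    below = _norm_cdf((lo - mean_k) / sd_k)
    above = 1.0 - _norm_cdf((hi - mean_k) / sd_k)
    return below + above


# Hypothetical exchange-rate example with quarterly periods.
for years in (1, 5, 30):
    p = prob_outside_range(
        x0=1.25, m=0.0, sigma=0.06, k=4 * years, hist_min=0.85, hist_max=1.60
    )
    print(f"{years:>2} years: {p:.2%} of simulated values outside history")
```

Because the standard deviation grows with the square root of the horizon while the historical range stays fixed, the reported probability rises with the horizon whenever the mean change is zero and the starting value lies inside the range.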
Conclusions
This note has covered an updated check list of common errors to be avoided when conducting calculations for a stress test, value at risk, or credit-adjusted value at risk. We use the historical data provided by the Federal Reserve to banks subject to the CCAR 2014 and 2015 stress tests. We have shown in the previous pages that there is a long list of well-intended but erroneous modeling choices that are made frequently. By avoiding these common errors, institutions subject to both regulatory and managerial stress tests can avoid the ultimate risk management nightmare: running a perfect hedge on the wrong number.
Appendix: Useful Formulas for Normal and Lognormal Distributions in Simulations
Lognormal distribution of returns
Consider a security with value S(0) at time zero and value S(k) after a forward-looking simulation of k periods. One common assumption about the distribution of securities returns is that they are lognormal with a mean, volatility, and correlation derived from historical data. If the mean return assumed for each period is a constant m and the standard deviation of returns during the period is σ, the simulated value of the security after k periods is
The variable r(m,σ,i) is the random, simulated return on the security in the ith period. The mean return is of course m and the periodic variance is σ². We can solve for the probability distribution of S(k) by taking the log of both sides of the expression above and simplifying using the properties of the normally distributed returns, which we assume are independent from period to period.
Rearranging and taking expectations gives us the mean continuously compounded return over the k periods:
The variance of the continuously compounded return over the k periods is
And the standard deviation of the continuously compounded return over k periods is
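The expressions described in this subsection can be written out as follows. This is a reconstruction from the definitions in the text, assuming that r(m,σ,i) is the continuously compounded return in period i and that returns are independent across periods.

$$
S(k) = S(0)\,\exp\!\left(\sum_{i=1}^{k} r(m,\sigma,i)\right)
$$

$$
\ln\!\left[\frac{S(k)}{S(0)}\right] = \sum_{i=1}^{k} r(m,\sigma,i)
$$

$$
E\!\left[\ln\frac{S(k)}{S(0)}\right] = k\,m,
\qquad
\operatorname{Var}\!\left[\ln\frac{S(k)}{S(0)}\right] = k\,\sigma^{2},
\qquad
\operatorname{SD}\!\left[\ln\frac{S(k)}{S(0)}\right] = \sigma\sqrt{k}
$$

For example, with a quarterly σ of 5%, the standard deviation of the continuously compounded return over one year (k = 4) would be 10%.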
Normally distributed variables in a simulation
What if another variable X is normally distributed from period to period with changes c(m,σ,i) in period i with a mean of m again and a standard deviation of σ? Then the value of X(k) after k periods is
Since the changes are again assumed to be independent, the mean, variance and standard deviation of the change in X over k periods can be written as follows:
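Under the same independence assumption, and as a reconstruction from the definitions in the text, the value of X after k periods and the moments of its cumulative change are:

$$
X(k) = X(0) + \sum_{i=1}^{k} c(m,\sigma,i)
$$

$$
E\!\left[X(k) - X(0)\right] = k\,m,
\qquad
\operatorname{Var}\!\left[X(k) - X(0)\right] = k\,\sigma^{2},
\qquad
\operatorname{SD}\!\left[X(k) - X(0)\right] = \sigma\sqrt{k}
$$

These are the same scaling rules as in the lognormal case, applied to the change in the level of X rather than to the log of a price ratio.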