The author wishes to thank Keith Luna of Western Asset Management Company for helpful comments that led to this new version.
In Part 10 of this series, we present the final installment in yield curve and forward rate smoothing techniques before moving on to smoothing credit spreads. We describe the maximum smoothness forward rate technique introduced by Adams and van Deventer (1994) and corrected in van Deventer and Imai (1996), which we call Example H. We explain why a quartic function is needed to maximize smoothness of the forward rate function over the full length of the forward rate curve. The conclusion parallels an earlier finding: a curve constructed from twice differentiable yield curve segments produces a shorter yield curve over the full length of the curve than a “curve” with linear segments, even though the shortest path between any two points is a straight line.
Finally, we compare 23 different techniques for smoothing yields and forward rates that have been discussed in this series and show why the maximum smoothness forward rate approach is the best technique by multiple criteria.
Sample Data for the Basic Building Blocks of Yield Curve Smoothing
We continue to insist in Part 10 of our series that any smoothing technique that does not fit the market exactly is unacceptable for practical use (such as the flawed Nelson-Siegel approach). We continue to fit this raw data with our derived “best” yield curve and associated forward rates.
Example H: Maximum Smoothness Forward Rates and Related Yields
As always in this series, “yields” are continuously compounded zero coupon bond yields and “forwards” are the continuous forward rates that are consistent with the yield curve. In Part 9 of our series, instead of deriving which yield or forward rate formulation was best by a mathematical criterion, we asserted that cubic forward rates would be better than the other techniques we’d tried before and promptly proved that assertion to be wrong. As shown below, cubic forward rates in various forms rank between 6th smoothest and 16th smoothest of the 23 techniques we list below. For that reason, we revert to the same process used through this series: we define our criterion for “best curve” and specify what constraints we impose on the “best” technique to fit our desired trade-off between simplicity and realism. We then answer the nine questions first posed in Part 2 of this series. We continue to insist that the curves be twice differentiable over the full length of the curve, but in addition, we add a new condition for a very special reason—we now also require that the forward rate curves be thrice differentiable. We explain why below. Having done this, we again use the insights of Janosi and Jarrow (2002) to optimize key parameters to derive the maximum smoothness and minimum length forward rate curves given our specification. Unlike cubic splines of forward rates, where the curves performed very differently depending on whether the criterion was “maximum smoothness” or “maximum tension/minimum length,” we find, using quartic splines of forward rates, that results are excellent both in terms of smoothness and length or tension. Here are the answers to our normal 9 questions for Example H, quartic splines of forward rates:
Step 1: Should the smoothed curves fit the observable data exactly?
1a. Yes
1b. No
1a. Yes. Our answer is unchanged. Since the Nelson-Siegel specification cannot meet this minimum standard, as outlined in earlier blogs in this series, it is not included in the comparisons below.
Step 2: Select the element of the yield curve and related curves for analysis
2a. Zero coupon yields
2b. Forward rates
2c. Continuous credit spreads
2d. Forward continuous credit spreads
2b. Forward rates is the choice for Example H. In addition to making this choice for empirically improving smoothness by changing parameter values, we add other “constraints” which improve our ability to maximize smoothness over the full length of the curve. We continue to observe that we would never choose 2a or 2b to smooth a curve where the underlying securities issuer is subject to default risk. In that case, we would make the choices in either 2c or 2d. We do that later in this series.
Step 3: Define “best curve” in explicit mathematical terms
3a. Maximum smoothness
3b. Minimum length of curve
3c. Hybrid approach
3a. Maximum smoothness. In Part 9 of this series, we were ambivalent about our criterion for “best,” reporting on results for “smoothest” as well as “minimum length” forward curves because we were (for the first and last time) simply asserting a technique was best without proof, much like advocates of the Nelson-Siegel approach. Here, in Part 10, we unambiguously choose maximum smoothness as the primary criterion for “best.” Within that class of functions that are maximum smoothness, we also take the opportunity to iterate parameter values to show which of the infinite number of “maximum smoothness” forward curves have “minimum length” and “maximum smoothness” for each set of constraints we impose. The two measures are once again very consistent, unlike the results we found for cubic forward rates in Part 9 of this series.
As noted in Part 8 of this series, we have critiqued the results from some yield curve smoothing techniques because of the lack of smoothness in either yields, forward rates, or both. Smoothness is defined as the variable z such that

$$z = \int_0^T \left[ g''(s) \right]^2 ds$$

where T is the longest maturity on the curve.
The smoothest possible function has the minimum z value. Since a straight line has a second derivative of zero, z is zero in that case and a straight line is perfectly smooth. In this case, the function g(s) is defined as the forward rate curve. In order to evaluate z over the full maturity spectrum of the forward curve, the forward rate segments must be at least twice differentiable at each point, including the knot points. We want to minimize z over the full length of the yield curve. Our answers in Steps 4-6 make the analytical valuation of z possible just as when we evaluated length in example D.
Here we call attention to the work of Adams and van Deventer (“Fitting Yield Curves and Forward Rate Curves with Maximum Smoothness.” Journal of Fixed Income, June 1994), as corrected in van Deventer and Imai (Financial Risk Analytics: A Term Structure Model Approach for Banking, Insurance, and Investment Management, Irwin Professional Publishing, Chicago, 1997). The maximum smoothness forward rate technique is also discussed in detail in van Deventer, Imai and Mesler (Advanced Financial Risk Management, John Wiley & Sons, 2004, translated into modern Chinese and published by China Renmin University Press, Beijing, 2007). We refer here to Chapter 8 of Advanced Financial Risk Management.
Recall the discussion of cubic splines that we first identified in Example F:
http://en.wikipedia.org/wiki/Spline_interpolation
http://en.wikipedia.org/wiki/Spline_%28mathematics%29
It says in the first link, “Amongst all twice continuously differentiable functions, clamped and natural cubic splines yield the least oscillation about the function [g] which is interpolated.” One can use this conclusion to prove that the smoothest yield curve that one can draw between yields is a series of curves that are cubic splines of yields. Similarly, the smoothest line which one can draw between zero coupon bond prices is a series of curves that are cubic splines of zero coupon bond prices.
Why, then, did we find in Part 9 that cubic splines of forward rates are not the smoothest forward rate curves we can draw of all twice differentiable functions? Note the chart below that shows the cubic spline of forward rates is at best, depending on the constraints used, the sixth best function we can draw. Why is this so? The reason is that our market data comes (equivalently) in the form of zero coupon yields or zero coupon bond prices. It does not come in the form of forward rates, which we cannot observe directly. Instead, forward rates are related to zero coupon yields and bond prices by the same four relationships that we have been using throughout this series:
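For reference, these relationships are commonly written as follows, where P(t) is the zero coupon bond price, y(t) is the continuously compounded zero coupon yield, and f(t) is the continuous forward rate at maturity t. The ordering shown here is an assumption.

```latex
\begin{align}
y(t) &= -\frac{1}{t}\,\ln P(t) \\
P(t) &= \exp\!\left[-\int_0^t f(s)\,ds\right] \\
y(t) &= \frac{1}{t}\int_0^t f(s)\,ds \\
f(t) &= y(t) + t\,y'(t)
\end{align}
```

The last line shows why forward rates behave like a “derivative” of zero yields: any lack of smoothness in y'(t) is passed directly into f(t).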
Given that forward rates are in effect a “derivative” of zero yields and zero prices, Oldrich Vasicek uses the calculus of variations to prove that the smoothest forward rate function that can be derived is a thrice differentiable function, the quartic spline of forward rates. This proof, given in Appendix B of Chapter 8 of Advanced Financial Risk Management, was corrected from the original version thanks to helpful comments from Volf Frishling of the Commonwealth Bank of Australia and further improved by the insights of Kamakura’s Robert A. Jarrow. The choice of the quartic form is a derivation of the proof, shown on pages 162-163 of Advanced Financial Risk Management. Similarly, in order to achieve maximum smoothness, the same proof shows that the curves fitted together MUST be thrice differentiable over the entire length of the curve, which obviously includes the knot points. For that reason, we impose this constraint below, and find that it substantially improves the results that we got using cubic splines of forward rates in Part 9.
We found a similar result earlier in this series, where we discovered that curves which were “maximum tension/minimum length” between any two points on the curve (in terms of linear forward rates) were not “maximum tension/minimum length” over the full length of the forward curve. In fact linear forward rate curves that join at the knot points but are not differentiable there rank only 18th best of the 23 techniques we rank by length below. For the same reason, our restriction to cubic forward rate segments in Part 9, Example G, was too rigid to allow the forward rate curve to bend to maximum smoothness.
Now, thanks to Oldrich Vasicek, with help from Frishling and Jarrow, this problem has been solved and we apply it here in answering our standard 9 questions:
Step 4: Is the curve constrained to be continuous?
4a. Yes
4b. No
4a. Yes. As in Examples B and later, we insist on continuous forwards and see what this implies for yields.
Step 5: Is the curve differentiable?
5a. Yes
5b. No
5a. Yes. Again, this is the change first imposed in Example D. We seek to take the spikes out of forward rates by requiring that the first derivatives of two curve segments be equal at the knot point where they meet. This constraint also means that a linear “curve” isn’t sufficiently rich to satisfy the constraints we’ve imposed.
Step 6: Is the curve twice differentiable?
6a. Yes
6b. No
6a. Yes. As in Part 8, where we applied this constraint to the yield segments, we again insist that the forward rate curve be twice differentiable everywhere along the curve, including the knot points. This would allow us to evaluate smoothness analytically if we wished to do so, and it ensures that the forward rate curve segments will join in a visibly smooth way at the knot points.
Step 7: Is the curve thrice differentiable?
7a. Yes
7b. No
7a. Yes. For the first time, we impose this constraint because we have a mathematical proof that we will not obtain a unique maximum smoothness forward curve unless we do impose it. This was one of the many things we learned from Part 9 of our series on the cubic spline of forward rates, where the curve is not thrice differentiable at the knot points.
Step 8: At the spot date, time 0, is the curve constrained?
8a. Yes, the first derivative of the curve is set to zero or a non-zero value x.
8b. Yes, the second derivative of the curve is set to zero or a non-zero value y.
8c. No
8a and 8b. Both approaches will be used and compared. Like Example F in Part 8 of this series, we find it necessary to constrain the forward rate curve at its left hand side and right hand side in order to derive a unique set of coefficients for each segment of the forward rate curve. We compare results using both answers 8a and 8b and reach conclusions about which is “best” for the sample data we are analyzing. The answers to question 8 can have a critical impact on the reasonableness of splines for forward rates in a financial context. In fact, in this example, we need to use 3 of the four constraints listed in questions 8 and 9.
Step 9: At the longest maturity for which the curve is derived, time T, is the curve constrained?
9a. Yes, the first derivative of the curve is set to zero or a non-zero value j at time T.
9b. Yes, the second derivative of the curve is set to zero or a non-zero value k at time T.
9c. No
9a and 9b. Both approaches will be used and compared. For uniqueness of the coefficients of the quartic forward rate segments, we again have to choose 3 of the four constraints in questions 8 and 9. We then optimize the parameters used in these constraints to achieve “the best” forward rate curve as suggested by Janosi and Jarrow (2002). The constraints here are again imposed on the forward rate curve, not the yield curve. We will use both “maximum smoothness” and “minimum length” as criteria for best, conditional on our choice of quartic forward curve segments and the other constraints we impose. Contrary to the results in Part 9, Example G, we are reassured in Part 10 to see that the maximum smoothness forward rate technique produces results that are both the smoothest of any twice differentiable functions and the shortest length/maximum tension over the full length of the curve.
Deriving the Parameters of the Quartic Forward Rate Curves Implied by Example H Assumptions
We know from the corrected proof in Adams and van Deventer (1994) that our assumptions imply a quartic forward rate curve that is thrice differentiable over the full length of the forward rate curve, including the knot points. Our data set has observable yield data at maturities of 0, 0.25 years, 1, 3, 5 and 10 years. We have 6 knot points in total and 4 interior knot points (0.25, 1, 3, and 5 years) where the curves that join must have equal first, second and third derivatives.
That means that we need to step up to a functional form for each forward rate curve segment that has more parameters than the quadratic segments we used in Examples D and E and the cubic splines used in Examples F and G. The following two links to Wikipedia articles nicely summarize the progression from linear to quadratic to cubic splines, as we mentioned in Example F:
http://en.wikipedia.org/wiki/Spline_interpolation
http://en.wikipedia.org/wiki/Spline_%28mathematics%29
As in Example F in Part 8 of this series, we can measure the smoothness of a forward rate curve in two ways. First, we could explicitly evaluate the integral that defines the smoothness statistic z since our constraints for Example H will result in a forward rate curve that is twice differentiable over its full length. The other alternative is to use a discrete approximation to evaluation of the integral, like we did in Examples F and G. This is attractive because it allows us to calculate smoothness for the other examples in this series for which the relevant functions are not twice differentiable. For that reason, we use the same discrete measure of smoothness at 1/12 year maturity intervals over the 120 months of the forward curve and yield curves that we derive here. That discrete approximation uses these facts:
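For any function x, the first difference between points i and i-1 is

$$\Delta x_i = x_i - x_{i-1}$$

The first derivative can be approximated by

$$x'_i \approx \frac{x_i - x_{i-1}}{\Delta}$$

where $\Delta$ is the spacing between points (1/12 of a year here). The second derivative is

$$x''_i \approx \frac{x'_i - x'_{i-1}}{\Delta} = \frac{x_i - 2x_{i-1} + x_{i-2}}{\Delta^2}$$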
To evaluate smoothness numerically, we calculate the second derivative at 1/12 year intervals, square it, and sum over the full length of the yield curve. With this background out of the way, we now derive the forward rate curve that is consistent with maximum smoothness, given the constraints that we have imposed.
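The discrete measures can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet calculation used for the rankings: the function names are hypothetical, the length measure assumes the usual straight-line distance between adjacent grid points, and the scaling of the smoothness statistic (a plain sum of squared second differences divided by the squared step) is an assumption that may differ from the article's by a constant factor.

```python
import numpy as np

def discrete_smoothness(values, step=1.0 / 12.0):
    """Sum of squared second-difference estimates of the second derivative."""
    x = np.asarray(values, dtype=float)
    second = np.diff(x, n=2) / step ** 2   # (x_i - 2 x_{i-1} + x_{i-2}) / step^2
    return float(np.sum(second ** 2))

def discrete_length(values, step=1.0 / 12.0):
    """Sum of straight-line distances between adjacent points on the curve."""
    x = np.asarray(values, dtype=float)
    return float(np.sum(np.sqrt(step ** 2 + np.diff(x) ** 2)))

# Monthly grid over 10 years: 121 points, 120 intervals.
grid = np.arange(121) / 12.0
straight_line = 4.0 + 0.2 * grid      # hypothetical linear forward curve
parabola = grid ** 2                  # second derivative is 2 everywhere

line_smoothness = discrete_smoothness(straight_line)
parabola_smoothness = discrete_smoothness(parabola)
flat_length = discrete_length(np.full(121, 5.0))
```

A straight line scores essentially zero on smoothness, as the definition of z requires, and a flat curve over 10 years has length 10. Because neither function needs derivatives of the underlying curve, both can be applied to step functions and linear segments as well as to splines.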
Each forward rate curve segment has the quartic form
The subscript i refers to the segment number. The segment from 0 to 0.25 years is segment 1, the segment from 0.25 years to 1 year is segment 2, and so on. The first constraint requires the first forward rate curve segment to be equal to the observable value of y at time zero since y(0)=f(0).
where for this first constraint tj = 0. In addition we have four constraints that require the forward rate curves to be equal at the four interior knot points:
at the interior knot points. We rearrange these four constraints, for j=1,4 like this:
At each of these interior knot points, the first derivatives of the two forward rate segments that join at that point must also be equal:
When we solve for the coefficients, we will rearrange these four constraints in this manner:
Next, we need to impose the constraint that the second derivatives of the joining forward rate curve segments are equal at each of the four interior knot points. This requires
As usual we rearrange this as follows:
Finally, the new constraint that the third derivatives be equal at each knot point requires that
We arrange this constraint to read as follows:
So far, we have 1+4+4+4+4=17 constraints to solve for 5×5=25 coefficients. We have 5 more constraints that are similar to those used in Parts 5, 7 and 9 of this series:
For our 23rd, 24th and 25th constraints, we can choose any 3 of the constraints in 8a, 8b, 9a, and 9b. We choose three combinations that are very common in financial applications where it is logical and realistic to expect the forward rate curve to be flat at the right hand side of the yield curve, f’(10)=0 in this example, where tj+1=10.
Our 24th and 25th constraints come from setting the second derivative to x1 on the left hand side of the curve, where tj=0, and to x2 on the right hand side of the curve, where tj+1=10:
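For reference, the full set of 25 constraints described above can be collected in one place. The notation is an assumption made for this summary only: segment i (i = 1, ..., 5) covers the interval from t_{i-1} to t_i, with knots t_0 = 0, t_1 = 0.25, t_2 = 1, t_3 = 3, t_4 = 5 and t_5 = 10, and the coefficient names c_i and d_{i1} through d_{i4} are illustrative labels. Constraints 18 to 22 are taken here to be the integral conditions that tie each segment to the observed zero coupon bond prices.

```latex
\begin{align*}
& f_i(t) = c_i + d_{i1} t + d_{i2} t^2 + d_{i3} t^3 + d_{i4} t^4,
  \qquad t_{i-1} \le t \le t_i \\[4pt]
\text{(1)}\quad & f_1(0) = y(0) \\
\text{(2--5)}\quad & f_j(t_j) - f_{j+1}(t_j) = 0, \qquad j = 1,\dots,4 \\
\text{(6--9)}\quad & f_j'(t_j) - f_{j+1}'(t_j) = 0, \qquad j = 1,\dots,4 \\
\text{(10--13)}\quad & f_j''(t_j) - f_{j+1}''(t_j) = 0, \qquad j = 1,\dots,4 \\
\text{(14--17)}\quad & f_j'''(t_j) - f_{j+1}'''(t_j) = 0, \qquad j = 1,\dots,4 \\
\text{(18--22)}\quad & \int_{t_{i-1}}^{t_i} f_i(s)\,ds
  = -\ln\!\left[\frac{P(t_i)}{P(t_{i-1})}\right]
  = t_i\,y(t_i) - t_{i-1}\,y(t_{i-1}), \qquad i = 1,\dots,5 \\
\text{(23)}\quad & f_5'(10) = 0 \\
\text{(24)}\quad & f_1''(0) = x_1 \\
\text{(25)}\quad & f_5''(10) = x_2
\end{align*}
```

Every constraint is linear in the 25 unknown coefficients, which is why the whole problem reduces to a single matrix equation.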
Our initial implementation, which we label Example H-Qf1a (quartic forwards 1a), will use x1=x2=0. Our second implementation, Example H-Qf1b, uses the insights of Janosi and Jarrow (2002) to optimize x1 and x2 to minimize the length of the forward curve. The third implementation, Example H-Qf1c, optimizes x1 and x2 to maximize the smoothness of the curve by minimizing the function z given above on a discrete 1/12 of a year basis.
In matrix form, with apologies to those readers with less than perfect vision, our constraints for example H-Qf1a look like this:
Note that the last three elements of the “y Vector” are where we have set the constraints that the second derivatives at time zero and at 10 years, and the first derivative of the forward rate curve at 10 years, be zero. When we invert the coefficient matrix, we get the following result:
We solve for the following coefficient values:
These coefficients give us the five quartic functions that make up the forward rate curve. We use the second and third relationships below to derive the yield curve implied by the derived coefficients for the forward rate curve:
The yield function for any segment j is
We note that y* denotes the observable value of y at the left hand side of the line segment where the maturity is tj. Within the segment, y is a quintic function of t, divided by t.
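The base case can be sketched in Python as a single linear solve followed by the yield calculation just described. This is an illustration under stated assumptions rather than the spreadsheet used here: the yield values in `YIELDS` are hypothetical placeholders at the article's knot maturities, the function names are invented for this sketch, and each segment is written in powers of t measured from time zero.

```python
import numpy as np

# Hypothetical illustration data (yields in percent), not the article's table.
KNOTS = np.array([0.0, 0.25, 1.0, 3.0, 5.0, 10.0])
YIELDS = np.array([4.0, 4.75, 4.5, 5.5, 5.25, 6.5])

def d_row(t, order=0):
    """Order-th derivative of the basis [1, t, t^2, t^3, t^4] at t."""
    row = np.zeros(5)
    for p in range(order, 5):
        row[p] = np.prod(np.arange(p, p - order, -1)) * t ** (p - order)
    return row

def int_row(a, b):
    """Integral of each basis term [1, t, ..., t^4] from a to b."""
    p = np.arange(5)
    return (b ** (p + 1) - a ** (p + 1)) / (p + 1)

def fit_quartic_forwards(knots, yields, slope_T=0.0, x1=0.0, x2=0.0):
    """Solve the 5n x 5n system; returns one row of 5 coefficients per segment."""
    n = len(knots) - 1
    A, b = np.zeros((5 * n, 5 * n)), np.zeros(5 * n)
    A[0, :5], b[0] = d_row(knots[0]), yields[0]        # f_1(0) = y(0)
    r = 1
    for order in range(4):                             # value and 3 derivatives
        for j in range(1, n):                          # match at interior knots
            A[r, 5 * (j - 1):5 * j] = d_row(knots[j], order)
            A[r, 5 * j:5 * (j + 1)] = -d_row(knots[j], order)
            r += 1
    for j in range(n):                                 # fit observed yields
        A[r, 5 * j:5 * (j + 1)] = int_row(knots[j], knots[j + 1])
        b[r] = yields[j + 1] * knots[j + 1] - yields[j] * knots[j]
        r += 1
    A[r, -5:], b[r] = d_row(knots[-1], 1), slope_T     # f'(T)
    A[r + 1, :5], b[r + 1] = d_row(knots[0], 2), x1    # f''(0)
    A[r + 2, -5:], b[r + 2] = d_row(knots[-1], 2), x2  # f''(T)
    return np.linalg.solve(A, b).reshape(n, 5)

def zero_yield(t, knots, yields, coef):
    """y(t) = [y*(t_j) t_j + integral of f_j from t_j to t] / t."""
    if t <= 0.0:
        return float(yields[0])
    j = min(max(np.searchsorted(knots, t, side="left") - 1, 0), len(coef) - 1)
    return float((yields[j] * knots[j] + int_row(knots[j], t) @ coef[j]) / t)

coef = fit_quartic_forwards(KNOTS, YIELDS)
fitted = [zero_yield(t, KNOTS, YIELDS, coef) for t in KNOTS]
print(np.round(fitted, 6))   # should reproduce the input yields at the knots
```

Using `np.linalg.solve` rather than forming the explicit inverse is a numerical convenience; the result is the same set of coefficients that the matrix inversion would give.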
We can now plot the yield and forward rate curves to see the realism of our assumptions about maximum smoothness and the related constraints we have imposed. The results show something very interesting. When the infinite number of forward rate curves that are maximum smoothness subject to constraints 23, 24 and 25 are compared, the real trade-off comes from sacrificing smoothness at the left hand side of the curve in order to get shorter length overall, with a smaller rise in the forward curve at the 10 year point. Example H-Qf1a, where we have set all three of constraints 23, 24 and 25 to zero, falls between the two optimized results at the 10 year point:
The results for all three examples are far superior to what we saw in Example G, where there were negative forward rates in some examples. The reason the results in Example G were so bad was that we made (intentionally) bad choices that many readers might have thought were good. Just as with linear forward rate segments, where we impose continuity and minimum length, we get bad results when we impose a cubic spline and maximum smoothness. The reason in both cases was that, even though we chose a sensible objective, we constrained the form of the line segments too severely to get reasonable results. The problem was the analyst’s choices, not the objective functions used. In Example H, that problem has gone away.
When we compare the three yield curves from our three Example H alternatives, again the results are very reasonable:
Next we compare our base case, with assumptions 23, 24 and 25 all set to zero, with the arbitrary Nelson-Siegel formulation. The Nelson-Siegel functions are simply two lines drawn on a page with no validity because they don’t match the observable data, which are the black dots on the yield curve derived from our base case:
We continue to be surprised that the Nelson-Siegel technique is used in academia given the fact that it is both inaccurate and more complex in calculation: the Nelson-Siegel formulation is incapable of fitting an arbitrary set of data and requires a non-linear optimization. The maximum smoothness forward rate approach, by contrast, fits the data perfectly in every case and requires (in our base case) just a matrix inversion.
For Example H-Qf1b, when we use a simple spreadsheet optimizer to iterate on x1 and x2 to derive the maximum smoothness forward rate curve of shortest length, we get these coefficients:
When we iterate using the Example H-Qf1c objective, maximum smoothness, we get the following set of coefficients for our five quartic forward rate segments:
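The smoothness optimization can also be sketched without an iterative optimizer. The sketch below swaps in a different technique from the spreadsheet iteration described here: because every constraint is linear, the coefficients are an affine function of x1 and x2, so the discrete smoothness statistic is quadratic in (x1, x2) and can be minimized by linear least squares. The yield values are hypothetical placeholders, all names are invented for the sketch, and the scaling of the statistic is an assumption. Minimizing length (Example H-Qf1b) is not quadratic and would still need a nonlinear optimizer, which is not shown.

```python
import numpy as np

# Hypothetical illustration data (yields in percent), not the article's table.
KNOTS = np.array([0.0, 0.25, 1.0, 3.0, 5.0, 10.0])
YIELDS = np.array([4.0, 4.75, 4.5, 5.5, 5.25, 6.5])
STEP = 1.0 / 12.0

def d_row(t, order=0):
    """Order-th derivative of the basis [1, t, t^2, t^3, t^4] at t."""
    row = np.zeros(5)
    for p in range(order, 5):
        row[p] = np.prod(np.arange(p, p - order, -1)) * t ** (p - order)
    return row

def build_system(knots, yields):
    """Matrix A and right-hand side b0 with constraints 23-25 set to zero."""
    n = len(knots) - 1
    A, b0 = np.zeros((5 * n, 5 * n)), np.zeros(5 * n)
    A[0, :5], b0[0] = d_row(knots[0]), yields[0]
    r = 1
    for order in range(4):
        for j in range(1, n):
            A[r, 5 * (j - 1):5 * j] = d_row(knots[j], order)
            A[r, 5 * j:5 * (j + 1)] = -d_row(knots[j], order)
            r += 1
    p = np.arange(5)
    for j in range(n):
        A[r, 5 * j:5 * (j + 1)] = (knots[j + 1] ** (p + 1) - knots[j] ** (p + 1)) / (p + 1)
        b0[r] = yields[j + 1] * knots[j + 1] - yields[j] * knots[j]
        r += 1
    A[r, -5:] = d_row(knots[-1], 1)       # constraint 23: f'(T) = 0
    A[r + 1, :5] = d_row(knots[0], 2)     # constraint 24: f''(0) = x1
    A[r + 2, -5:] = d_row(knots[-1], 2)   # constraint 25: f''(T) = x2
    return A, b0

def second_difference_operator(knots, step=STEP):
    """Matrix M with M @ coefficients = discrete second derivatives on the grid."""
    n = len(knots) - 1
    times = np.arange(int(round(knots[-1] / step)) + 1) * step
    G = np.zeros((len(times), 5 * n))
    for r, t in enumerate(times):
        j = min(np.searchsorted(knots, t, side="right") - 1, n - 1)
        G[r, 5 * j:5 * (j + 1)] = d_row(t)
    return np.diff(G, n=2, axis=0) / step ** 2

A, b0 = build_system(KNOTS, YIELDS)
M = second_difference_operator(KNOTS)
E = np.zeros((len(b0), 2))
E[-2, 0] = 1.0                      # unit change in x1
E[-1, 1] = 1.0                      # unit change in x2
c0 = np.linalg.solve(A, b0)         # base case coefficients, x1 = x2 = 0
U = np.linalg.solve(A, E)           # sensitivity of coefficients to x1, x2
x_opt = np.linalg.lstsq(M @ U, -(M @ c0), rcond=None)[0]

def smoothness(x):
    """Discrete smoothness statistic as a function of (x1, x2)."""
    return float(np.sum((M @ (c0 + U @ np.asarray(x, dtype=float))) ** 2))

z_base, z_opt = smoothness([0.0, 0.0]), smoothness(x_opt)
print("x1, x2:", x_opt, "base z:", z_base, "optimized z:", z_opt)
```

The optimized curve still fits every observed yield exactly, because changing x1 and x2 only alters the right-hand side of constraints 24 and 25.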
Comparing Yield Curve and Forward Rate Smoothing Techniques
As we have discussed throughout this blog, the standard process for evaluating yield curve and forward rate curves should involve the mathematical statement of the objective of smoothing, the imposition of constraints that are believed to generate realistic results, and then the testing of the results to establish whether they are reasonable or not. In Part 11 of this blog, we discuss the “Shimko Test” used in Adams and van Deventer (1994) for measuring the accuracy of yield curve and forward rate smoothing techniques on large amounts of data. Today, we commit the same sin that is common among smoothing analysts with an affection for math over data—we analyze only one case, our base case, and reach some tentative conclusions whose validity can be confirmed or denied on huge masses of data using the approach in Part 11 of this series.
We start by stating what we know as a fact: that the maximum smoothness forward rate approach can be formally proven as the smoothest forward rate curve that can be drawn, conditional on the constraints imposed in the answers to questions 8 and 9. This insight, which is simply a fact, is due to Oldrich Vasicek with helpful comments from Volf Frishling and Robert A. Jarrow. This fact shouldn’t be overlooked for it’s a powerful one.
Similarly, the mathematical criterion for minimum length and the calculus of variations can be used to derive the functional form of the forward rate curve that produces minimum length, conditional on the constraints imposed. We leave this exercise to the reader.
In the next sections, we report the smoothness results and length results for 23 variations on the smoothing techniques in this series.
Ranking 23 Smoothing Techniques by Smoothness of the Forward Rate Curve
The chart below ranks 23 smoothing techniques on their effectiveness in lowering the discrete smoothing statistic as much as possible as explained above:
As we would expect, the best result was the maximum smoothness forward rate approach where the optimization of the 24th and 25th constraints was done with respect to smoothness. The second best approach was the maximum smoothness base case, Example H-Qf1a, where we set the derivatives in constraints 23, 24, and 25 to zero. Technique 17 was also one of the maximum smoothness alternatives, where we intentionally sacrificed the smoothness of the curve (most notably on the short end of the curve) to minimize length. Linear yield smoothing produced the worst results when smoothness is the criterion for best.
Ranking 23 Smoothing Techniques by Length of the Forward Curve
Next we report on the discrete approximation for length of the forward curve if its length is the sole criterion. The results for 23 techniques are shown below:
Not surprisingly, if one doesn’t insist on continuity or “smoothness” at all, one gets a short forward curve. Ranks 1-4 are taken by a yield/forward step function and quadratic curve fitting. Among all of the techniques which are at least twice differentiable, the maximum smoothness forward rate technique, optimized to minimize length, was the winner. There is no need to sacrifice one attribute (length) for another (smoothness). In this example, the same functional form (quartic forward rates) can produce the winner by either criterion.
If we look at both attributes, which techniques best balance the trade-offs between smoothness and length? We turn to that question next.
Trading Off Smoothness versus the Length of the Forward Rate Curve
In the graph below, we plot the 23 smoothing techniques on an XY graph where both length and smoothness of the forward curve are relevant. We truncated the graph to eliminate some extreme outliers.
A reasonable person would argue that the “best” techniques by some balance of smoothness and length would fall in the lower left hand corner of the graph. There are five techniques with a forward rate curve with smoothness below 7,000 and length below 30.00:
We recognize that the cut-off points are completely arbitrary, and we present an alternative to this process in Part 11 in this series. Nonetheless it is interesting to see that the “best five” consist of 2 quartic forward rate smoothing approaches, ranked first and second by smoothness, and 3 cubic yield spline approaches. Of this “best 5” group, the quartic forward rate approach, optimized for smoothness, was the smoothest. The quartic spline approach, with all derivatives in constraints 23-25 set to zero, was the shortest.
How do these techniques compare when forward rates are plotted together? First, to the “gang of 5” we add the quartic spline approach where the optimization intentionally sacrifices smoothness for short length, even though the resulting smoothness statistic was outside the cut-off for the “gang of 5”:
The results are much like trying to judge the Miss Universe contest, where every entrant to the contest is attractive. Most observers would select the blue curve that flattens on the right hand side of the curve or the purple line slightly below it. These two techniques are the special case where the derivatives in constraints 23-25 of the quartic spline are set to zero (discussed in Adams and van Deventer, 1994) and the case where the quartic spline of forward rates is optimized for length. Again, the quartic spline produces the two most visually pleasing results from our small sample of 1 set of “market data.”
My good friend David Shimko responded to an early draft of the Adams and van Deventer paper by saying “I don’t care about a mathematical proof of ‘best,’ I want something that would have best estimated a data point that I intentionally leave out of the smoothing process. This to me is proof of which technique is most realistic.” A statistician would add, “And I want something that is most realistic on a very large sample of data.” We turn to how to do that in Part 11 of this series.
Donald R. van Deventer
Kamakura Corporation
Honolulu, January 5, 2010