
Kamakura Blog

  
May 24

Written by: Donald van Deventer
5/24/2009 4:20 PM 

One of the most frequently asked questions when people review predictive models of default is this: “Aren’t those explanatory variables correlated, and doesn’t this create problems with multi-collinearity?” Almost every default model has correlated explanatory variables, so the question comes up often. Since I am not an econometrician (although many of my colleagues are), this post answers it with quotes drawn from a review of nine popular econometrics texts (it was a three-day weekend in the USA).

We consulted the following popular (and intelligent) econometrics texts.  In the interests of full disclosure, we receive no commissions if any readers decide to buy them!

  • Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, The Econometrics of Financial Markets, Princeton University Press, 1997.
  • Goldberger, Arthur S.  A Course in Econometrics, Harvard University Press, 1991.
  • Hamilton, James D.  Time Series Analysis, Princeton University Press, 1994.
  • Johnston, J.  Econometric Methods, McGraw-Hill, 1972.
  • Maddala, G. S.  Introduction to Econometrics, third edition, John Wiley & Sons, 2005.
  • Stock, James H. and Mark W. Watson, Introduction to Econometrics, second edition, Pearson/Addison Wesley, 2007.
  • Studenmund, A. H.  Using Econometrics: A Practical Guide, Addison-Wesley Educational Publishers, 1997.
  • Theil, Henri.  Principles of Econometrics, John Wiley & Sons, 1971.
  • Wooldridge, Jeffrey M.  Econometric Analysis of Cross Section and Panel Data, The MIT Press, 2002.

We’ve selected the following quotes on multi-collinearity from the texts above:

From Goldberger, page 246:

  • “The least squares estimate is still the minimum variance linear unbiased estimator, its standard error is still correct and the conventional confidence interval and hypothesis tests are still valid.”
  • “So the problem of multicollinearity when estimating a conditional expectation function in a multivariate population is quite parallel to the problem of small sample size when estimating the expectation of a univariate population.  But researchers faced with the latter problem do not usually dramatize the situation, as some appear to do when faced with multi-collinearity”
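Goldberger's parallel can be made concrete with two standard textbook variance formulas. This is a sketch under the usual assumption of independent, homoskedastic errors with variance \sigma^2; the notation is ours rather than Goldberger's. Here R_j^2 denotes the R-squared from regressing explanatory variable j on the other explanatory variables.

```latex
% Sample mean of n observations (the univariate case):
\operatorname{Var}(\bar{y}) = \frac{\sigma^2}{n}

% OLS coefficient on regressor j (the multivariate case):
\operatorname{Var}(\hat{\beta}_j \mid X)
  = \frac{\sigma^2}{\left(1 - R_j^2\right) \sum_{i=1}^{n} \left(x_{ij} - \bar{x}_j\right)^2}
```

In both cases the variance shrinks as n grows. Correlation among the explanatory variables enters only through the factor (1 - R_j^2), which acts like a reduction in the effective sample size rather than a source of bias.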

From Johnston, page 164:

  • “If multicollinearity proves serious in the sense that estimated parameters have an unsatisfactorily low degree of precision, we are in the statistical position of not being able to make bricks without straw.  The remedy lies essentially in the acquisition, if possible, of new data or information, which will break the multicollinearity deadlock.”

From Maddala, page 267:

  • “…Multicollinearity is one of the most misunderstood problems in multiple regression…there have been several measures for multicollinearity suggested in the literature (variance-inflation factors VIF, condition numbers, etc.).  This chapter argues that all these are useless and misleading.  They all depend on the correlation structure of the explanatory variables only…high inter-correlations among the explanatory variables are neither necessary nor sufficient to cause the multicollinearity problem.  The best indicators of the problem are the t-ratios of the individual coefficients. This chapter also discusses the solution offered for the multicollinearity problem, such as ridge regression, principal component regression, dropping of variables, and so on, and shows they are ad hoc and do not help.  The only solutions are to get more data or to seek prior information.”

From Stock and Watson, page 249:

  • “Imperfect multicollinearity means that two or more of the regressors are highly correlated, in the sense that there is a linear function of the regressors that is highly correlated with another regressor.  Imperfect multicollinearity does not pose any problems for the theory of the OLS estimators; indeed, a purpose of OLS is to sort out the independent influences of the various regressors when these regressors are potentially correlated.”

From Studenmund, page 264:

  • “The major consequences of multicollinearity are
  1. Estimates will remain unbiased…
  2. The variances and standard errors of the estimates will increase…
  3. The computed t-scores will fall…
  4. Estimates will become very sensitive to changes in specification…
  5. The overall fit of the equation and the estimation of non-multicollinear variables will be largely unaffected…”
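The first two consequences on Studenmund's list are easy to check by simulation. The sketch below is our own illustration, not taken from any of the texts: the true coefficients (1, 2, 3), the sample size, and the correlation levels are all arbitrary assumptions. It fits the same linear regression many times, once with uncorrelated regressors and once with a correlation of 0.95, and compares the average and the spread of the estimated coefficient on the first regressor.

```python
import numpy as np

def simulate_ols(rho, n_obs=200, n_trials=2000, seed=0):
    """Monte Carlo: fit y = 1 + 2*x1 + 3*x2 + noise by OLS many times.

    Returns the mean and standard deviation, across trials, of the
    estimated coefficient on x1 (true value 2.0).
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])  # corr(x1, x2) = rho
    estimates = np.empty(n_trials)
    for t in range(n_trials):
        x = rng.multivariate_normal(np.zeros(2), cov, size=n_obs)
        y = 1.0 + 2.0 * x[:, 0] + 3.0 * x[:, 1] + rng.normal(size=n_obs)
        design = np.column_stack([np.ones(n_obs), x])  # add intercept
        beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
        estimates[t] = beta_hat[1]
    return estimates.mean(), estimates.std(ddof=1)

mean_low, sd_low = simulate_ols(rho=0.0)
mean_high, sd_high = simulate_ols(rho=0.95)
print(f"rho=0.00: mean={mean_low:.3f}, sd={sd_low:.3f}")
print(f"rho=0.95: mean={mean_high:.3f}, sd={sd_high:.3f}")
```

In both cases the average estimate should sit close to the true value of 2.0, while the spread of the estimates is noticeably wider when the regressors are highly correlated.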

From Theil, page 154:

  • “The situation of multi-collinearity (both extreme and near-extreme) implies for the analyst that he is asking more than his data are able to answer.”

Our experience in default modeling at Kamakura is that the amount of data overwhelms the number of potential explanatory variables, so multi-collinearity is almost never a problem.  We have more than 2 million observations and more than 2,000 defaults in our listed company model.  For mortgage models, data is available on more than 70,000,000 mortgages in the United States, and it is easy to determine which variables are statistically significant.
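To see why a large data set makes correlated explanatory variables harmless, consider the following toy sketch. It is emphatically not Kamakura's default model: it uses a plain linear regression on simulated data, and the coefficients (0.5 each), the correlation of 0.95, and the sample sizes are assumptions chosen only for illustration. The correlation is held fixed while the number of observations grows.

```python
import numpy as np

def ols_with_se(n_obs, rho=0.95, seed=0):
    """Fit y = 1 + 0.5*x1 + 0.5*x2 + noise once by OLS.

    Returns the coefficient estimates and their conventional
    (homoskedastic) standard errors. Illustrative values only.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])  # corr(x1, x2) = rho
    x = rng.multivariate_normal(np.zeros(2), cov, size=n_obs)
    y = 1.0 + 0.5 * x[:, 0] + 0.5 * x[:, 1] + rng.normal(size=n_obs)
    design = np.column_stack([np.ones(n_obs), x])  # add intercept
    xtx_inv = np.linalg.inv(design.T @ design)
    beta_hat = xtx_inv @ design.T @ y
    resid = y - design @ beta_hat
    sigma2 = resid @ resid / (n_obs - design.shape[1])  # error variance
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta_hat, se

results = {}
for n_obs in (50, 5_000, 500_000):
    beta_hat, se = ols_with_se(n_obs)
    results[n_obs] = (beta_hat[1], se[1])
    t_ratio = beta_hat[1] / se[1]
    print(f"n={n_obs:>7}: b1={beta_hat[1]:.3f}, se={se[1]:.4f}, t={t_ratio:.1f}")
```

Even with a correlation of 0.95 between the two regressors, the standard error falls roughly in proportion to the square root of the sample size, so with enough observations the t-ratio leaves little doubt about statistical significance.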

Comments and questions are welcome at info@kamakuraco.com.  I reserve the right to involve my colleagues Professor Robert A. Jarrow, Professor Jens Hilscher, and Sean Klein, senior research fellow, in answering any hard questions.  For real-time risk management commentary, follow Kamakura on Twitter at www.twitter.com/dvandeventer.

Donald R. van Deventer
Kamakura Corporation
Honolulu, May 26, 2009

 
