the part the model can explain, ESS, and the part the model cannot, RSS. These sums can be used to compute R²:

      R² = ESS/TSS = 1 − RSS/TSS    (3.91)

      As promised, when there are no residual errors, that is, when RSS is zero, R² is one. Likewise, when ESS is zero, or equivalently when the variation in the errors equals TSS, R² is zero. It turns out that for the univariate linear regression model, R² is also equal to the squared correlation between X and Y. If X and Y are perfectly correlated, ρ_XY = 1, or perfectly negatively correlated, ρ_XY = −1, then R² will equal one.
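      To make the decomposition concrete, the following short Python sketch (the data are simulated and purely illustrative) fits a univariate OLS line, computes TSS, ESS, and RSS, and confirms Equation 3.91 as well as the equality between R² and the squared correlation:

import numpy as np

# Simulated data, purely illustrative.
rng = np.random.default_rng(42)
x = rng.normal(size=100)
y = 0.5 + 1.2 * x + rng.normal(scale=0.8, size=100)

# Univariate OLS fit: y_hat = alpha + beta * x
beta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()
y_hat = alpha + beta * x

tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
rss = np.sum((y - y_hat) ** 2)         # residual sum of squares

r_squared = ess / tss
assert np.isclose(tss, ess + rss)                           # TSS = ESS + RSS
assert np.isclose(r_squared, 1 - rss / tss)                 # Equation 3.91
assert np.isclose(r_squared, np.corrcoef(x, y)[0, 1] ** 2)  # squared correlation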

      Estimates of the regression parameters, just like the parameter estimates we examined earlier, are subject to hypothesis testing. In regression analysis, the most common null hypothesis is that the slope parameter, β, is zero. If β is zero, then the regression model does not explain any variation in the regressand.

      In finance, we often want to know if α is significantly different from zero, but for different reasons. In modern finance, alpha has become synonymous with the ability of a portfolio manager to generate excess returns. This is because, in a regression modeling the returns of a portfolio manager, after we remove all the randomness, ϵ, and the influence of the explanatory variable, X, if α is still positive, this suggests that the portfolio manager is producing positive excess returns, something that should be very difficult in efficient markets. Of course, it is not enough for α to be positive; it must be both positive and statistically significant.
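      As a rough sketch of how such a test might be run in practice (the return series below are simulated, and the use of statsmodels is an assumption, not something prescribed by the text), we can regress a manager's returns on an index and inspect the t-statistic of the constant:

import numpy as np
import statsmodels.api as sm

# Simulated monthly log returns; stand-ins for real fund and index data.
rng = np.random.default_rng(7)
market = rng.normal(0.005, 0.04, size=120)                   # 10 years of months
fund = 0.002 + 0.6 * market + rng.normal(0, 0.03, size=120)  # true alpha = 0.2%/month

X = sm.add_constant(market)        # column of ones for alpha, plus the index
results = sm.OLS(fund, X).fit()

alpha_hat, beta_hat = results.params
alpha_t, beta_t = results.tvalues
print(f"alpha = {alpha_hat:.4f} (t = {alpha_t:.2f})")
print(f"beta  = {beta_hat:.2f} (t = {beta_t:.2f})")
# A positive and statistically significant alpha is the usual evidence of excess returns.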

      Sample Problem

      Question:

      As a risk manager and expert on statistics, you are asked to evaluate the performance of a long/short equity portfolio manager. You are given 10 years of monthly return data. You regress the log returns of the portfolio manager against the log returns of a market index.

      Assume both series are normally distributed and homoscedastic. From this analysis, you obtain the following regression results:

      What can we say about the performance of the portfolio manager?

      Answer:

      The R² for the regression is low. Only 8.11 percent of the variation in the portfolio manager's returns can be explained by the model, that is, by the constant, beta, and variation in the market. The rest is idiosyncratic risk, which is unexplained by the model.

      That said, both the constant and the beta seem to be statistically significant (i.e., they are statistically different from zero). We can get the t-statistic for each by dividing the value of the coefficient by its standard error.

      For beta, this works out to a t-statistic of 2.10. Using a statistical package, we can calculate the corresponding probability associated with each t-statistic. This should be a two-tailed test with 118 degrees of freedom (10 years × 12 months per year = 120 observations, minus 2 parameters). We can reject the hypothesis that the constant and the slope are zero at the 2 percent and 4 percent levels, respectively. In other words, there seems to be a significant market component to the fund manager's return, but the manager is also generating statistically significant excess returns.
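      This probability calculation can be verified with a few lines of Python (the t-statistic of 2.10 and the 118 degrees of freedom come from the problem; scipy is simply one convenient statistical package):

from scipy.stats import t

df = 10 * 12 - 2    # 120 monthly observations minus 2 estimated parameters
t_beta = 2.10       # t-statistic for beta, from the problem

# Two-tailed test: probability mass in both tails beyond |t|.
p_beta = 2 * t.sf(t_beta, df)
print(f"p-value for beta: {p_beta:.3f}")   # approximately 0.04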

      Linear Regression (Multivariate)

      Univariate regression models are extremely common in finance and risk management, but sometimes we require a slightly more complicated model. In these cases, we might use a multivariate regression model. The basic idea is the same, but instead of one regressand and one regressor, we have one regressand and multiple regressors. Our basic equation will look something like:

      Y = β₁ + β₂X₂ + β₃X₃ + … + βₙXₙ + ϵ    (3.92)

      Notice that rather than denoting the first constant with α, we chose to go with β₁. This is the more common convention in multivariate regression. To make the equation even more regular, we can assume that there is an X₁ which, unlike the other X's, is constant and always equal to one. This convention allows us to easily express a set of observations in matrix form. For t observations and n regressors, we could write:

      ⎡y₁⎤   ⎡x₁₁ x₁₂ ⋯ x₁ₙ⎤ ⎡β₁⎤   ⎡ϵ₁⎤
      ⎢y₂⎥ = ⎢x₂₁ x₂₂ ⋯ x₂ₙ⎥ ⎢β₂⎥ + ⎢ϵ₂⎥    (3.93)
      ⎢⋮ ⎥   ⎢ ⋮   ⋮  ⋱  ⋮ ⎥ ⎢⋮ ⎥   ⎢⋮ ⎥
      ⎣yₜ⎦   ⎣xₜ₁ xₜ₂ ⋯ xₜₙ⎦ ⎣βₙ⎦   ⎣ϵₜ⎦

      where the first column of the X matrix, x₁₁, x₂₁, …, xₜ₁, is understood to consist entirely of ones. The entire equation can be written more succinctly as:

      y = Xβ + ϵ    (3.94)

      where, as before, we have used bold letters to denote matrices.
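      Under the OLS assumptions, Equation 3.94 leads to the standard estimator β̂ = (XᵀX)⁻¹Xᵀy. The following numpy sketch (with made-up regressors) builds the design matrix, including the constant column of ones described above, and solves for the coefficients:

import numpy as np

# Made-up regressors and coefficients, for illustration only.
rng = np.random.default_rng(0)
t_obs = 200
x2 = rng.normal(size=t_obs)
x3 = rng.normal(size=t_obs)
y = 1.0 + 2.0 * x2 - 0.5 * x3 + rng.normal(scale=0.5, size=t_obs)

# Design matrix X: the first column is all ones (the X1 convention above).
X = np.column_stack([np.ones(t_obs), x2, x3])

# beta_hat = (X'X)^(-1) X'y, computed with a numerically stable solver.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # approximately [1.0, 2.0, -0.5]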

      MULTICOLLINEARITY

      In order to determine the parameters of the multivariate regression, we again turn to our OLS assumptions. In the multivariate case, the assumptions are the same as before, but with one addition: we require that all of the independent variables be linearly independent of each other. We say that the independent variables must lack multicollinearity:

      (A7) The independent variables have no multicollinearity.

      To say that the independent variables lack multicollinearity means that it is impossible to express one of the independent variables as a linear combination of the others.

      This additional assumption is required to remove ambiguity. To see why this is the case, imagine that we attempt a regression with two independent variables, where the second independent variable, X₃, can be expressed as a linear function of the first independent variable, X₂:

      Y = β₁ + β₂X₂ + β₃X₃ + ϵ₁
      X₃ = λ₁ + λ₂X₂ + ϵ₂    (3.95)

      If we substitute the second line of Equation 3.95 into the first, we get:

      Y = β₁ + β₂X₂ + β₃(λ₁ + λ₂X₂ + ϵ₂) + ϵ₁
      Y = (β₁ + β₃λ₁) + (β₂ + β₃λ₂)X₂ + (β₃ϵ₂ + ϵ₁)
      Y = β₄ + β₅X₂ + ϵ₃    (3.96)

      In the second line, we have simplified by introducing new constants and a new error term: we have replaced (β₁ + β₃λ₁) with β₄, (β₂ + β₃λ₂) with β₅, and (β₃ϵ₂ + ϵ₁) with ϵ₃. β₅ can be uniquely determined in a univariate regression, but there is an infinite number of combinations of β₂, β₃, and λ₂ that we could choose to equal β₅. If β₅ = 10, any of the following combinations would work:

      β₂ = 10,  β₃ = 0,   λ₂ = 0
      β₂ = 0,   β₃ = 10,  λ₂ = 1
      β₂ = 20,  β₃ = −5,  λ₂ = 2    (3.97)

      This is why we say that β₂ and β₃ are ambiguous in the initial equation.

      Even in the presence of multicollinearity, the regression model still works in a sense. In the preceding example, even though β₂ and β₃ are ambiguous, any combination where (β₂ + β₃λ₂) equals β₅ will produce the same value of Y for a given set of X's. If our only objective is to predict Y, then the regression model still works. The problem is that the values of the parameters will be unstable: a slightly different data set can cause wild swings in the parameter estimates, and may even flip their signs. A variable that we expect to be positively correlated with the regressand may end up with a large negative beta. This makes interpreting the model difficult. Parameter instability is often a sign of multicollinearity.
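      The following Python sketch (all values simulated and illustrative) makes this instability visible: when X₃ is nearly a linear function of X₂, a tiny perturbation of the data produces very different estimates of β₂ and β₃, even though the identified combination β₂ + λ₂β₃ barely moves:

import numpy as np

rng = np.random.default_rng(1)
n = 100
x2 = rng.normal(size=n)
x3 = 2.0 * x2 + rng.normal(scale=1e-4, size=n)   # X3 is almost exactly 2 * X2
y = 3.0 + 1.0 * x2 + 1.0 * x3 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x2, x3])

def fit(y_sample):
    beta, *_ = np.linalg.lstsq(X, y_sample, rcond=None)
    return beta

b_original = fit(y)
b_perturbed = fit(y + rng.normal(scale=0.01, size=n))   # slightly different data

# beta_2 and beta_3 can swing wildly between the two fits, and may even flip
# sign, while the combination beta_2 + 2 * beta_3 stays close to 3 in both.
print(b_original)
print(b_perturbed)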

      There is no well-accepted procedure for dealing with multicollinearity. The easiest course of action is often simply to eliminate a variable from the regression. While easy, this is hardly satisfactory.
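      One common diagnostic, though not one the discussion above prescribes, is the variance inflation factor (VIF): regress each independent variable on the others and compute 1/(1 − R²), where large values warn of multicollinearity. A self-contained numpy sketch:

import numpy as np

def vif(X):
    # VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing column j of X
    # on all of the other columns (plus a constant). Values far above ~10 are
    # a common, if informal, warning sign of multicollinearity.
    n, k = X.shape
    factors = []
    for j in range(k):
        y_j = X[:, j]
        X_j = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(X_j, y_j, rcond=None)
        resid = y_j - X_j @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((y_j - y_j.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Two nearly collinear regressors produce extreme VIFs.
rng = np.random.default_rng(2)
x2 = rng.normal(size=100)
x3 = 2.0 * x2 + rng.normal(scale=1e-2, size=100)
print(vif(np.column_stack([x2, x3])))   # both values are very large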

      Another
