A frequent question is whether the estimated coefficient is statistically different from zero. If the confidence interval does not include zero, then the coefficient is said to be statistically different from zero (with a specified level of confidence).
To test a hypothesis concerning the slope coefficient (e.g., to see whether the estimated slope is equal to a hypothesized value, say b1), we calculate a t-distributed statistic: the estimated slope minus the hypothesized slope, divided by the standard error of the slope.
If the t-statistic is greater than the critical t-value for the appropriate df (or less than the negative of the critical t-value), we can say that the slope coefficient is different from the hypothesized value, b1.
To test whether an independent variable explains the variation in the dependent variable, the hypothesis that is tested is whether the slope is zero:
H0: b1 = 0 versus HA: b1 ≠ 0 (the alternative is what you conclude if you reject the null).
o: Formulate a null and an alternative hypothesis about a population value of a regression coefficient and determine whether the null hypothesis is rejected at a given level of significance.
Example: Suppose the estimated slope coefficient is 0.78, the sample size is 26, the standard error of the coefficient is 0.32, and the level of significance is 5%. Is the slope different from zero?
The calculated test statistic is: tb = (0.78 - 0) / 0.32 = 2.4375.
The critical t-values are ±2.064 (from the t-table with 26 - 2 = 24 df). Because 2.4375 > 2.064, we reject the null hypothesis and conclude that the slope is different from zero. Note that if we had formed a confidence interval (i.e., 0.78 ± 0.32 × 2.064), zero would not have been included in the interval. The hypothesis test and the confidence interval always lead to the same conclusion.
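For concreteness, this test can be reproduced in a few lines of Python (a minimal sketch, assuming scipy is available; the variable names are ours):

```python
# Two-tailed t-test on the slope coefficient; numbers from the example above.
from scipy import stats

b1_hat = 0.78            # estimated slope coefficient
b1_null = 0.0            # hypothesized value under H0
se_b1 = 0.32             # standard error of the coefficient
n = 26                   # sample size
df = n - 2               # degrees of freedom in simple regression

t_stat = (b1_hat - b1_null) / se_b1          # 2.4375
t_crit = stats.t.ppf(1 - 0.05 / 2, df)       # ~2.064 (two-tailed, 5%)
reject = abs(t_stat) > t_crit                # True -> reject H0

# The equivalent confidence-interval check: zero falls outside the interval.
ci = (b1_hat - t_crit * se_b1, b1_hat + t_crit * se_b1)
print(t_stat, t_crit, reject, ci)
```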
p: Interpret a regression coefficient.
Interpretation of coefficients:
The estimated intercept is interpreted as the value of the dependent variable (the Y) if the independent variable (the X) takes on a value of zero.
The estimated slope coefficient is interpreted as the change in the dependent variable for a given one-unit change in the independent variable.
Any conclusions regarding the importance of an independent variable in explaining a dependent variable require determining the statistical significance of the slope coefficient. Simply looking at the magnitude of the slope coefficient does not address the importance of the variable (i.e., you must perform a hypothesis test or construct a confidence interval to assess the variable's importance).
q: Calculate a predicted value for the dependent variable, given an estimated regression model and a value for the independent variable.
Forecasting using regression involves making predictions about the dependent variable based on average relationships observed in the estimated regression. Predicted values are values of the dependent variable based on the estimated regression coefficients and a prediction about the values of the independent variables. For a simple regression, the value of Y is predicted as:
Y = b0 + b1Xp
Where Y is the predicted value of the dependent variable and Xp is the predicted value of the independent variable (input).
Example: Suppose you estimate a regression model with the following parameters:
Y = 1.50 + 2.5 X1
In addition, you have forecasted the value of the independent variable to be 20 (i.e., X1 = 20). What is the forecasted value of the Y variable?
Y = 1.50 + 2.50(20) = 1.50 + 50 = 51.5
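A minimal sketch of this calculation in Python (the function name is ours):

```python
def predict(b0: float, b1: float, x: float) -> float:
    """Predicted value of the dependent variable from a simple regression."""
    return b0 + b1 * x

print(predict(1.50, 2.50, 20))   # 1.50 + 50.0 = 51.5
```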
r: Calculate and interpret a confidence interval for the predicted value of a dependent variable.
Confidence intervals on the predicted value of a dependent variable are calculated in a manner similar to the confidence interval on the coefficient. The hard part about confidence intervals on the dependent variable is calculating the standard error of the forecast (sf). The equation is:
Y ± tc × sf
The standard error of the forecast, sf, is larger than the standard error of the regression, se. It's unlikely that you will have to calculate the standard error of the forecast. However, if you do need to calculate the variance of the forecast, the formula is:
sf2 = se2 × [1 + (1 / n) + (X - X bar)2 / ((n - 1) × sx2)]
Where se2 is the variance of the regression (i.e., SEE2) and sx2 is the variance of the independent variable.
Example: Suppose an analyst generates the following regression results:
Y = 0.01 + 1.2X
SEE = 0.23 (square to get se2), sx = 0.16 (square to get sx2), n = 32, and X bar = 0.06.
Calculate the value of the dependent variable given that the forecast value of X is 0.05. Calculate a confidence interval on the forecasted value. Use a significance level of 5%.
Y = 0.01 + 1.2(0.05) = 0.07
Using a 5% significance level, the critical t-value is 2.042 (t-table, 32-2 = 30 df). The variance of the forecast is:
sf2 = 0.0529 × [1 + (1 / 32) + (0.05 - 0.06)2 / ((32 - 1) × 0.0256)] = 0.05456.
The standard error of the forecast is √0.05456 = 0.23358. Hence, the prediction interval is 0.07 ± 2.042 × 0.23358, or {-0.40697 < Y < 0.54697}.
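The whole example can be reproduced with a short Python sketch (assuming scipy; the variable names are ours):

```python
# Prediction interval for a forecasted value of Y; numbers from the example.
from math import sqrt
from scipy import stats

b0, b1 = 0.01, 1.2
see, sx = 0.23, 0.16          # SEE and std. deviation of the independent variable
n, x_bar = 32, 0.06
x_forecast = 0.05

y_hat = b0 + b1 * x_forecast  # 0.07

# sf2 = se2 * [1 + 1/n + (X - X bar)^2 / ((n - 1) * sx^2)]
sf2 = see**2 * (1 + 1 / n + (x_forecast - x_bar)**2 / ((n - 1) * sx**2))
sf = sqrt(sf2)                               # ~0.23358

t_crit = stats.t.ppf(0.975, n - 2)           # ~2.042 at 30 df
interval = (y_hat - t_crit * sf, y_hat + t_crit * sf)
print(interval)                              # approximately (-0.40697, 0.54697)
```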
s: Describe the use of analysis of variance (ANOVA) in regression analysis.
An ANOVA table is a summary of the explanation of the variation in the dependent variable and is included in the regression output of many statistical software packages. You can think of the ANOVA table as the source of the data for the computation of many of the concepts discussed in this summary. For instance, the data to compute the R2 and the standard error of the estimate (SEE) come from the ANOVA table. There are many ways the data from this table can be used in the statistical inference process (most beyond the scope of the CFA curriculum).
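As a rough illustration of where the ANOVA quantities come from, here is a minimal Python sketch for a simple regression; the sample data are invented for this example:

```python
# Build the ANOVA sums of squares for a simple regression by hand.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # made-up sample data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])
n, k = len(y), 1

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # OLS slope
b0 = y.mean() - b1 * x.mean()                          # OLS intercept
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression (explained) sum of squares
sse = np.sum((y - y_hat) ** 2)         # error (residual) sum of squares

r2 = ssr / sst                         # R2, as read off the ANOVA table
see = np.sqrt(sse / (n - k - 1))       # standard error of the estimate
print(r2, see)
```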
t: Define and interpret an F-statistic.
The F-statistic is used in hypothesis testing. Though it can be used in simple regression, it is more often used to test hypotheses involving more than one independent variable. The most common application is for the test of the significance of the entire set of independent variables:
H0: b1 = b2 = b3 = ... = bk = 0
HA: at least one beta different from zero
The F-statistic is used to test whether at least one independent variable in the set of independent variables explains a significant portion of the variation of the dependent variable. This is a goodness of fit test.
The F-statistic is a measure of how well the independent variables, as a group, explain the variation in the dependent variable. It is calculated with the following formula:
F = MSR / MSE = (SSR / k) / (SSE / (n - k - 1))
where MSR is the mean square regression and MSE is the mean square error. To determine whether an F-statistic is statistically significant, we compare the calculated F-statistic with the critical F-value for k (numerator) and n - k - 1 (denominator) degrees of freedom (n = number of observations, k = number of slope coefficients).
In a simple regression, the F-statistic is equal to the squared t-statistic of the slope coefficient. So, for regression with only one independent variable, the F-statistic is redundant.
The analysis of the F-statistic is similar to that of the t-statistic, except you use the F-table, and you need to worry about the degrees of freedom in both the numerator and denominator of the previous equation. The numerator df is the number of independent variables, k. The denominator df is [n - (number of independent variables + 1)], or n - k - 1.
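A minimal sketch of the F-test decision rule in Python (assuming scipy; the sums of squares below are illustrative placeholders, as would be read from an ANOVA table):

```python
# F-test for the joint significance of the slope coefficients.
from scipy import stats

ssr, sse = 120.0, 80.0    # illustrative regression and error sums of squares
n, k = 32, 1              # observations and slope coefficients

msr = ssr / k             # mean square regression
mse = sse / (n - k - 1)   # mean square error
f_stat = msr / mse

f_crit = stats.f.ppf(0.95, dfn=k, dfd=n - k - 1)   # one-tailed, 5%
print(f_stat, f_crit, f_stat > f_crit)

# In simple regression (k = 1), f_stat equals the squared t-statistic of
# the slope, so the F-test adds no new information there.
```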
u: Discuss the limitations of regression analysis.
Limitations of regression analysis:
Regression relations change over time. This is referred to as nonstationarity.
If the assumptions of regression analysis are not valid, the interpretation and tests of hypotheses are not valid. For example, if the data is heteroskedastic (non-constant variance of the error terms) or exhibits autocorrelation (error terms are not independent), then it is very difficult to use the regression to forecast the dependent variable given information about the independent variables.