# Dataset for Y, X, and Z that includes 90 observations

The following is your exam #2, given as a take-home. It is due Monday, December 14th. Please do not write your answers on this paper; submit them in a separate Excel (preferred) or Word document. Be clear and concise, but show enough work that partial credit can be awarded if necessary. Good luck.

1. Suppose you have a dataset for Y, X, and Z that includes 90 observations. You decide to estimate the following model: How many degrees of freedom will you have for this estimation?
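As a reminder of the general rule behind this question, the residual degrees of freedom of an OLS regression equal the number of observations minus the number of estimated parameters. The sketch below is only illustrative: the function name `residual_degrees_of_freedom` and the example specification (Y regressed on X and Z with an intercept) are assumptions, not the model given in the exam.

```python
def residual_degrees_of_freedom(n_obs: int, n_slopes: int, has_intercept: bool = True) -> int:
    """Residual degrees of freedom: observations minus estimated parameters."""
    n_params = n_slopes + (1 if has_intercept else 0)
    return n_obs - n_params


# Hypothetical specification (an assumption, not the exam's model):
# Y regressed on X and Z with an intercept, so 3 estimated parameters.
print(residual_degrees_of_freedom(90, 2))  # 87
```

Any additional terms in the actual model (squares, interactions, extra dummies) each count as one more estimated parameter.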

2. Suppose that you are studying the factors that explain a person’s salary. Your sample is drawn from workers between the ages of 15 and 90. You estimate the following regression:

   Where:
   - Salary is measured in US dollars,
   - Educ is the number of years of education,
   - Female is a dummy variable equal to 1 for females, 0 otherwise, and
   - Age is the person’s age.

   a) What is the predicted salary of a 20-year-old male with 10 years of education?

   b) Consider the Age variable. According to the estimated model, at what age do people earn maximum salaries?

   c) Consider the Educ variable. Suppose you decide to create three new dummy variables:
   - Educ1 = 1 for those with 0-12 years of education, 0 otherwise
   - Educ2 = 1 for those with 13-16 years of education, 0 otherwise
   - Educ3 = 1 for those with 17 or more years of education, 0 otherwise

   Then you estimate:

   Provide an interpretation for the Educ variable coefficient in Equation (1). Then, provide an interpretation for the education dummy variable coefficients in Equation (2). Finally, provide some intuition for how a researcher could choose between estimating Equation (1) and Equation (2). In other words, what assumptions regarding education are built into each of the models?
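Part b) asks for a salary-maximizing age, which is only well defined if Age enters the model nonlinearly. A minimal sketch, assuming the regression contains both Age and Age squared: setting the derivative of `b_linear * Age + b_squared * Age**2` to zero gives the turning point. The function name and the coefficient values below are hypothetical and are not taken from the exam's estimates.

```python
def turning_point(b_linear: float, b_squared: float) -> float:
    """Age at which b_linear*Age + b_squared*Age**2 reaches its turning point.

    The turning point is a maximum only when b_squared is negative.
    """
    if b_squared == 0:
        raise ValueError("No turning point: the squared term has a zero coefficient.")
    return -b_linear / (2.0 * b_squared)


# Hypothetical coefficients (assumptions for illustration only).
print(turning_point(1200.0, -15.0))  # 40.0
```

It is worth checking that the resulting age falls inside the sampled range of 15 to 90 before interpreting it.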

3. Consider the following two regression models (on the next page). There are data for four different industries (iron, rubber, stone, textile), for three different variables (ship, employ, overtime). The data are available for the years 1980 to 2000.

   a) Calculate the value of (a) in the Model 1 table.

   b) Calculate the value of (b) in the Model 1 table.

   c) Calculate the value of (c) in the Model 1 table.

   d) Calculate the value of (d) in the Model 1 table.

   e) Calculate the value of (e) in the Model 1 table.

   f) Find the value of (f) in the Model 2 table.

   g) The RSS (residual sum of squares) is called “Sum squared resid” in the output tables. For Model 1, calculate the TSS (total sum of squares). As long as you write out the formula that includes all of the relevant numbers (not just variable names), you do not need to make all of the calculations.

   h) Calculate the 90% confidence interval for STONE in Model 1. As long as you write out the formula that includes all of the relevant numbers, you do not need to make all of the calculations. The last page of the exam has a t-distribution table.

   i) Provide a brief description of how you would test to determine whether a pooled OLS regression is suitable, or whether it would be better to use a fixed effects model.

   j) What do you expect would happen if you try to estimate the following regression?
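The standard formulas behind parts g), h), and i) can be sketched as follows. This is a generic illustration, not the exam's solution: it assumes the output table reports R-squared alongside “Sum squared resid”, that the critical t-value is read from the t-distribution table, and that the pooled-versus-fixed-effects comparison is done with the usual restricted/unrestricted F-test. All function names and numbers are hypothetical.

```python
def tss_from_rss(rss: float, r_squared: float) -> float:
    """Total sum of squares from R^2 = 1 - RSS/TSS, i.e. TSS = RSS / (1 - R^2)."""
    return rss / (1.0 - r_squared)


def confidence_interval(beta_hat: float, std_err: float, t_crit: float) -> tuple:
    """Two-sided interval: beta_hat +/- t_crit * std_err.

    t_crit comes from a t-table at the residual degrees of freedom;
    for a 90% interval, use the column with 5% in each tail.
    """
    margin = t_crit * std_err
    return (beta_hat - margin, beta_hat + margin)


def f_statistic(rss_restricted: float, rss_unrestricted: float,
                n_restrictions: int, df_unrestricted: int) -> float:
    """F = [(RSS_r - RSS_u) / q] / [RSS_u / df_u].

    Here the restricted model is pooled OLS and the unrestricted model
    includes the industry fixed effects.
    """
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / df_unrestricted
    return numerator / denominator


# Hypothetical numbers, chosen only to show the arithmetic.
print(tss_from_rss(50.0, 0.75))
print(confidence_interval(2.0, 0.5, 2.0))
print(f_statistic(120.0, 100.0, 4, 50))
```

A large F-statistic relative to the critical value rejects the pooled specification in favor of fixed effects.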