Question:

Part I: Import and Validate Data

Fill in the following table based on ‘Payroll Data_.xlsx’. For this question, use the tab based on the ‘payroll_data.txt’ data.

Questions                      Your Answer
How many rows are there?
How many columns?
What is the average salary?

Identify and address two potential data quality issues for the ‘Payroll’ tab of the generated Excel file.

Part II: Perform the Transformation of the Data

Create a new column and calculate total compensation (salary plus overtime pay, treating missing overtime pay as zero) in the ‘Payroll’ tab. What is the average total compensation? State any assumptions you make in the calculation.

Add a column and calculate the employee’s tenure at the company in years as of June 30, 2020. Round to two decimal places.
Take a screenshot of the first 10 observations (the screenshot must include at least the employee ID and tenure variables).

Create a binary column (0/1) that identifies employees of the ‘Tasty’ operating units. Name the variable dummy tasty. Take a screenshot of the first 10 observations (must include at least employee ID and the binary column).

Add columns for Job Position and Gender to the ‘Payroll’ tab from the ‘Reference Data’ tab.
Take a screenshot of the first 10 observations (must include at least employee ID, job position and gender) and paste it below together with the Excel formula you used for the calculation.

Part III: Conduct the Analysis

What is the average percentage of overtime pay in total compensation, only for employees who work overtime (i.e., those with overtime pay greater than zero)?

Create a pivot table with the average salary by Operating Unit and State and add conditional formatting.

Create a linear regression model using tenure to predict total compensation. What is the adjusted R-squared value?
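
The assignment expects these steps to be done in Excel, but the Part II transformations can be cross-checked with a short pandas sketch. Everything below is an assumption made for illustration: the data is synthetic, the column names (`employee_id`, `operating_unit`, `salary`, `overtime_pay`, `hire_date`, `job_position`, `gender`) are hypothetical, the binary column is spelled `dummy_tasty`, a year is taken as 365.25 days, and a ‘Tasty’ unit is assumed to be one whose name contains the word "Tasty". The real workbook may differ on each point.

```python
import pandas as pd

# Synthetic stand-in for the 'Payroll' tab. All names and values are
# hypothetical placeholders, not data from the real workbook.
payroll = pd.DataFrame({
    "employee_id": [101, 102, 103, 104],
    "operating_unit": ["Tasty East", "Fresh West", "Tasty South", "Fresh North"],
    "salary": [50000.0, 60000.0, 55000.0, 70000.0],
    "overtime_pay": [2000.0, None, 0.0, 3500.0],  # None models a missing value
    "hire_date": pd.to_datetime(
        ["2015-06-30", "2018-01-15", "2019-12-31", "2010-07-01"]
    ),
})

# Synthetic stand-in for the 'Reference Data' tab (one row per employee).
reference = pd.DataFrame({
    "employee_id": [101, 102, 103, 104],
    "job_position": ["Cook", "Manager", "Cashier", "Driver"],
    "gender": ["F", "M", "F", "M"],
})

# Total compensation: salary plus overtime pay, missing overtime treated as 0.
payroll["total_compensation"] = payroll["salary"] + payroll["overtime_pay"].fillna(0)

# Tenure in years as of June 30, 2020, rounded to two decimals.
# Assumption: one year = 365.25 days; Excel's YEARFRAC may give slightly
# different values depending on the day-count basis chosen.
as_of = pd.Timestamp("2020-06-30")
payroll["tenure_years"] = ((as_of - payroll["hire_date"]).dt.days / 365.25).round(2)

# Binary 0/1 flag for 'Tasty' operating units.
# Assumption: the unit name contains the word "Tasty".
payroll["dummy_tasty"] = (
    payroll["operating_unit"].str.contains("Tasty", case=False).astype(int)
)

# Bring in Job Position and Gender, similar in spirit to a lookup formula
# keyed on employee ID. validate= raises if the reference has duplicate IDs.
payroll = payroll.merge(
    reference, on="employee_id", how="left", validate="many_to_one"
)

print(payroll.head(10))
print("Average total compensation:", payroll["total_compensation"].mean())
```

A left merge keeps every payroll row even when an employee ID has no match in the reference data, so unmatched employees show up as blanks rather than silently disappearing.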
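
Although the assignment expects Excel, the Part III analysis can be cross-checked outside the workbook. The following is a minimal sketch on synthetic data, with hypothetical column names (`operating_unit`, `state`, `salary`, `overtime_pay`, `tenure_years`). It assumes "average percentage" means the mean of each employee's own overtime share, which is one reading of the question; the ratio of total overtime to total compensation would be another. Conditional formatting is an Excel feature and is not reproduced here.

```python
import numpy as np
import pandas as pd

# Synthetic data; all names and values are hypothetical placeholders.
df = pd.DataFrame({
    "operating_unit": ["Tasty", "Tasty", "Fresh", "Fresh", "Tasty", "Fresh"],
    "state": ["CA", "NY", "CA", "NY", "CA", "NY"],
    "salary": [50000.0, 60000.0, 55000.0, 70000.0, 52000.0, 66000.0],
    "overtime_pay": [2000.0, 0.0, 0.0, 3500.0, 1000.0, 0.0],
    "tenure_years": [5.0, 2.46, 0.5, 10.0, 3.2, 7.1],
})
df["total_compensation"] = df["salary"] + df["overtime_pay"]

# Average overtime share, restricted to employees with overtime pay > 0.
# Assumption: mean of per-employee ratios, expressed as a percentage.
worked_ot = df[df["overtime_pay"] > 0]
avg_ot_share = (
    worked_ot["overtime_pay"] / worked_ot["total_compensation"]
).mean() * 100

# Pivot table: average salary by operating unit (rows) and state (columns).
pivot = df.pivot_table(
    values="salary", index="operating_unit", columns="state", aggfunc="mean"
)

# Simple linear regression of total compensation on tenure (least squares).
x = df["tenure_years"].to_numpy()
y = df["total_compensation"].to_numpy()
slope, intercept = np.polyfit(x, y, 1)

# R-squared and adjusted R-squared with k = 1 predictor.
residuals = y - (slope * x + intercept)
ss_res = float(np.sum(residuals ** 2))
ss_tot = float(np.sum((y - y.mean()) ** 2))
r_squared = 1 - ss_res / ss_tot
n, k = len(y), 1
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"Average overtime share: {avg_ot_share:.2f}%")
print(pivot)
print(f"Adjusted R-squared: {adj_r_squared:.4f}")
```

The adjusted R-squared formula used here, 1 - (1 - R²)(n - 1)/(n - k - 1), is the standard one for a model with an intercept, so the result should be directly comparable with the figure in Excel's regression output.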