Multiple linear regression is a statistical method used to explore the relationship between a dependent variable and multiple independent variables. The model assumes that the dependent variable is a linear function of the independent variables, and its underlying assumptions must hold for the results to be valid and reliable. In this answer, we discuss the assumptions underlying the multiple linear regression model in detail.
1. Linearity: The first assumption is that there is a linear relationship between the dependent variable and the independent variables. In other words, the expected value of the dependent variable is a linear function of the predictors, so each regression coefficient is constant across all values of the independent variables. If the relationship is not linear, the model may not be appropriate for the data.
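One simple way to probe linearity, sketched here on simulated data, is a RESET-style check: refit the model with the squared fitted values added as an extra regressor. If the true relationship is linear, that extra term should explain almost none of the remaining variation. The data, coefficients, and threshold below are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.5, size=n)

# Fit ordinary least squares with an intercept.
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
resid = y - fitted

# RESET-style check: add the squared fitted values as an extra
# regressor; for a genuinely linear relationship this term should
# reduce the residual sum of squares only negligibly.
X_aug = np.column_stack([X, fitted ** 2])
beta_aug, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
rss_lin = np.sum(resid ** 2)
rss_aug = np.sum((y - X_aug @ beta_aug) ** 2)
improvement = 1 - rss_aug / rss_lin
print(f"relative RSS improvement from squared term: {improvement:.4f}")
```

A relative improvement near zero is consistent with linearity; a sizeable improvement suggests a nonlinear term is missing. In practice a plot of residuals against fitted values is checked alongside this kind of test.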
2. Independence: The second assumption is that the observations are independent of each other. This means that the value of the dependent variable for one observation should not be affected by the value of the dependent variable for another observation. This assumption is important because if the observations are not independent, then the standard errors of the regression coefficients will be underestimated, leading to invalid inferences.
3. Homoscedasticity: The third assumption is that the errors or residuals of the regression have constant variance across all levels of the independent variables. This means that the spread of the residuals should be the same for all values of the independent variables. If the errors are not homoscedastic, the coefficient estimates remain unbiased, but their standard errors are distorted, so hypothesis tests and confidence intervals may be unreliable.
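A common formal check is the Breusch-Pagan test, whose core idea can be sketched with plain numpy: regress the squared residuals on the predictors and form the Lagrange multiplier statistic n·R². The simulated data and the single-predictor setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # constant error variance

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Breusch-Pagan idea: regress the squared residuals on the predictors.
# Under homoscedasticity, the LM statistic n * R^2 is approximately
# chi-squared with (number of predictors) degrees of freedom.
u = resid ** 2
gamma, *_ = np.linalg.lstsq(X, u, rcond=None)
u_hat = X @ gamma
r2 = 1 - np.sum((u - u_hat) ** 2) / np.sum((u - u.mean()) ** 2)
lm = n * r2
print(f"Breusch-Pagan LM statistic: {lm:.3f}")
```

The statistic is compared with a chi-squared critical value (with degrees of freedom equal to the number of predictors); a large value is evidence of heteroscedasticity.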
4. Normality: The fourth assumption is that the errors or residuals of the regression should be normally distributed. This means that the distribution of the residuals should be symmetric and bell-shaped. If the residuals are not normally distributed, then the p-values and confidence intervals may not be accurate, leading to incorrect inferences.
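Normality of the residuals can be screened with the Jarque-Bera statistic, which combines skewness and excess kurtosis and is easy to compute directly. The normally distributed sample below is a stand-in for regression residuals, an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
resid = rng.normal(size=500)  # stand-in for regression residuals

# Jarque-Bera statistic: combines skewness and excess kurtosis.
# Under normality it is approximately chi-squared with 2 degrees
# of freedom, so large values are evidence against normality.
s = (resid - resid.mean()) / resid.std()
skew = np.mean(s ** 3)
kurt = np.mean(s ** 4)
jb = len(resid) / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
print(f"Jarque-Bera: {jb:.3f}")
```

In practice this is used alongside a normal probability (Q-Q) plot of the residuals, which shows where any departure from normality occurs.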
5. No multicollinearity: The fifth assumption is that there is no perfect multicollinearity among the independent variables, i.e., no predictor is an exact linear combination of the others, which would make the coefficients impossible to estimate. Even high but imperfect correlation among the predictors inflates the variance of the coefficient estimates, making them unstable and the interpretation of the regression potentially misleading.
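A standard diagnostic is the variance inflation factor (VIF): regress each predictor on the others and compute 1/(1 − R²). Values well above 10 are often taken as a warning sign. The simulated predictors below, one of which is constructed to be nearly collinear with another, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)
predictors = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF for column j: regress it on the remaining predictors."""
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(X)), others])
    gamma, *_ = np.linalg.lstsq(Z, X[:, j], rcond=None)
    resid = X[:, j] - Z @ gamma
    r2 = 1 - np.sum(resid ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
    return 1 / (1 - r2)

for j in range(predictors.shape[1]):
    print(f"VIF x{j + 1}: {vif(predictors, j):.1f}")
```

Here x1 and x2 receive large VIFs because each is nearly a linear function of the other, while the independent predictor x3 has a VIF close to 1.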
6. No influential outliers: The sixth assumption is that there should be no influential outliers in the data. This means that there should not be any observations that have a disproportionate influence on the regression line. If there are influential outliers, then the regression coefficients may be biased, leading to incorrect inferences.
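Influential observations are commonly flagged with Cook's distance, which combines the size of a point's residual with its leverage (its diagonal entry in the hat matrix). The simulated data with one deliberately injected outlier is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
x[0], y[0] = 5.0, -10.0   # inject one high-leverage outlier

X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T       # hat (projection) matrix
h = np.diag(H)                             # leverage of each observation
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
p = X.shape[1]
s2 = np.sum(resid ** 2) / (n - p)

# Cook's distance: D_i = e_i^2 / (p * s^2) * h_ii / (1 - h_ii)^2.
# Points with D_i well above 4/n (a common rule of thumb) deserve
# closer inspection.
cooks_d = resid ** 2 / (p * s2) * h / (1 - h) ** 2
worst = int(np.argmax(cooks_d))
print(f"most influential point: index {worst}, D = {cooks_d[worst]:.3f}")
```

The injected point dominates the Cook's distances because it is both far from the fitted line and far from the mean of x, illustrating how influence requires leverage as well as a large residual.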
It is important to check these assumptions before performing multiple linear regression analysis. Violation of any of these assumptions can lead to biased results and incorrect inferences. There are several methods to check these assumptions, including residual plots, normal probability plots, and tests for multicollinearity. If any of these assumptions are violated, then appropriate remedial measures should be taken, such as transforming the variables, removing outliers, or selecting a different model.