Multicollinearity in Regression

Devaraj Essampally
Sep 28, 2020


In regression, “multicollinearity” refers to independent variables that are correlated with each other. Multicollinearity occurs when your model includes multiple predictors that are correlated not just with the response variable, but also with one another.

For example, if the Pearson correlation coefficient between two independent variables is very high (say, above 0.9), including both of them in the model can cause a multicollinearity problem.
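A quick sketch of this check on a small made-up dataset (column names here are chosen only for illustration):

import numpy as np
import pandas as pd

# Toy data: x2 is almost a copy of x1, so the two are highly correlated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
df = pd.DataFrame({
    "x1": x1,
    "x2": x1 + rng.normal(scale=0.05, size=100),  # nearly collinear with x1
    "x3": rng.normal(size=100),                   # independent feature
})
df["target"] = 2 * df["x1"] + df["x3"] + rng.normal(size=100)

X = df.drop(columns=["target"])
corr = X.corr()  # pairwise Pearson correlations between the features

# Keep the upper triangle only (each pair once), then report pairs above 0.9.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
high_pairs = corr.where(mask).stack()
print(high_pairs[high_pairs.abs() > 0.9])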

How to check for multicollinearity?

When a multiple regression model is fitted with the Ordinary Least Squares (OLS) technique to find the best-fit line, we can inspect the fitted model's statistics to detect multicollinearity.

The model.summary() function produces a summary table with statistics such as the coefficients, standard errors, p-values, R², and adjusted R².
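A minimal sketch with statsmodels, reusing the toy df and X from the snippet above (variable names are illustrative):

import statsmodels.api as sm

y = df["target"]
X_const = sm.add_constant(X)       # add the intercept term
model = sm.OLS(y, X_const).fit()   # Ordinary Least Squares fit

# The summary reports coefficients, standard errors, p-values,
# R-squared and adjusted R-squared for the fitted model.
print(model.summary())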

Coefficient: the estimated weight of each independent variable in the model.

R² and adjusted R²: both of these values range between 0 and 1.

Standard error: if the model is free of multicollinearity, these values are small.

If the model has multicollinearity (that is, independent variables that are very highly correlated, say above 0.9), the standard errors become very large.

p-value: the p-values should generally be below 0.05; if a variable's p-value is greater than 0.05, that variable may be suffering from a multicollinearity problem.

How to solve it?

Simply remove the highly correlated independent variables.
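Continuing the sketch above, dropping the nearly collinear column and refitting might look like this:

# x2 was flagged as nearly collinear with x1, so drop it and refit.
X_reduced = X.drop(columns=["x2"])
model_reduced = sm.OLS(y, sm.add_constant(X_reduced)).fit()
print(model_reduced.summary())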

Perform an analysis designed for highly correlated variables, such as principal component analysis (PCA) or partial least squares regression.
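A rough sketch of principal component regression with scikit-learn, again on the same toy data (the number of components here is an arbitrary choice):

from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Project the correlated features onto uncorrelated principal components,
# then regress the response on those components.
pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression())
pcr.fit(X, y)
print(pcr.score(X, y))  # R-squared on the training data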
