The correlation between %diabetes, %inactivity, and %obesity
For my project, the multiple regression equation can be written as
Y = β0 + β1X1 + β2X2 + …
Here Y represents the percentage of individuals with diabetes, X1 denotes the percentage of people who are physically inactive, and X2 represents the percentage of individuals who are obese.
When we regress the percentage of individuals with diabetes (%diabetes) on a single variable, the percentage of inactivity (%inactivity), the R-squared is approximately 0.1952. In other words, %inactivity alone explains roughly 20% of the variance in %diabetes. This corresponds to a Pearson correlation of magnitude about 0.44 (the square root of 0.1952), not a "20% correlation".
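To make the distinction between R-squared and the correlation coefficient concrete, here is a minimal Python sketch. The numbers are made up for illustration and are not the project's data, and the helper names pearson_r and simple_r_squared are hypothetical. It shows that for a one-predictor linear model, R-squared equals the square of Pearson's r.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_r_squared(x, y):
    """R-squared of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # intercept
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Made-up illustrative values, NOT the project's data.
inactivity = [14.0, 16.5, 17.2, 19.8, 21.1, 23.4]
diabetes = [7.1, 6.8, 8.4, 7.9, 9.6, 8.8]

r = pearson_r(inactivity, diabetes)
r2 = simple_r_squared(inactivity, diabetes)
print(f"r = {r:.4f}, r squared = {r * r:.4f}, model R-squared = {r2:.4f}")

# Magnitude of r implied by the R-squared reported in the text (about 0.44).
implied_r = math.sqrt(0.1952)
print(f"|r| implied by R-squared 0.1952: {implied_r:.4f}")
```

The sign of r is not recoverable from R-squared alone; it has to be read off the fitted slope or computed directly from the data.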
Next, when we fit a linear model with two predictors, X1 (inactivity) and X2 (obesity), the R-squared of the model is approximately 0.34 (34%). However, the situation takes an interesting turn from here.
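The two-predictor fit can be sketched in plain Python by solving the normal equations. This is only an illustration under stated assumptions: the data values are made up (not the project's data), and solve_linear_system and fit_ols are hypothetical helper names. In practice a statistics library would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Made-up illustrative values, NOT the project's data.
inactivity = [14.0, 16.5, 17.2, 19.8, 21.1, 23.4, 24.0, 26.3]
obesity = [18.2, 19.0, 18.7, 20.5, 19.9, 21.8, 20.4, 22.1]
diabetes = [7.1, 6.8, 8.4, 7.9, 9.6, 8.8, 9.9, 10.3]

_, r2_one = fit_ols([inactivity], diabetes)
beta_two, r2_two = fit_ols([inactivity, obesity], diabetes)
print(f"R-squared with inactivity only: {r2_one:.4f}")
print(f"R-squared with inactivity and obesity: {r2_two:.4f}")
```

Because the one-predictor model is nested inside the two-predictor model, adding obesity can never lower R-squared; how much it raises it depends on the data.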
If we repeat the same procedure but center the variables before fitting the linear model, the resulting R-squared is approximately 0.36 (36%), an increase of about 2 percentage points over the uncentered model.
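One point worth checking when comparing the two fits: in a model with an intercept and only additive terms, centering the predictors shifts the intercept but leaves the slopes and R-squared unchanged, so a difference between the centered and uncentered fits would normally come from some other difference in how the two models were specified. The sketch below illustrates this on made-up numbers (not the project's data); the helper names det3, fit_two_predictors, and center are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_two_predictors(x1, x2, y):
    """OLS fit of y = b0 + b1*x1 + b2*x2 via the normal equations.

    The 3x3 system is solved with Cramer's rule. Returns ([b0, b1, b2], r_squared).
    """
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    d = det3(xtx)
    beta = []
    for k in range(3):
        mk = [row[:] for row in xtx]
        for i in range(3):
            mk[i][k] = xty[i]  # replace column k with the right-hand side
        beta.append(det3(mk) / d)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (beta[0] + beta[1] * a + beta[2] * b)) ** 2
                 for a, b, yi in zip(x1, x2, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Made-up illustrative values, NOT the project's data.
inactivity = [14.0, 16.5, 17.2, 19.8, 21.1, 23.4, 24.0, 26.3]
obesity = [18.2, 19.0, 18.7, 20.5, 19.9, 21.8, 20.4, 22.1]
diabetes = [7.1, 6.8, 8.4, 7.9, 9.6, 8.8, 9.9, 10.3]

beta_raw, r2_raw = fit_two_predictors(inactivity, obesity, diabetes)
beta_cen, r2_cen = fit_two_predictors(center(inactivity), center(obesity), diabetes)

print(f"raw:      slopes = {beta_raw[1]:.4f}, {beta_raw[2]:.4f}, R-squared = {r2_raw:.4f}")
print(f"centered: slopes = {beta_cen[1]:.4f}, {beta_cen[2]:.4f}, R-squared = {r2_cen:.4f}")
```

With centered predictors the intercept becomes the mean of the response, which is often the main practical reason to center in an additive model.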