What does it mean when I add a new variable to my linear model and the R^2 stays the same?
I'm inclined to think that the new variable is not correlated with the response. But could the new variable be correlated with another variable in the model?

Tags: linear-model, r-squared
– OliverFishCode (Mar 7 at 19:54): It depends; could you provide some reduced data lines or output from your linear models? Without more information it's hard to assist you.

– gung♦ (Mar 7 at 20:02): It shouldn't stay exactly the same unless the new variable is perfectly orthogonal to your response, or is a linear combination of the variables already included. It may be that the change is smaller than the number of decimal places displayed.

– whuber♦ (Mar 7 at 20:12): @gung What you can infer is that the new variable is orthogonal to the response modulo the subspace generated by the other variables. That's more general than the two options you mention.

– gung♦ (Mar 7 at 20:17): @whuber, yes, I suppose so.

– Tom Zinger (Mar 7 at 22:33): Test your variables for multicollinearity (en.wikipedia.org/wiki/Multicollinearity); probably some features are linearly related. Use the caret package and vif() in R (sthda.com/english/articles/39-regression-model-diagnostics/…).
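Tom Zinger's suggestion can be sketched without any packages: the variance inflation factor for predictor $j$ is $1/(1-R_j^2)$, where $R_j^2$ comes from regressing $x_j$ on the other predictors. Below is a minimal illustration in Python/NumPy rather than R (the computation is language-agnostic); the data are made up, with `x3` constructed to be nearly collinear with `x1`:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.1 * rng.normal(size=n)  # hypothetical: nearly collinear with x1

def vif(X, j):
    """VIF for column j: regress x_j on the other columns plus an intercept,
    then return 1 / (1 - R^2_j)."""
    mask = [k for k in range(X.shape[1]) if k != j]
    others = np.column_stack([np.ones(X.shape[0]), X[:, mask]])
    beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    resid = X[:, j] - others @ beta
    centered = X[:, j] - X[:, j].mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    return 1.0 / (1.0 - r2)

X = np.column_stack([x1, x2, x3])
for j in range(3):
    print(f"VIF(x{j + 1}) = {vif(X, j):.1f}")  # x1 and x3 inflated; x2 near 1
```

A common rule of thumb flags VIF values above 5 or 10 as signs of problematic collinearity.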
asked Mar 7 at 19:46 by Chance113
2 Answers
Seeing little to no change in $R^2$ when you add a variable to a linear model means that the variable has little to no explanatory power for the response beyond what is already in your model. As you note, this can happen either because the variable tells you almost nothing about the response or because it explains the same variation in the response as the variables already in the model.
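The second case above (the new variable explains the same variation as the existing ones) can be checked numerically: appending a column that is an exact linear combination of existing predictors leaves the fitted values, and hence $R^2$, unchanged. A sketch in Python/NumPy with made-up data; `lstsq` uses the pseudoinverse, so the rank-deficient design is handled gracefully:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - x2 + rng.normal(size=n)  # hypothetical data

def r_squared(X, y):
    # Least-squares fit; a rank-deficient X is handled via the pseudoinverse,
    # so the fitted values are still the projection of y onto col(X).
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

X_base = np.column_stack([np.ones(n), x1, x2])
# New "variable" that is an exact linear combination of columns already present:
X_extra = np.column_stack([X_base, 3.0 * x1 - 5.0 * x2])

r2_base = r_squared(X_base, y)
r2_extra = r_squared(X_extra, y)
print(r2_base, r2_extra)  # identical up to floating-point noise
```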
As others have alluded, seeing no change at all in $R^2$ when you add a variable to your regression is unusual. In finite samples, this should only happen when your new variable is a linear combination of variables already present. In that case, most standard regression routines simply exclude the variable from the regression, and your $R^2$ remains unchanged because the model is effectively unchanged.

As you notice, this does not mean the variable is unimportant, only that you cannot distinguish its effect from that of the other variables in your model.

More broadly, however, I (and many here at Cross Validated) would caution against using $R^2$ for model selection and interpretation. What I've discussed above is how the $R^2$ can fail to change while the variable is still important. Worse yet, the $R^2$ could change somewhat (or even dramatically) when you include an irrelevant variable. Using $R^2$ for model selection fell out of favor in the 1970s, when it was dropped in favor of AIC (and its contemporaries). Today, a typical statistician would recommend cross-validation (see the site name) for model selection.

In general, adding a variable never decreases $R^2$, so using $R^2$ to gauge a variable's importance is a bit of a wild goose chase. Even when trying to understand simple situations, you can end up with a completely absurd collection of variables.
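The closing claim that adding a variable never decreases $R^2$ holds because enlarging the column space of the design matrix can only shrink the residual sum of squares. A quick numerical check in Python/NumPy, repeatedly appending pure-noise predictors to a made-up dataset:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)  # hypothetical data: one real predictor

def r_squared(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

X = np.column_stack([np.ones(n), x])
r2_path = []
for _ in range(10):
    r2_path.append(r_squared(X, y))
    X = np.column_stack([X, rng.normal(size=n)])  # append a pure-noise column

# R^2 never decreases as junk columns are added
print([round(r, 4) for r in r2_path])
```

Every noise column nudges $R^2$ up by chance alone, which is exactly why raw $R^2$ rewards absurd variable collections; adjusted $R^2$, AIC, or cross-validation penalize this.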
– Richard Hardy (Mar 7 at 21:38): Could you elaborate on "the $R^2$ could change somewhat (or even dramatically) when you include an irrelevant variable", specifically the case of a dramatic change? In what sense would the variable then be irrelevant?
First answer: posted Mar 7 at 19:55 by TrynnaDoStat.
Second answer: posted Mar 7 at 20:13 by user5957401.