QQ Plot and Shapiro-Wilk Test Disagree

My QQ plot suggests that the data are not normally distributed:

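from statsmodels.graphics.gofplots import qqplot
import pylab
qqplot(residual_values, fit=True, line='45')  # standardize the data, then draw the 45-degree reference line
pylab.show()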

[Figure: normal QQ plot of residual_values]

The residuals have a skewness of about 0.55:

residual_values.skew() # 0.5469389365591185

But the p-value of the Shapiro-Wilk test is greater than 0.05, which seems to tell me that the data are normally distributed:

shapiro(residual_values) # (0.9569438099861145, 0.2261517345905304)

What is the correct inference from this? Is it normally distributed or not?

Tags: regression, machine-learning

edited Mar 11 at 8:58 by Nick Cox

asked Mar 10 at 18:27 by Shinigami

The QQ plot looks consistent with being normally distributed. Did you expect every point to fall exactly on the line?
– The Laconic, Mar 10 at 20:24

It is approximately normally distributed if you are prepared to discount slight skewness. No procedure ever indicates more.
– Nick Cox, Mar 10 at 21:07

It's approximately normal; the skewness in the sample is quite mild. This doesn't automatically mean the population is also skewed (though I expect it is). A high p-value on a test of normality doesn't mean that it is normal, only that you couldn't detect whatever population non-normality there was. (The answer to "is it normally distributed" is "no" - unless you generated it to be normal it won't actually be normal -- but why would it have to be?)
– Glen_b♦, Mar 10 at 23:09

4 Answers

The QQ plot is an informal test of normality that can give you some insight into the nature of deviations from normality; for example, whether the distribution has some skew, or fat tails, or there are specific observations that deviate from what you would expect from a normal distribution (outliers). The QQ plot can often convince you that the distribution is definitely not normal, but this isn't such a case. Here, the points fall more or less along the line, which is broadly consistent with normality--intuitively, the sort of variation you would expect to see in a small sample.
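To get a feel for how much variation a small sample produces, one can simulate truly normal samples and count how often their sample skewness is at least as large in magnitude as the 0.547 reported in the question. This is only a sketch: the sample size `n = 30` is an assumption (the question does not state it), and `bias=False` is used so that `scipy.stats.skew` applies the same bias-corrected estimator that pandas' `.skew()` is understood to use.

```python
import numpy as np
from scipy.stats import skew

n = 30                            # ASSUMED sample size; not given in the question
reps = 5000                       # number of simulated normal samples
observed = 0.5469389365591185     # skewness reported in the question

rng = np.random.default_rng(0)
samples = rng.normal(size=(reps, n))              # every row is genuinely normal
sample_skews = skew(samples, axis=1, bias=False)  # bias-corrected sample skewness per row

# Share of normal samples whose skewness is at least as extreme as the observed one
frac = np.mean(np.abs(sample_skews) >= observed)
print(f"share of normal samples with |skew| >= {observed:.3f}: {frac:.3f}")
```

If the share is not small, a skewness of this size is unremarkable for normal data at that sample size.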

The Shapiro-Wilk test is a formal test of normality. I'm not familiar with the shapiro function's output, so I'm not sure which number, if either, is supposed to be the p-value, but if you say it's largish, then we are led to accept the null hypothesis of normality. And this is consistent with what we see qualitatively in the QQ plot.
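On the question of which number is which: assuming the `shapiro` in the question is `scipy.stats.shapiro` (the question does not say, but the tuple output matches), the first value returned is the W test statistic and the second is the p-value. A minimal sketch, using synthetic data as a stand-in for `residual_values`:

```python
import numpy as np
from scipy.stats import shapiro

# Synthetic stand-in for the asker's residual_values (not the real data)
rng = np.random.default_rng(0)
sample = rng.normal(size=30)

# shapiro returns a pair: (W statistic, p-value)
w_stat, p_value = shapiro(sample)
print(f"W = {w_stat:.4f}, p = {p_value:.4f}")

# Reading the tuple reported in the question the same way
w_reported, p_reported = (0.9569438099861145, 0.2261517345905304)
print(f"reported W = {w_reported:.4f}, reported p = {p_reported:.4f}")
print("reject normality at alpha = 0.05?", p_reported < 0.05)
```

Under that reading, the question's p-value is about 0.226, so the test does not reject normality at the 0.05 level.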

edited Mar 11 at 8:56 by Nick Cox

answered Mar 10 at 22:02 by The Laconic

The Q-Q plot is consistent with (but does not "prove") approximate normality, more or less.

The Shapiro-Wilk is a formal test of normality and as such, it cannot confirm the null hypothesis of normality. The data may be reasonably consistent with normality yet still be from a different nonnormal underlying distribution. Frequentist hypothesis tests, as a general rule, cannot prove a hypothesis, and failure to reject (p>alpha) does not support the null hypothesis.

@The Laconic gave some decent advice to interpret the q-q plot. However, large p-values do not lead you to accept the null hypothesis (therefore, you don't conclude normality based on this test; the best you can do is say insufficient evidence of nonnormality at the a priori chosen alpha level).
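The point that a large p-value is not evidence for normality can be illustrated with a small simulation. The sketch below draws samples from a population that is known to be non-normal, a gamma distribution whose shape is chosen so that its skewness is roughly 0.55, and counts how often Shapiro-Wilk rejects at alpha = 0.05. The sample size of 30, the gamma population, and the helper name `rejection_rate` are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import shapiro

def rejection_rate(draw, n, reps=500, alpha=0.05, seed=0):
    """Fraction of simulated samples for which Shapiro-Wilk gives p < alpha."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        p_value = shapiro(draw(rng, n))[1]  # second element is the p-value
        rejections += p_value < alpha
    return rejections / reps

# Gamma skewness is 2 / sqrt(shape), so shape = 13.2 gives skewness of about 0.55
def draw_skewed(rng, n):
    return rng.gamma(shape=13.2, size=n)

rate = rejection_rate(draw_skewed, n=30)  # n = 30 is an ASSUMED sample size
print(f"rejection rate for a skewed population at n = 30: {rate:.3f}")
```

Every sample here comes from a non-normal population, so each non-rejection is a case where p > alpha coexists with non-normality.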

answered Mar 10 at 22:18 by LSC

My understanding is that, given power issues with normality tests, they are not highly recommended. As a result I don't use them any more, preferring QQ plots (which are recommended in the literature I have seen).
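The power concern can be made concrete by checking how the rejection rate of Shapiro-Wilk changes with sample size for a fixed, modest departure from normality. In this sketch the population is a Student t distribution with 10 degrees of freedom (symmetric, slightly heavy-tailed); that choice, the sample sizes, and the number of replications are all assumptions for illustration.

```python
import numpy as np
from scipy.stats import shapiro

alpha = 0.05
reps = 200                   # replications per sample size
sizes = (20, 200, 2000)      # small, medium, and large samples
rng = np.random.default_rng(0)

rates = {}
for n in sizes:
    rejections = 0
    for _ in range(reps):
        sample = rng.standard_t(df=10, size=n)  # mildly heavy-tailed, not normal
        rejections += shapiro(sample)[1] < alpha
    rates[n] = rejections / reps

for n in sizes:
    print(f"n = {n:5d}: rejection rate = {rates[n]:.3f}")
```

The same departure that is rarely detected in a small sample tends to be rejected routinely in a large one, which is why the test's verdict depends so heavily on n.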

answered Mar 10 at 22:55 by user54285

I was under the impression formal tests of normality are usually too powerful and too frequently detect immaterial departures from normality. Visualization is generally preferred, as you said (and theoretical knowledge when available).
– LSC, Mar 11 at 0:57

The Shapiro-Wilk p-value being >0.05 indicates lack of evidence against normality. That is consistent with the QQ plot you showed, which is not too far off the line. I don't see what the inconsistency is here. Also, you should give a CI for the skewness coefficient.
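One way to attach a confidence interval to the skewness coefficient is a percentile bootstrap. The sketch below is one possible approach, not necessarily the one the answerer had in mind; the helper name `bootstrap_skew_ci` and the synthetic `residual_values` are assumptions standing in for the asker's data.

```python
import numpy as np
from scipy.stats import skew

def bootstrap_skew_ci(values, n_boot=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the sample skewness."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample row indices with replacement: one bootstrap sample per row
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_skews = skew(values[idx], axis=1, bias=False)
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(boot_skews, [tail, 1.0 - tail])
    return skew(values, bias=False), lower, upper

# Synthetic stand-in for the asker's residuals (NOT the real data)
rng = np.random.default_rng(1)
residual_values = rng.gamma(shape=13.0, size=30)

estimate, lower, upper = bootstrap_skew_ci(residual_values)
print(f"skewness = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With a small sample the interval is typically wide, and if it comfortably includes zero, the observed skewness is weak evidence of a skewed population.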

edited Mar 11 at 8:57 by Nick Cox

answered Mar 10 at 22:22 by beta1_equals_beta2
