QQ Plot and Shapiro-Wilk Test Disagree

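My QQ plot shows that the data is not normally distributed:

from statsmodels.graphics.gofplots import qqplot  # assuming statsmodels' qqplot
import pylab
qqplot(residual_values, fit=True, line='45')
pylab.show()

[Figure: normal QQ plot of residual_values]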



It has a skewness of about 0.55:



residual_values.skew() # 0.5469389365591185


But the p-value of the Shapiro-Wilk test is greater than 0.05, telling me that it is normally distributed:

shapiro(residual_values) # (0.9569438099861145, 0.2261517345905304), i.e. (W statistic, p-value)


What is the correct inference from this? Is it normally distributed or not?







Tags: regression, machine-learning






asked Mar 10 at 18:27 by Shinigami; edited Mar 11 at 8:58 by Nick Cox







  • The QQ plot looks consistent with being normally distributed. Did you expect every point to fall exactly on the line?
    – The Laconic, Mar 10 at 20:24

  • It is approximately normally distributed if you are prepared to discount slight skewness. No procedure ever indicates more.
    – Nick Cox, Mar 10 at 21:07

  • It's approximately normal; the skewness in the sample is quite mild, and this doesn't automatically mean the population is also skewed (though I expect it is). A high p-value on a test of normality doesn't mean that it is normal, only that you couldn't detect whatever population non-normality there was. (The answer to "is it normally distributed" is "no": unless you generated it to be normal, it won't actually be normal. But why would it have to be?)
    – Glen_b, Mar 10 at 23:09













4 Answers
The QQ plot is an informal test of normality that can give you some insight into the nature of deviations from normality; for example, whether the distribution has some skew, or fat tails, or there are specific observations that deviate from what you would expect from a normal distribution (outliers). The QQ plot can often convince you that the distribution is definitely not normal, but this isn't such a case. Here, the points fall more or less along the line, which is broadly consistent with normality: intuitively, this is the sort of variation you would expect to see in a small sample.
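To get a feel for "the sort of variation you would expect to see in a small sample", here is a rough simulation sketch. It draws many samples that really are normal and checks how often their sample skewness is at least as large as the 0.55 reported in the question. The sample size `n = 30` is purely an assumption (the question does not state it), as are the seed and the number of simulations.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
n = 30          # hypothetical sample size; not given in the question
n_sims = 5000   # number of simulated normal samples

# Each row is one sample drawn from a true normal distribution.
samples = rng.standard_normal((n_sims, n))

# bias=False gives the adjusted estimator, the same one pandas' .skew() reports.
skews = skew(samples, axis=1, bias=False)

# Share of genuinely normal samples whose skewness is at least as extreme as 0.55.
frac_as_extreme = np.mean(np.abs(skews) >= 0.55)
print(f"share of normal samples with |skewness| >= 0.55: {frac_as_extreme:.2f}")
```

Under these assumptions a noticeable share of perfectly normal samples shows skewness of that size, so a value around 0.55 in a small sample is not surprising by itself.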



The Shapiro-Wilk test is a formal test of normality. I'm not familiar with the shapiro function's output, so I'm not sure which number, if either, is supposed to be the p-value, but if you say it's largish, then we are led to accept the null hypothesis of normality. And this is consistent with what we see qualitatively in the QQ plot.
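On the question of which number is which: assuming the `shapiro` in the question is `scipy.stats.shapiro`, the first returned value is the W statistic and the second is the p-value, so in the question W is about 0.957 and p is about 0.226. A small sketch on stand-in data (not the asker's residuals):

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(1)
sample = rng.standard_normal(30)   # stand-in data, not the residuals from the question

# The result unpacks as (W statistic, p-value), in that order.
w_stat, p_value = shapiro(sample)
print(f"W = {w_stat:.4f}, p-value = {p_value:.4f}")
```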







answered Mar 10 at 22:02 by The Laconic; edited Mar 11 at 8:56 by Nick Cox























The Q-Q plot is consistent with (not "proving") approximate normality, more or less.

The Shapiro-Wilk is a formal test of normality and, as such, it cannot confirm the null hypothesis of normality. The data may be reasonably consistent with normality yet still come from a different, non-normal underlying distribution. Frequentist hypothesis tests, as a general rule, cannot prove a hypothesis, and failure to reject (p > alpha) does not support the null hypothesis.

@The Laconic gave some decent advice on interpreting the Q-Q plot. However, large p-values do not lead you to accept the null hypothesis; you therefore don't conclude normality based on this test. The best you can do is say that there is insufficient evidence of non-normality at the a priori chosen alpha level.
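A quick way to see why failing to reject is not evidence for normality is to test data that are known not to be normal. The sketch below is only an illustration under assumed settings: it draws small samples from a gamma distribution with shape 13 (population skewness 2/sqrt(13), about 0.55, chosen to resemble the question) and counts how often Shapiro-Wilk rejects at alpha = 0.05. The sample size `n = 30`, the gamma choice, and the helper name `rejection_rate` are all hypothetical.

```python
import numpy as np
from scipy.stats import shapiro

def rejection_rate(n, shape=13.0, alpha=0.05, n_sims=2000, seed=0):
    """Fraction of gamma(shape) samples of size n that Shapiro-Wilk rejects at alpha."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        x = rng.gamma(shape, size=n)   # skewed by construction, so never truly normal
        _, p = shapiro(x)
        if p < alpha:
            rejections += 1
    return rejections / n_sims

small_n_rate = rejection_rate(30)      # n = 30 is an assumed sample size
print(f"rejection rate at n = 30: {small_n_rate:.2f}")
```

Every sample here comes from a skewed population, yet under these settings most of them pass the test. A p-value above 0.05 therefore says little about whether the population is normal.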







answered Mar 10 at 22:18 by LSC





















My understanding is that, given power issues with normality tests, they are not highly recommended. As a result I don't use them any more, preferring QQ plots (which are recommended in the literature I have seen).







answered Mar 10 at 22:55 by user54285







  • I was under the impression formal tests of normality are usually too powerful and too frequently detect immaterial departures from normality. Visualization is generally preferred, as you said (and theoretical knowledge when available).
    – LSC, Mar 11 at 0:57













The Shapiro-Wilk p-value being > 0.05 indicates lack of evidence against normality. That is consistent with the QQ plot you showed, which is not too far off the line. I don't see what the inconsistency is here. Also, you should give a confidence interval (CI) for the skewness coefficient.
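One possible way to get such an interval is a percentile bootstrap. This is a minimal sketch, not the only or necessarily the best method: the helper name `bootstrap_skew_ci`, the 95% level, the number of resamples, and the stand-in data are all assumptions. With the real data you would pass `residual_values` instead of `demo`.

```python
import numpy as np
from scipy.stats import skew

def bootstrap_skew_ci(x, level=0.95, n_boot=5000, seed=0):
    """Percentile bootstrap confidence interval for the sample skewness."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Resample indices with replacement: one row per bootstrap replicate.
    idx = rng.integers(0, x.size, size=(n_boot, x.size))
    # bias=False matches the adjusted estimator that pandas' .skew() uses.
    boot = skew(x[idx], axis=1, bias=False)
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(boot, [tail, 1.0 - tail])
    return float(lo), float(hi)

rng = np.random.default_rng(42)
demo = rng.gamma(13.0, size=30)   # stand-in data, not the asker's residuals
lo, hi = bootstrap_skew_ci(demo)
print(f"95% bootstrap CI for skewness: ({lo:.2f}, {hi:.2f})")
```

In a small sample the interval is typically wide, which is another way of seeing that a skewness near 0.55 is weak evidence either way.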







answered Mar 10 at 22:22 by beta1_equals_beta2; edited Mar 11 at 8:57 by Nick Cox


























