Simple linear regression: If Y and X are both normal, what's the exact null distribution of the parameters?



























Suppose $Y \sim N(a,b)$, $X \sim N(c,d)$, and $Y$ is independent of $X$. After sampling 25 observations from both $Y$ and $X$, I run the following regression model: $Y = \beta_0 + \beta_1 X + \epsilon$. I wish to test the hypothesis $H_0: \beta_0 = 0$ against the alternative $H_1: \beta_0 \neq 0$.



My question is: since the distributions of $Y$ and $X$ are known, is there an exact 'null distribution' for the parameter $\beta_0$? If so, what is the distribution? By null distribution, I mean the sampling distribution of $\beta_0$ under the null hypothesis.



If anyone knows the answer assuming the true correlation coefficient between $Y$ and $X$ is 0.1, rather than assuming independence, that would be a big help also. This is all for a simulation study I'm working on.

Tags: regression, inference

asked Dec 23 '18 at 4:56 by Anna Efron; edited Dec 23 '18 at 9:16 by Silverfish
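A minimal Python sketch of such a simulation (the means of 0, variances of 1, and the two correlation values are illustrative assumptions, not values fixed by the question): draw many samples of size 25, refit the regression each time, and record the intercept estimate.

```python
# Minimal sketch: empirical sampling distribution of the OLS intercept
# with n = 25, for independent X and Y and for correlation 0.1.
# Means of 0 and variances of 1 are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 25, 10_000

def intercept_draw(rho):
    """Draw one jointly normal sample (X, Y) with correlation rho and
    return the OLS intercept from regressing Y on X."""
    cov = [[1.0, rho], [rho, 1.0]]
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    slope, intercept = np.polyfit(xy[:, 0], xy[:, 1], deg=1)
    return intercept

for rho in (0.0, 0.1):
    b0 = np.array([intercept_draw(rho) for _ in range(reps)])
    print(f"rho = {rho}: mean {b0.mean():.3f}, sd {b0.std():.3f}")
```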



















  • I wonder whether you mean the distribution of $\hat\beta_0$ rather than of $\beta_0$? You have specified that you are 100% sure that $\beta_0 = 0$, so that is the rather degenerate distribution that it has! But it sounds to me that you might be rather more interested in the distribution of $\hat\beta_0$, which is the estimate of $\beta_0$ that you would make from your random sample, and since different random samples will produce slightly different estimates, your estimator has a non-degenerate probability distribution.
    – Silverfish, Dec 23 '18 at 9:18











  • This question would be more interesting if you drop the independence assumption on $X$ and $Y$ and add an assumption of joint normality.
    – kjetil b halvorsen, Dec 23 '18 at 9:59











  • Yes, I meant that if I were to test $\beta_0 = 0$ (for a simulation exercise I'm working on... I know the true value is $c$), I would have to generate the sampling distribution of $\hat\beta_0$ under the null that $\beta_0 = 0$. I know that asymptotically this distribution is normal. But since $X$ and $Y$ are both normal and $n$ is relatively small, am I able to use the t-distribution (for example) to form an 'exact' null distribution of $\hat\beta_0$, rather than using the asymptotic approximation? The true value of the parameter is 0 (obviously), but this is not what I'm after!
    – Anna Efron, Dec 24 '18 at 6:16
2 Answers
Since you have specified that $X$ and $Y$ are independent, the conditional mean of $Y$ given $X$ is:

$$\mathbb{E}(Y|X) = \mathbb{E}(Y) = c,$$

which implies that:

$$\beta_0 = c \qquad \beta_1 = 0 \qquad \varepsilon \sim \text{N}(0, d).$$

In this case there is nothing to test: your regression parameters are fully determined by the distributional assumptions you made at the start of the question.

Remember that a regression model is designed to describe the conditional distribution of $Y$ given $X$. If you assume independence of these variables, that pre-empts the entire modelling exercise.
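As a quick numerical illustration of this point (a sketch with illustrative choices $\text{E}(Y) = 2$, unit variances, and $n = 25$, none of which come from the question): when $X$ and $Y$ are independent, the fitted intercept centres on $\text{E}(Y)$ and the fitted slope on 0.

```python
# Sketch: with X and Y independent, the OLS intercept centres on E(Y)
# and the slope on 0. E(Y) = 2 and the unit variances are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 25, 5_000
fits = np.array([np.polyfit(rng.normal(0.0, 1.0, n),   # X ~ N(0, 1)
                            rng.normal(2.0, 1.0, n),   # Y ~ N(2, 1), independent of X
                            deg=1)
                 for _ in range(reps)])                 # each row: [slope, intercept]

print(fits[:, 1].mean())   # close to 2 = E(Y)
print(fits[:, 0].mean())   # close to 0
```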






– answered Dec 23 '18 at 7:05 by Ben




















  • Thank you. I meant that if I were to test $\beta_0 = 0$ (for a simulation exercise I'm working on... I know the true value is $c$) the usual way, I would have to generate the sampling distribution of $\hat\beta_0$ under the null that $\beta_0 = 0$. I know that asymptotically this sampling distribution is normal. But since $X$ and $Y$ are both normal and $n$ is quite small, am I able to use the t-distribution (for example) to form an 'exact' null distribution of $\hat\beta_0$, such that the coverage probability is exactly $(1-\alpha)$? And what if $\rho_{XY} = 0.1$ (say) instead of 0?
    – Anna Efron, Dec 24 '18 at 6:21







  • Once you remove the assumption that $X$ and $Y$ are independent, the regression model is your specification of their conditional relationship. Much of the information you have given in your comment unfortunately contradicts your original question. It is also unclear why you would test $H_0: \beta_0 = 0$ if you know from some other source (your simulation) that $\beta_0 = c$. I think at this point you will probably need to ask a new question where all this information is made clear.
    – Ben, Dec 24 '18 at 6:44
































In simple linear regression the estimate of $\beta_0$ is computed as:

$$\hat\beta_0 = \frac{1}{n} S_y - \frac{1}{n} S_x \, \frac{n S_{xy} - S_x S_y}{n S_{xx} - S_x S_x}$$

with $S_x = \sum x_i$, $S_y = \sum y_i$, $S_{xx} = \sum x_i x_i$, $S_{xy} = \sum x_i y_i$.

You could say it is a linear combination of the $y_i$:

$$\hat\beta_0 = \frac{1}{n} \sum c_i y_i$$

with

$$c_i = 1 - \frac{S_x \,(n x_i - S_x)}{n S_{xx} - S_x S_x}.$$

This does not seem to follow an easy distribution (or at least not a typical, well-known distribution): when both the $x_i$ and the $y_i$ are random you have

$$\hat\beta_0 \sim N(\mu, \sigma^2)$$

where $\mu$ and $\sigma$ are random variables themselves, depending on the distribution of $X$ as well. (If every $y_i$ has the identical distribution $N(a,b)$ then $\mu = a$, independent of the distribution of $X$.)

However, if you condition on the $x_i$ then $\hat\beta_0$ follows a regular normal distribution (note that the $y_i$ do not need to follow identical normal distributions).

In testing you typically do not know the variance of this normal distribution, so you estimate it from the residuals and use the t-distribution instead.
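A short sketch (with illustrative normal draws, not values from the question) checking the weight representation above: the intercept from a standard least-squares fit agrees with $\frac{1}{n}\sum c_i y_i$.

```python
# Sketch checking beta0_hat = (1/n) * sum(c_i * y_i) on one simulated
# sample; the normal parameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
n = 25
x = rng.normal(0.0, 1.0, n)
y = rng.normal(1.0, 1.0, n)

S_x, S_y = x.sum(), y.sum()
S_xx, S_xy = (x * x).sum(), (x * y).sum()

slope, intercept = np.polyfit(x, y, deg=1)               # direct OLS fit

c = 1.0 - S_x * (n * x - S_x) / (n * S_xx - S_x * S_x)   # weights c_i
print(intercept, (c * y).sum() / n)                      # the two agree
```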






– answered Dec 25 '18 at 13:08 by Martijn Weterings; edited Dec 25 '18 at 13:34





















