Two sample t-test to show equality of the two means

Given two (numeric) samples I would like to show that there is not a significant difference between the two means $\mu_1$ and $\mu_2$.

If my goal was to show a significant difference I would formulate the $t$-test as follows:

(1) $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$

I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show.

So then if my goal is to show that there is not a significant difference between both means, should I formulate the test like this?

(2) $H_0: \mu_1 \neq \mu_2$ vs $H_1: \mu_1 = \mu_2$

Or can I use the first test (1) and when I am not able to reject the null hypothesis say that there is not a significant difference?







hypothesis-testing t-test equivalence






edited Jan 10 at 21:00 by Alexis

asked Jan 9 at 13:06 by cmplx96







  • The alternative hypothesis indicates what an extreme result might look like. The problem with your (2) formulation is that this would be a difference in means close to $0$; so if you took a commonly used significance level of $5\%$ then the power of the test (its ability to reject the null hypothesis when it is false) would never be above $5\%$ no matter how large the sample size. This is not good.
    – Henry
    Jan 9 at 15:07

  • "I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show." Then you were taught to commit to confirmation bias as a mode of scientific inquiry.
    – Alexis
    Jan 10 at 19:09
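
Henry's point about formulation (2) can be checked numerically. The sketch below is an illustration and not part of the original comment. It assumes a large-sample z approximation in place of the exact $t$ distribution, and the function name `prob_conclude_equal` and the parameter `theta` (the true difference in means divided by its standard error) are invented for this example. A level-$\alpha$ test of $H_0: \mu_1 \neq \mu_2$ can only reject when the standardized difference is very close to zero, and the chance of that happening peaks at $\alpha$ when the means really are equal.

```python
from statistics import NormalDist

def prob_conclude_equal(theta, alpha=0.05):
    """Chance that formulation (2) rejects H0: mu_1 != mu_2.

    theta is the true difference in means divided by its standard error
    (a hypothetical parameter for this sketch). The test rejects when
    |Z| < c, with c chosen so the rejection rate is at most alpha for
    every theta != 0 (the worst case is theta -> 0).
    """
    z = NormalDist()
    c = z.inv_cdf(0.5 + alpha / 2)  # P(|Z| < c) = alpha when theta = 0
    return z.cdf(c - theta) - z.cdf(-c - theta)

# theta = 0 means the two means are truly equal.
for theta in (0.0, 0.5, 1.0, 2.0):
    print(f"theta = {theta:.1f}: P(conclude equal) = {prob_conclude_equal(theta):.4f}")
```

Under these assumptions the rejection probability equals $\alpha$ when the means are equal, whatever the sample size, and it only gets smaller as `theta` moves away from zero.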












2 Answers
You cannot use the first test in the way you describe, because failure to reject in the first test only says that you were unable to reject $H_0$, nothing more than that. It is like only being given the information that "the prosecutor was unable to provide the jury with enough evidence to secure a conviction" - that does not tell you that the suspect is innocent.



The second test is not usable in practice, because no matter how much data you have, you cannot exclude the possibility of very small differences.



What you can do is to look at
$$H_0: |\mu_1 - \mu_2| > \delta \text{ vs } H_1: |\mu_1 - \mu_2| \leq \delta,$$
i.e. try to reject the null hypothesis that the absolute size of the difference is greater than some difference $\delta>0$. $\delta$ would be chosen e.g. so that any difference smaller than that is for all (or your specific) practical purposes irrelevant.
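
One way to see how such a null hypothesis can be tested is sketched here. This is the usual two one-sided tests construction, written out for illustration and not part of the original answer; the labels $H_0^{-}$ and $H_0^{+}$ are notation introduced for this sketch. The null is the union of two one-sided hypotheses:

$$H_0: |\mu_1 - \mu_2| > \delta \iff H_0^{-}: \mu_1 - \mu_2 < -\delta \ \text{ or } \ H_0^{+}: \mu_1 - \mu_2 > \delta$$

Equivalence is concluded only if both $H_0^{-}$ and $H_0^{+}$ are rejected, each at level $\alpha$, which leaves $-\delta \leq \mu_1 - \mu_2 \leq \delta$, i.e. $H_1$. Because both one-sided tests have to reject, the combined procedure still has level $\alpha$ without a multiplicity correction.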







answered Jan 9 at 13:21 by Björn











  • Thanks! How would I then go about computing the test statistic? t = (x1 - x2 - delta) / sqrt(s1^2/n1 + s2^2/n2) ?
    – cmplx96
    Jan 9 at 13:39

  • Not quite, you look in both directions, i.e. do two one-sided tests (en.wikipedia.org/wiki/Equivalence_test). Formulae for these are given e.g. here ncss.wpengine.netdna-cdn.com/wp-content/themes/ncss/pdf/….
    – Björn
    Jan 9 at 16:27

  • Note that we have a considerably informative tag on two one-sided tests here. That's what the [tost] tag is for. :)
    – Alexis
    Jan 10 at 19:08
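
The two one-sided tests mentioned in the comments above can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulae from the linked document: it uses the unequal-variance (Welch) standard error that appears in cmplx96's comment, with the Welch-Satterthwaite approximation for the degrees of freedom. The helper names `t_sf` and `tost_welch` are invented for this example, the data are made up, and the $t$ tail probability is computed by simple numerical integration only to keep the sketch free of third-party dependencies; in practice a statistics library would normally supply it.

```python
import math
from statistics import mean, variance

def t_sf(x, df, steps=2000):
    """Upper-tail probability P(T > x) for Student's t with df degrees of
    freedom, by Simpson's rule on the density (steps must be even)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    upper = max(0.0, 0.5 - total * h / 3)  # 0.5 minus the area between 0 and |x|
    return upper if x >= 0 else 1.0 - upper

def tost_welch(x, y, delta, alpha=0.05):
    """Two one-sided Welch t-tests of H0: |mu_1 - mu_2| > delta."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    diff = mean(x) - mean(y)
    t_lower = (diff + delta) / se  # against H0: mu_1 - mu_2 <= -delta
    t_upper = (diff - delta) / se  # against H0: mu_1 - mu_2 >= +delta
    p_lower = t_sf(t_lower, df)        # small when diff is well above -delta
    p_upper = 1.0 - t_sf(t_upper, df)  # small when diff is well below +delta
    p_value = max(p_lower, p_upper)    # both one-sided tests must reject
    return diff, p_value, p_value < alpha

# Made-up measurements, for illustration only
x = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
y = [10.0, 10.1, 9.9, 10.2, 9.8, 10.1]
diff, p_value, equivalent = tost_welch(x, y, delta=0.5)
print(f"difference = {diff:.3f}, TOST p-value = {p_value:.4f}, equivalent: {equivalent}")
```

Note that the margin enters with a plus sign in one statistic and a minus sign in the other, which is why a single statistic with `- delta` is not enough. The choice of `delta` has to come from subject-matter considerations before looking at the data.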
















I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show.

That is not an accurate explanation of the null hypothesis. The null hypothesis is simply a hypothesis that consists of a specific distribution from which probabilities can be calculated. The reason we use $\mu_1=\mu_2$ as the null hypothesis has nothing to do with whether this is the "common" belief. It's used as the null hypothesis because if we hypothesize that the mean is a specific value, then given a particular set of data we can calculate the probability of seeing that data. We can't use $\mu_1 \neq \mu_2$ as our null hypothesis because there's no way to calculate p-values based simply on the hypothesis that the means aren't equal to a particular value. Consider the following problem:

The weights of apples have a standard deviation of 5 grams. The mean is not equal to 100. What is the probability of seeing an apple with a weight of 110 grams?

There's no way to answer that, because simply being told what the mean isn't is not enough to calculate probabilities.

Björn suggests testing the hypothesis that the difference in means is greater than some $\delta_0$. The way that would work is to take as the null hypothesis that the difference is equal to exactly $\delta_0$. Then once you have the data, you can calculate the p-value given that $\delta_0$. Call that $p_{\delta_0}$. If the difference in sample means is less than $\delta_0$, then the p-value would have been even smaller than $p_{\delta_0}$ if we had chosen $\delta$ to be larger than $\delta_0$. We reject the null if the p-value is less than $\alpha$, so if we're rejecting under that null, that means that $p_{\delta_0} < \alpha$. And since $p_\delta < p_{\delta_0}$ for any $\delta>\delta_0$, we can conclude that $p_\delta<\alpha$ for any $\delta>\delta_0$. Thus, we can not only reject this null of $\delta_0$, but we can reject any null with a larger $\delta$. It is only because of this ability to get an upper bound on p that we don't need a specific value for $\delta$. If we just take "$\delta$ is larger than zero" as our null hypothesis, without any lower bound for $\delta$, then there is no upper bound for p, and so we cannot conclude that it is lower than $\alpha$.
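
The monotonicity claim in the last paragraph above can be illustrated with a small sketch. It is an illustration only and covers just the upper side of the margin; it assumes a normal approximation to the test statistic, the function name `p_value_below` is invented for this example, and the summary statistics are made up.

```python
from statistics import NormalDist

def p_value_below(diff, se, delta):
    """One-sided p-value for H0: mu_1 - mu_2 = delta against
    H1: mu_1 - mu_2 < delta, using a normal approximation."""
    return NormalDist().cdf((diff - delta) / se)

observed_diff, se = 0.2, 0.5   # made-up summary statistics
deltas = [1.0, 1.5, 2.0, 3.0]
p_values = [p_value_below(observed_diff, se, d) for d in deltas]
for d, p in zip(deltas, p_values):
    print(f"delta = {d}: p = {p:.6f}")
```

For a fixed observed difference the p-value shrinks as the hypothesized `delta` grows, so the p-value at the boundary value is the largest one over all larger margins. The same argument applies at $-\delta_0$ for the lower side.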








  • You seem quite confused about how two one-sided tests for equivalence work: "How that would work is to take the null hypothesis as that the difference is equal to exactly $\delta_0$." is not remotely close to these procedures.
    – Alexis
    Jan 10 at 19:12

  • @Alexis That is the rigorous mathematical theoretical foundation of the process. Certainly, there are people doing statistics in the field that are not engaging in full rigor.
    – Acccumulation
    Jan 10 at 19:15















-1












$begingroup$


I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show.




That is not accurate explanation of the null hypothesis. The null hypothesis is simply a hypothesis that consists of a specific distribution from which probabilities can be calculated. The reason we use $mu_1=mu_2$ as the null hypothesis has nothing to do with whether this is the "common" belief. It's used as the null hypothesis because if we hypothesize that the mean is a specific value, then given a particular set of data we can calculate the probability of seeing that data. We can't use $mu neq mu_2$ as our null hypothesis because there's no way to calculate p-values based simply on the hypothesis that the means aren't equal to a particular value. Consider the following problem:




The weights of apples have a standard deviation of 5 grams. The mean is not equal to 100. What is the probability of seeing an apple with a weight of 110 grams?




There's no way to answer that, because simply being told what the mean isn't is not enough to calculate probabilities.



Björn suggests testing the hypothesis that the difference in means is greater than some $delta_0$. How that would work is to take the null hypothesis as that the difference is equal to exactly $delta_0$. Then once you have the data, you can calculate the p-value given that $delta_0$. Call that $p_delta_0$. If the difference in sample means is less than $delta_0$, then the the p-value would have been even smaller than $p_delta_0$ if we had chosen $delta$ to be larger than $delta_0$. We reject the null if the p-value is less than $alpha$, so if we're rejecting under that null, that means that $p_delta_0 < alpha$. And since $p_delta<p_delta_0$ for any $delta>delta_0$, we can conclude that $p_delta<alpha$ for any $delta>delta_0$. Thus, we can not only reject this null of $delta_0$, but we can reject any null with a larger $delta$. It is only because of this ability to get an upper bound on p that we don't need a specific value for $delta$. If we just take "$delta$ is larger than zero" as our null hypothesis, without any lower bound for $delta$, then there is no upper bound for p, and so we cannot conclude that it is lower than $alpha$.






share|cite|improve this answer









$endgroup$








  • 1




    $begingroup$
    You seem quite confused about how two one-sided tests for equivalence work: "How that would work is to take the null hypothesis as that the difference is equal to exactly $delta_0$." is not remotely close to these procedures.
    $endgroup$
    – Alexis
    Jan 10 at 19:12










  • $begingroup$
    @Alexis That is the rigorous mathematical theoretical foundation of the process. Certainly, there are people doing statistics in the field that are not engaging in full rigor.
    $endgroup$
    – Acccumulation
    Jan 10 at 19:15













-1












-1








-1





$begingroup$


I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show.




That is not accurate explanation of the null hypothesis. The null hypothesis is simply a hypothesis that consists of a specific distribution from which probabilities can be calculated. The reason we use $mu_1=mu_2$ as the null hypothesis has nothing to do with whether this is the "common" belief. It's used as the null hypothesis because if we hypothesize that the mean is a specific value, then given a particular set of data we can calculate the probability of seeing that data. We can't use $mu neq mu_2$ as our null hypothesis because there's no way to calculate p-values based simply on the hypothesis that the means aren't equal to a particular value. Consider the following problem:




The weights of apples have a standard deviation of 5 grams. The mean is not equal to 100. What is the probability of seeing an apple with a weight of 110 grams?




There's no way to answer that, because simply being told what the mean isn't is not enough to calculate probabilities.



Björn suggests testing the hypothesis that the difference in means is greater than some $delta_0$. How that would work is to take the null hypothesis as that the difference is equal to exactly $delta_0$. Then once you have the data, you can calculate the p-value given that $delta_0$. Call that $p_delta_0$. If the difference in sample means is less than $delta_0$, then the the p-value would have been even smaller than $p_delta_0$ if we had chosen $delta$ to be larger than $delta_0$. We reject the null if the p-value is less than $alpha$, so if we're rejecting under that null, that means that $p_delta_0 < alpha$. And since $p_delta<p_delta_0$ for any $delta>delta_0$, we can conclude that $p_delta<alpha$ for any $delta>delta_0$. Thus, we can not only reject this null of $delta_0$, but we can reject any null with a larger $delta$. It is only because of this ability to get an upper bound on p that we don't need a specific value for $delta$. If we just take "$delta$ is larger than zero" as our null hypothesis, without any lower bound for $delta$, then there is no upper bound for p, and so we cannot conclude that it is lower than $alpha$.






share|cite|improve this answer









$endgroup$














answered Jan 10 at 19:05









Acccumulation







  • You seem quite confused about how two one-sided tests for equivalence work: "How that would work is to take the null hypothesis as that the difference is equal to exactly $\delta_0$." is not remotely close to these procedures.
    – Alexis
    Jan 10 at 19:12










  • @Alexis That is the rigorous mathematical theoretical foundation of the process. Certainly, there are people doing statistics in the field that are not engaging in full rigor.
    – Acccumulation
    Jan 10 at 19:15




























