Two sample t-test to show equality of the two means

Given two (numeric) samples I would like to show that there is not a significant difference between the two means $\mu_1$ and $\mu_2$.
If my goal was to show a significant difference I would formulate the $t$-test as follows:
(1) $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$
I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show.
So then if my goal is to show that there is not a significant difference between both means, should I formulate the test like this?
(2) $H_0: \mu_1 \neq \mu_2$ vs $H_1: \mu_1 = \mu_2$
Or can I use the first test (1) and when I am not able to reject the null hypothesis say that there is not a significant difference?
hypothesis-testing t-test equivalence
The alternative hypothesis indicates what an extreme result might look like. The problem with your (2) formulation is that this would be a difference in means close to $0$; so if you took a commonly used significance level of $5\%$ then the power of the test (its ability to reject the null hypothesis when it is false) would never be above $5\%$ no matter how large the sample size. This is not good.
– Henry, Jan 9 at 15:07
"I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show." Then you were taught to commit to confirmation bias as a mode of scientific inquiry.
– Alexis, Jan 10 at 19:09
edited Jan 10 at 21:00 by Alexis
asked Jan 9 at 13:06 by cmplx96
2 Answers
You cannot use the first test in the way you describe, because failure to reject in the first test only says that you were unable to reject $H_0$, nothing more than that. It is like only being given the information that "the prosecutor was unable to provide the jury with enough evidence to secure a conviction": that does not tell you that the suspect is innocent.
The second test is not usable in practice, because no matter how much data you have, you cannot exclude the possibility of very small differences.
What you can do is to look at
$$H_0: |\mu_1 - \mu_2| > \delta \text{ vs } H_1: |\mu_1 - \mu_2| \leq \delta,$$
i.e. try to reject the null hypothesis that the absolute size of the difference is greater than some difference $\delta>0$. $\delta$ would be chosen e.g. so that any difference smaller than that is irrelevant for all (or your specific) practical purposes.
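The first point of this answer (failing to reject $H_0$ is not evidence of equality) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `rejection_rate` and all numbers are hypothetical, and for simplicity it uses a two-sided z-test with a known common standard deviation rather than the t-test from the question.

```python
import random
from statistics import NormalDist, mean

def rejection_rate(mu1, mu2, sigma, n, alpha=0.05, reps=2000, seed=0):
    """Share of simulated two-sided z-tests of H0: mu1 = mu2 that reject.

    Assumption: sigma is known and equal in both groups (z-test, not t-test).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * (2 / n) ** 0.5  # standard error of the difference in means
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(mu1, sigma) for _ in range(n)]
        y = [rng.gauss(mu2, sigma) for _ in range(n)]
        z = (mean(x) - mean(y)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / reps

# The true means differ by half a standard deviation, yet with only
# n = 5 observations per group most simulated tests fail to reject H0.
rate = rejection_rate(mu1=0.0, mu2=0.5, sigma=1.0, n=5)
print(f"share of tests rejecting H0: {rate:.3f}")
```

In this setup a non-rejection is the typical outcome even though the means really differ, so "not rejected" cannot be read as "equal".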
Thanks! How would I then go about computing the test statistic? t = (x1 - x2 - delta) / sqrt(s1^2/n1 + s2^2/n2) ?
– cmplx96, Jan 9 at 13:39
Not quite, you look in both directions, i.e. do two one-sided tests (en.wikipedia.org/wiki/Equivalence_test). Formulae for these are given e.g. here ncss.wpengine.netdna-cdn.com/wp-content/themes/ncss/pdf/….
– Björn, Jan 9 at 16:27
Note that we have a considerably informative tag on two one-sided tests here. That's what the [tost] tag is for. :)
– Alexis, Jan 10 at 19:08
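The two one-sided tests mentioned in these comments could be sketched as follows. This is a minimal illustration, not the formulae from the linked references: it assumes Welch's unequal-variance standard error and degrees of freedom, the helper names `t_cdf` and `tost_welch` and the sample data are made up for the example, and the t distribution is integrated numerically so that only the standard library is needed.

```python
import math
from statistics import mean, variance

def t_cdf(x, df, steps=2000):
    """Student t CDF via Simpson's rule on the density (steps must be even)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(t):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

    upper = abs(x)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def tost_welch(x, y, delta):
    """TOST p-value for H0: |mu_1 - mu_2| > delta (Welch version, an assumption)."""
    v1, v2 = variance(x) / len(x), variance(y) / len(y)
    diff, se = mean(x) - mean(y), math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(x) - 1) + v2 ** 2 / (len(y) - 1))
    p_lower = 1 - t_cdf((diff + delta) / se, df)  # H0: mu_1 - mu_2 <= -delta
    p_upper = t_cdf((diff - delta) / se, df)      # H0: mu_1 - mu_2 >= +delta
    return max(p_lower, p_upper)  # both one-sided nulls must be rejected

# Hypothetical data, chosen only to illustrate the calculation.
x = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7]
y = [10.0, 10.2, 9.9, 10.1, 9.8, 10.3, 10.1, 9.9]
p_wide = tost_welch(x, y, delta=0.5)     # generous equivalence margin
p_narrow = tost_welch(x, y, delta=0.05)  # very tight equivalence margin
print(p_wide, p_narrow)
```

Equivalence is claimed only when the larger of the two one-sided p-values is below $\alpha$. The conclusion depends heavily on the chosen margin, which is why $\delta$ should be fixed from practical considerations before looking at the data.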
"I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show."
That is not an accurate explanation of the null hypothesis. The null hypothesis is simply a hypothesis that consists of a specific distribution from which probabilities can be calculated. The reason we use $\mu_1=\mu_2$ as the null hypothesis has nothing to do with whether this is the "common" belief. It's used as the null hypothesis because if we hypothesize that the mean is a specific value, then given a particular set of data we can calculate the probability of seeing that data. We can't use $\mu_1 \neq \mu_2$ as our null hypothesis because there's no way to calculate p-values based simply on the hypothesis that the means aren't equal to a particular value. Consider the following problem:
The weights of apples have a standard deviation of 5 grams. The mean is not equal to 100. What is the probability of seeing an apple with a weight of 110 grams?
There's no way to answer that, because simply being told what the mean isn't is not enough to calculate probabilities.
Björn suggests testing the hypothesis that the difference in means is greater than some $\delta_0$. How that would work is to take the null hypothesis as that the difference is equal to exactly $\delta_0$. Then once you have the data, you can calculate the p-value given that $\delta_0$. Call that $p_{\delta_0}$. If the difference in sample means is less than $\delta_0$, then the p-value would have been even smaller than $p_{\delta_0}$ if we had chosen $\delta$ to be larger than $\delta_0$. We reject the null if the p-value is less than $\alpha$, so if we're rejecting under that null, that means that $p_{\delta_0} < \alpha$. And since $p_\delta < p_{\delta_0}$ for any $\delta>\delta_0$, we can conclude that $p_\delta<\alpha$ for any $\delta>\delta_0$. Thus, we can not only reject this null of $\delta_0$, but we can reject any null with a larger $\delta$. It is only because of this ability to get an upper bound on p that we don't need a specific value for $\delta$. If we just take "$\delta$ is larger than zero" as our null hypothesis, without any lower bound for $\delta$, then there is no upper bound for p, and so we cannot conclude that it is lower than $\alpha$.
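The monotonicity argument in this answer can be checked numerically. The sketch below is an illustration under assumptions, not the answerer's own calculation: it uses a normal approximation for the one-sided p-value, and the observed difference, standard error and list of margins are invented.

```python
from statistics import NormalDist

def p_upper(diff, se, delta):
    """One-sided p-value for H0: mu_1 - mu_2 = delta against 'less than delta'.

    Assumption: normal approximation to the distribution of the estimate.
    """
    return NormalDist().cdf((diff - delta) / se)

diff, se = 0.2, 0.5            # hypothetical observed difference and standard error
deltas = [1.2, 1.5, 2.0, 3.0]  # delta_0 = 1.2 followed by larger margins
p_values = [p_upper(diff, se, d) for d in deltas]

for d, p in zip(deltas, p_values):
    print(f"delta = {d:.1f}: p = {p:.5f}")
```

With these numbers the p-value at $\delta_0 = 1.2$ is already below $0.05$ and shrinks further as $\delta$ grows, so rejecting at $\delta_0$ implies rejecting for every larger margin. As $\delta$ approaches the observed difference the p-value instead rises towards $0.5$, which is the sense in which a null without a lower bound on $\delta$ gives no usable upper bound on p.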
You seem quite confused about how two one-sided tests for equivalence work: "How that would work is to take the null hypothesis as that the difference is equal to exactly $\delta_0$." is not remotely close to these procedures.
– Alexis, Jan 10 at 19:12
@Alexis That is the rigorous mathematical theoretical foundation of the process. Certainly, there are people doing statistics in the field that are not engaging in full rigor.
– Acccumulation, Jan 10 at 19:15
answered Jan 9 at 13:21 by Björn
answered Jan 10 at 19:05 by Acccumulation
1
$begingroup$
You seem quite confused about how two one-sided tests for equivalence work: "How that would work is to take the null hypothesis as that the difference is equal to exactly $delta_0$." is not remotely close to these procedures.
$endgroup$
– Alexis
Jan 10 at 19:12
$begingroup$
@Alexis That is the rigorous mathematical theoretical foundation of the process. Certainly, there are people doing statistics in the field that are not engaging in full rigor.
$endgroup$
– Acccumulation
Jan 10 at 19:15
$begingroup$
"I learned in school that the null hypothesis should always represent the "common" belief and the alternative hypothesis should represent the change that I would like to show." Then you were taught to commit to confirmation bias as a mode of scientific inquiry.
$endgroup$
– Alexis
Jan 10 at 19:09