Which corrections should I use? T-test for differences in means with different sample sizes and standard deviations
$begingroup$
I have two samples, coming from different populations.
One sample has 8,000 records, a mean of 5 and a sd of 0.5
The second has 1,500 records, a mean of 7 and a sd of 1.5
The distributions are close to normal.
This comes from the behaviour of two kinds of devices, and I want to understand whether the output of one is of higher quality than the other.
Can I apply a $t$-test here? What should I be cautious about, and which corrections or alternative tests are available?
statistical-significance t-test inference
$endgroup$
edited Feb 17 at 16:26 by StatsStudent
asked Feb 16 at 14:00 by Luis
2 Answers
$begingroup$
Assuming your samples are independent, then Welch's t-test does seem to be appropriate here, since it appears you have unequal variances (but you can formally test this too if you want through Levene's Test for Equality of Variances).
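As a minimal sketch of the Welch test, assuming the raw measurements are available as two numeric vectors: R's `t.test()` does not assume equal variances by default (`var.equal = FALSE`), so it runs Welch's test. The vectors `dev1` and `dev2` below are simulated stand-ins drawn to match the summary statistics in the question, not the real data.

```r
# Simulated stand-ins for the real measurements (assumption: the real
# data would be two numeric vectors of raw readings, one per device).
set.seed(42)
dev1 <- rnorm(8000, mean = 5, sd = 0.5)
dev2 <- rnorm(1500, mean = 7, sd = 1.5)

# var.equal = FALSE is the default; it is spelled out here for clarity.
welch <- t.test(dev1, dev2, var.equal = FALSE)

welch$method     # name of the test that was run
welch$statistic  # t statistic
welch$parameter  # Welch-Satterthwaite degrees of freedom
welch$conf.int   # 95% confidence interval for mean(dev1) - mean(dev2)
```

The degrees of freedom reported in `welch$parameter` are smaller than the pooled-variance value `n1 + n2 - 2`, which is how the test accounts for the unequal variances.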
That being said, since you have quite large samples from both device 1 and device 2, then you can appeal to the central limit theorem and use:
\begin{eqnarray*}
Z & = & \frac{\bar{X}-\bar{Y}}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\sim N(0,1)\\
\end{eqnarray*}
under the null hypothesis of equal means. Here, $\bar{X}$ and $\bar{Y}$ are the sample means from device 1 and device 2, respectively, and $s_i^2$ and $n_i$ are the sample variance and sample size from the $i$th device, $i=1,2$. Note that in large-sample inference, you don't need to concern yourself with unequal variances, because each variance is estimated separately rather than pooled.
Then a $100(1-\alpha)\%$ confidence interval for the difference in means (95% when $\alpha = 0.05$) would be given by:
\begin{eqnarray*}
\bar{X}-\bar{Y} & \pm & Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}
\end{eqnarray*}
where $Z_{\alpha/2}$ is the upper $\alpha/2$ point of the standard normal distribution.
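Plugging the summary statistics from the question into the statistic and interval above gives a quick check that needs no raw data. This is only a sketch; the variable names are illustrative.

```r
# Summary statistics from the question
n1 <- 8000; m1 <- 5; s1 <- 0.5   # device 1
n2 <- 1500; m2 <- 7; s2 <- 1.5   # device 2

se <- sqrt(s1^2 / n1 + s2^2 / n2)   # standard error of the difference in means
z  <- (m1 - m2) / se                # large-sample test statistic

alpha <- 0.05
# Interval for mean(device 1) - mean(device 2)
ci <- (m1 - m2) + c(-1, 1) * qnorm(1 - alpha / 2) * se

round(c(se = se, z = z), 3)
round(ci, 3)
```

With these numbers the standard error is about 0.039, $Z$ is about $-51$, and the 95% interval is roughly $(-2.08, -1.92)$, so the difference in means is estimated very precisely.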
All this being said, I wholeheartedly agree with the answer provided by Stefan. These sample sizes are really large and he's provided sound advice that you should follow. You should focus on what is an important practical difference. Is a 0.0001 mean difference between device 1 and device 2 important to you? Is it still important if device 1 costs three times as much as device 2?
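One way to make the practical-difference question concrete is sketched below. It assumes that a smallest practically important difference can be stated in advance; `min_important_diff` is a hypothetical value, not something given in the question. The sketch compares that threshold with the confidence bound nearest zero.

```r
# Summary statistics from the question
n1 <- 8000; m1 <- 5; s1 <- 0.5   # device 1
n2 <- 1500; m2 <- 7; s2 <- 1.5   # device 2

se     <- sqrt(s1^2 / n1 + s2^2 / n2)
margin <- qnorm(0.975) * se       # half-width of the 95% interval

# Hypothetical threshold: must come from domain knowledge, not from the data
min_important_diff <- 0.5

observed_diff      <- abs(m1 - m2)
bound_nearest_zero <- observed_diff - margin   # smallest difference the interval still supports

statistically_significant <- observed_diff > margin
practically_important     <- bound_nearest_zero > min_important_diff

margin
statistically_significant
practically_important
```

Here `margin` is about 0.077, so at these sample sizes any observed mean difference above roughly 0.08 would be flagged as significant at the 5% level. Whether a difference of that size matters is a separate, domain-specific judgment.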
$endgroup$
$begingroup$
With such a large sample size, even a slight difference between the two means will be declared significant. Instead, I would try to visualize your samples in different ways to learn more about the shape of the data.
Also, how do you define "higher quality"? Does it mean that the mean outcomes should be different? Or does it perhaps apply more to the variances of the samples, e.g. is less variation more desirable?
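If quality is about variation rather than the mean, the variances can be compared directly. The sketch below uses `var.test()` from base R, which is an F test and is sensitive to departures from normality, so it leans on the statement that the distributions are close to normal. The vectors `dev1` and `dev2` are simulated stand-ins, not the real data.

```r
# Simulated stand-ins matching the summary statistics in the question
set.seed(42)
dev1 <- rnorm(8000, mean = 5, sd = 0.5)
dev2 <- rnorm(1500, mean = 7, sd = 1.5)

# F test for the ratio of two variances: var(dev2) / var(dev1)
vt <- var.test(dev2, dev1)

vt$estimate   # sample variance ratio
vt$conf.int   # 95% confidence interval for the true variance ratio
```

A ratio well above 1 would indicate that the second device is noticeably more variable than the first.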
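Here are some ideas for visualizing the data using R:
    library(ggplot2)    # plotting
    library(gridExtra)  # grid.arrange() for stacking the plots
    set.seed(1)         # make the simulated data reproducible
    # Simulate two samples matching the summary statistics in the question
    d1 <- data.frame(Y = rnorm(8000, 5, 0.5), X = "A")
    d2 <- data.frame(Y = rnorm(1500, 7, 1.5), X = "B")
    d <- rbind(d1, d2)
    # Three views of the two distributions; colour distinguishes the groups
    p1 <- ggplot(d, aes(Y, group = X, colour = X)) + geom_density() + ggtitle("Density plot")
    p2 <- ggplot(d, aes(X, Y)) + geom_boxplot() + ggtitle("Boxplot")
    p3 <- ggplot(d, aes(X, Y)) + geom_violin() + ggtitle("Violin plot")
    grid.arrange(p1, p2, p3, ncol = 1)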
$endgroup$
First answer: answered Feb 16 at 15:13 by StatsStudent, edited Feb 16 at 15:37
Second answer: answered Feb 16 at 15:20 by Stefan, edited Feb 19 at 14:53