Can I use moments of a distribution to sample the distribution?
I notice that in statistics/machine learning methods, a distribution is often approximated by a Gaussian: one computes the first two moments, uses those to estimate $\mu$ and $\sigma^2$, and then samples from that Gaussian.
It seems to me the more moments I calculate, the better I ought to be able to approximate the distribution I wish to sample.
What if I calculate three moments? How can I use those to sample from the distribution? And can this be extended to $N$ moments?
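For concreteness, here is a minimal sketch of the two-moment procedure I'm describing (Python/NumPy, with a made-up exponential sample standing in for real observations):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=10_000)  # stand-in for some observed sample

# Estimate the first two moments (n-denominator variance) ...
mu_hat = data.mean()
sigma2_hat = data.var()

# ... and sample from the moment-matched Gaussian.
approx_samples = rng.normal(mu_hat, np.sqrt(sigma2_hat), size=10_000)
```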
Tags: probability, sampling, moments
asked 3 hours ago by curious_dan (new contributor); edited 2 hours ago by Ferdi
Three moments don't determine a distributional form*; if you choose a distribution family with three parameters which relate to the first three population moments, you can do moment matching ("method of moments") to estimate the three parameters and then generate values from such a distribution. There are many such distributions. [* Indeed, sometimes even having all the moments isn't sufficient to determine a distribution.] – Glen_b♦, 3 hours ago
Thanks, @Glen_b! I will read about the "method of moments" to understand when it's possible. Can you point me to theory that describes when the moments are not sufficient to determine the distribution? – curious_dan, 3 hours ago
"Method of moments" just tells you how to estimate parameters from moments. The remainder of your comment is a new question (I think it's already answered on site); briefly: if the moment generating function exists (in a neighborhood of 0) then it uniquely identifies a distribution (technically, you could in principle do an inverse Laplace transform). Certainly if some moments are not finite this would mean the mgf doesn't exist, but there are also cases where all moments are finite but the mgf still doesn't exist in a neighborhood of 0. – Glen_b♦, 3 hours ago
Thank you for these leads, @Glen_b! – curious_dan, 2 hours ago
I'm writing an answer based on my comment. – Glen_b♦, 2 hours ago
2 Answers
Answer 1 (score 3), answered 2 hours ago and edited 1 hour ago by Glen_b♦:
Three moments don't determine a distributional form; if you choose a distribution family with three parameters which relate to the first three population moments, you can do moment matching ("method of moments") to estimate the three parameters and then generate values from such a distribution. There are many such distributions.
Sometimes even having all the moments isn't sufficient to determine a distribution. If the moment generating function exists (in a neighborhood of 0) then it uniquely identifies a distribution (you could in principle do an inverse Laplace transform to obtain it).
[If some moments are not finite, the mgf doesn't exist, but there are also cases where all moments are finite yet the mgf still doesn't exist in a neighborhood of 0.]
How the process would work for a specific choice of distribution
We can simplify the process of obtaining a distribution matching three moments by ignoring the mean and variance and working with a scaled third moment -- the moment-skewness ($\gamma_1=\mu_3/\mu_2^{3/2}$).
We can do this because, having selected a distribution with the relevant skewness, we can then back out the desired mean and variance by scaling and shifting.
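As a sketch (in Python here, although my data is in an R session), the sample moment-skewness can be computed directly from its definition, for a data array `x`:

```python
import numpy as np

def moment_skewness(x):
    """Sample moment-skewness g1 = m3 / m2^(3/2), using n-denominator moments."""
    d = x - x.mean()
    m2 = np.mean(d**2)  # second central moment
    m3 = np.mean(d**3)  # third central moment
    return m3 / m2**1.5
```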
Let's consider an example. Yesterday I created a large data set (which still happens to be in my R session) whose distribution I haven't tried to calculate the functional form of (it's a large set of values of the log of the sample variance of a Cauchy at n=10). We have the first three raw moments as
1.519, 3.597 and 11.479 respectively, or correspondingly a mean of 1.518, a standard deviation* of 1.136 and a skewness of 1.429 (so these are sample values from a large sample).
Formally, method of moments would attempt to match the raw moments, but the calculation is simpler if we start with the skewness (turning solving three equations in three unknowns into solving for one parameter at a time, a much simpler task).
* I am going to handwave away the distinction between using an n-denominator on the variance - as would correspond to formal method of moments - and an n-1 denominator and simply use sample calculations.
This skewness (~1.43) indicates we seek a distribution which is right-skew. I could choose, for example, a shifted lognormal distribution (three-parameter lognormal: shape $\sigma$, scale $\mu$ and location-shift $\gamma$) with the same moments. Let's begin by matching the skewness. The population skewness of a two-parameter lognormal is
$$\gamma_1=\left(e^{\sigma^2}+2\right)\sqrt{e^{\sigma^2}-1}\,.$$
So let's start by equating that to the desired sample value to obtain an estimate of $\sigma^2$, $\tilde{\sigma}^2$ say.
Note that $\gamma_1^2$ is $(\tau+2)^2(\tau-1)$ where $\tau=e^{\sigma^2}$. This then yields a simple cubic equation, $\tau^3+3\tau^2-4=\gamma_1^2$. Using the sample skewness in that equation yields $\tilde{\tau}\approx 1.1995$, i.e. $\tilde{\sigma}^2\approx 0.1819$.
We can then in turn solve for $\mu$ by matching the variance (or standard deviation), and then for the location parameter by matching the mean.
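Putting those steps together, here is a sketch of the whole matching-and-sampling procedure (Python/NumPy rather than my R session; the three sample values are the ones quoted above, and the lognormal mean/variance formulas are the standard ones):

```python
import numpy as np

m, s, g1 = 1.518, 1.136, 1.429  # sample mean, sd, skewness from above

# Solve the cubic tau^3 + 3*tau^2 - 4 = g1^2 for its (single) real root.
roots = np.roots([1, 3, 0, -4 - g1**2])
tau = max(r.real for r in roots if abs(r.imag) < 1e-9)  # ~1.1995

sigma2 = np.log(tau)  # tau = exp(sigma^2)  ->  sigma2 ~ 0.1819

# Match the variance: Var = (tau - 1) * tau * exp(2*mu), so solve for mu (~0.84).
mu = 0.5 * np.log(s**2 / (tau * (tau - 1)))

# Match the mean: E[X] = shift + exp(mu + sigma2/2), so solve for the shift (~ -1.03).
shift = m - np.exp(mu + sigma2 / 2)

rng = np.random.default_rng(1)
samples = shift + rng.lognormal(mu, np.sqrt(sigma2), size=100_000)
# samples should now have mean ~1.518, sd ~1.136, skewness ~1.429
```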
But we could as easily have chosen a shifted-gamma or a shifted-Weibull distribution (or a shifted-F or any number of other choices) and run through essentially the same process. Each of them would be different.
(For the sample I was dealing with, a shifted gamma would probably have been a considerably better choice than a shifted lognormal, since the distribution of the logs of the values was left skew and the distribution of their cube root was very close to symmetric; these are consistent with what you see with unshifted gamma densities.)
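For comparison, a sketch of the shifted-gamma version of the same matching (using the standard gamma moments: skewness $2/\sqrt{k}$, sd $\sqrt{k}\,\theta$ and mean $k\theta$ for shape $k$ and scale $\theta$):

```python
import numpy as np

m, s, g1 = 1.518, 1.136, 1.429  # same three sample values as before

k = (2 / g1) ** 2       # match skewness 2/sqrt(k)  ->  k ~ 1.96
theta = s / np.sqrt(k)  # match sd sqrt(k)*theta
shift = m - k * theta   # match mean k*theta

rng = np.random.default_rng(2)
samples = shift + rng.gamma(k, theta, size=100_000)
```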
One could even take the skewness-kurtosis diagram in a Pearson plot and draw a line at the desired skewness, and thereby obtain a two-point distribution, a sequence of beta distributions, a gamma distribution, a sequence of beta-prime distributions, an inverse-gamma distribution and a sequence of Pearson type IV distributions, all with the same skewness.
We can see this illustrated in a skewness-kurtosis plot (Pearson plot) below (note that $\beta_1=\gamma_1^2$ and $\beta_2$ is the kurtosis), with the regions for the various Pearson distributions marked in.
[Figure: skewness-kurtosis (Pearson) plot with the Pearson-family regions and a green horizontal line at $\beta_1 = 2.042$.]
The green horizontal line represents $\gamma_1^2 = 2.042$, and we see it pass through each of the mentioned distribution families, each corresponding to a different population kurtosis. (The dashed curve represents the lognormal, which is not a Pearson-family distribution.)
Answer 2 (score 1), answered 1 hour ago by Aksakal:
So, the answer is generally no, you can't do this, but sometimes you can.
When you can't
The reasons you usually can't do this are twofold.
First, if you have $N$ observations, then you can calculate at most $N$ moments. What about the other moments? You can't simply set them to zero.
Second, calculations of higher moments become less and less precise, because you have to raise the numbers to higher powers. Consider the 100th non-central moment; you usually can't calculate it with any precision:
$$\gamma_{100}=\frac{1}{n}\sum_i x_i^{100}$$
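A quick sketch illustrating that instability (in Python/NumPy): two independent samples of the same size from the same distribution give estimates of the 100th raw moment that typically differ by orders of magnitude, because the estimate is dominated by the few most extreme draws.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two independent samples from the same standard normal distribution.
estimates = [np.mean(rng.standard_normal(n) ** 100) for _ in range(2)]
print(estimates)  # the two estimates typically disagree by orders of magnitude
```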
When you can
Now, sometimes you can get the distribution from moments. It's when you make an assumption of some sort about the distribution: for instance, you declare that it's normal. In that case, all you need is two moments, which can usually be calculated with decent precision. Note that the normal distribution does have higher moments too, e.g. kurtosis, but we don't need them. If you were to calculate all the moments of the distribution (without assuming it's normal), then tried to recover the characteristic function in order to sample from the distribution, it wouldn't work. However, once you forget about the higher moments and stick to the first two, it does work.