Confusion in modelling finite mixture model

From the book "Machine Learning: A Probabilistic Perspective", I'm reading about finite/infinite mixture models. In particular, Section 25.2.1 states:

    The usual representation (of a finite mixture model) is as follows:

    $p(x_i \mid z_i = k, \boldsymbol{\theta}) = p(x_i \mid \boldsymbol{\theta}_k)$

    $p(z_i = k \mid \boldsymbol{\pi}) = \pi_k$

    $p(\boldsymbol{\pi} \mid \alpha) = \text{Dir}(\boldsymbol{\pi} \mid (\alpha/K)\boldsymbol{1}_K)$

    The form of $p(\boldsymbol{\theta}_k \mid \lambda)$ is chosen to be conjugate to $p(x_i \mid \boldsymbol{\theta}_k)$. We can write $p(x_i \mid \boldsymbol{\theta}_k)$ as $\boldsymbol{x}_i \sim F(\boldsymbol{\theta}_{z_i})$, where $F$ is the observation distribution. Similarly, we can write $\boldsymbol{\theta}_k \sim H(\lambda)$, where $H$ is the prior.

This modelling is quite confusing to me. What is the difference between $\boldsymbol{\theta}_k$ and $\boldsymbol{\theta}_{z_i}$? What is meant by "observation distribution"? Can we apply the EM algorithm to this model, and if so, how?
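To make the hierarchy concrete, here is a small simulation of the generative process described above (my own sketch; the specific choices of $F$ and $H$ below, a unit-variance Gaussian observation distribution and a Gaussian prior on the component means, are illustrative assumptions, since the book leaves both generic):

    import numpy as np

    rng = np.random.default_rng(0)
    K, N, alpha, lam = 3, 500, 1.0, 10.0

    pi = rng.dirichlet(np.full(K, alpha / K))   # pi | alpha ~ Dir((alpha/K) 1_K)
    theta = rng.normal(0.0, lam, size=K)        # theta_k ~ H(lambda), here a N(0, lam^2) prior (assumed)
    z = rng.choice(K, size=N, p=pi)             # z_i | pi ~ Cat(pi): latent assignments
    x = rng.normal(theta[z], 1.0)               # x_i | z_i ~ F(theta_{z_i}), here N(theta_{z_i}, 1) (assumed)
    # theta[z] picks out, for each i, the parameter of the component that observation i
    # belongs to, which is exactly what the subscript z_i in theta_{z_i} means.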










Tags: machine-learning, probability, unsupervised-learning, mixture, finite-mixture-model






asked Jan 10 at 10:36 by Tommaso Bendinelli; edited Jan 10 at 12:14 by Xi'an




















          2 Answers


















If you consider a mixture model
$$X \sim \sum_k \pi_k \, p(x \mid \theta_k),$$
it can be expressed as the marginal of the joint model
$$(X, Z) \sim \underbrace{\pi_z}_{p_\pi(z)} \times \underbrace{p(x \mid \theta_z)}_{p_\theta(x \mid z)},$$
where $Z$ is an integer-valued random variable. This implies that
$$X \mid Z = k \sim p(x \mid \theta_k),$$
which can also be written as
$$X \mid Z \sim p(x \mid \theta_Z).$$
The random variable $Z$ is also latent in that (a) it is not observed and (b) it does not necessarily "exist" in the original experiment modelled by the mixture. But the EM algorithm that returns the MLE of the parameters $\pi_k$ and $\theta_k$ takes advantage of this joint representation by iteratively

1. calculating $\mathbb{E}[Z_i \mid X_i, \theta, \pi]$, and

2. maximising the expected log-likelihood in $(\theta, \pi)$
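As an illustrative sketch (not part of the original answer), here are those two steps specialised to a hypothetical one-dimensional Gaussian mixture with known unit variances:

    import numpy as np

    def em_gaussian_mixture(x, K, n_iter=100, seed=0):
        """Minimal EM for a 1-D mixture of K unit-variance Gaussians."""
        rng = np.random.default_rng(seed)
        pi = np.full(K, 1.0 / K)          # initial mixture weights
        theta = rng.choice(x, size=K)     # initialise means at random data points
        for _ in range(n_iter):
            # E-step: responsibilities r[i, k] = E[1{Z_i = k} | X_i, theta, pi],
            # the posterior probability that observation i came from component k.
            log_p = np.log(pi) - 0.5 * (x[:, None] - theta[None, :]) ** 2
            log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
            r = np.exp(log_p)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: maximise the expected complete-data log-likelihood in (theta, pi).
            Nk = r.sum(axis=0)
            pi = Nk / len(x)
            theta = (r * x[:, None]).sum(axis=0) / Nk
        return pi, theta

    # Example usage on data x drawn from a 3-component mixture:
    # pi_hat, theta_hat = em_gaussian_mixture(x, K=3)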





answered Jan 10 at 12:19, edited Jan 10 at 12:35, by Xi'an












• Thank you! And what is the difference between $\boldsymbol{\theta}_k$ and $\boldsymbol{\theta}_{z_i}$? And what is meant by "We can write $p(x_i \mid \boldsymbol{\theta}_k)$ as $\boldsymbol{x}_i \sim F(\boldsymbol{\theta}_{z_i})$, where $F$ is the observation distribution"? – Tommaso Bendinelli, Jan 10 at 12:23

• If $X \mid Z = k \sim p(x \mid \theta_k)$ and $X \mid Z \sim p(x \mid \theta_Z)$ are identical, because both represent the random variable $X \mid Z$, why is the parameter written $\theta_k$ in one case and $\theta_Z$ in the other? – Tommaso Bendinelli, Jan 10 at 12:42

• I don't get both your points: (i) where have you seen $X \mid Z = Z \mid X$? (ii) What do you mean by "is indexed"? – Tommaso Bendinelli, Jan 10 at 12:49

• Also, do we know the distribution $F(\theta_{z_i})$? – Tommaso Bendinelli, Jan 10 at 12:54



















    What is the difference between $\theta_k$ and $\theta_{z_i}$?

The only difference is the subscript. The value $\theta_{z_i}$ refers to the value $\theta_k$ where $k = z_i$. So the part that says $\boldsymbol{x} \sim F(\boldsymbol{\theta}_{z_i})$ simply means that $\boldsymbol{x} \sim F(\boldsymbol{\theta}_k)$, where the subscript $k$ is the latent random variable $z_i$.
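In code terms (an illustrative analogy, not from the original answer), the subscript $z_i$ is just array indexing by the latent label:

    import numpy as np

    theta = np.array([-2.0, 0.0, 3.0])   # component parameters theta_1, theta_2, theta_3
    z = np.array([2, 0, 1, 2])           # latent labels z_i for four observations
    theta_z = theta[z]                   # theta_{z_i} for each i: array([ 3., -2.,  0.,  3.])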






answered Jan 10 at 13:20 by Ben












• Thanks for answering. Do we know the distribution $F(\boldsymbol{\theta}_k)$, or is it unknown? Intuitively I would think that we don't know it, because if we did, there would be no point in modelling a Gaussian mixture, right? – Tommaso Bendinelli, Jan 10 at 13:29

• It is not specified in the problem, since this is a general description of the finite mixture model, which can take on all sorts of distributions. Usually it would be a distribution with an assumed parametric form, but with unknown parameters. – Ben, Jan 10 at 13:31

• Okay, so just to be sure I understood: $\theta_{z_i}$ is a random variable, because $z_i$ is a random variable, while $\theta_k$, once sampled from $H$, is a value. Correct? – Tommaso Bendinelli, Jan 10 at 13:33

• Yes, but in a mixture model you will usually only observe the $x$s. The rest are unobserved "latent variables", which are effectively parameter values that you infer via Bayes' theorem. – Ben, Jan 10 at 13:36

• Please have a look here: stats.stackexchange.com/questions/386681/… – Tommaso Bendinelli, Jan 11 at 9:48










