What's a mean field variational family?
I'm working through variational Bayesian methods at the moment, and I think I have a grasp of the bigger picture. Where I sometimes have trouble is with the exact details of how they can be implemented. Right now, this centres on the idea of a mean-field variational family. Specifically, Blei et al. say the following:
In this review we focus on the mean-field variational family, where
the latent variables are mutually independent and each governed by a
distinct factor in the variational density. A generic member of the
mean-field variational family is
$$q(z) = \prod_{j=1}^{m} q_j(z_j)$$
I'm afraid that I can't see how a distribution can be expressed as a product in this way without being reduced to a constant. Clearly, I'm missing something fundamental, but I seem to be going around in circles trying to google the answer.
Can anyone supply some intuition?
probability computational-statistics variational-bayes
edited Feb 11 at 8:13 by Ferdi
asked Feb 10 at 19:46 by Lodore66
1 Answer
Loosely speaking, the mean-field family defines a specific class of joint distributions. Here $z$ is actually a vector of $m$ latent variables (which, in a Bayesian model, can include the parameters). That means that $q(z)$ describes a joint distribution over all of the individual $z_i$, and can be written as
$$q(z) = q(z_1, z_2, \ldots, z_m)$$
We can use the chain rule to factorize this:
$$q(z_1, z_2, \ldots, z_m) = q(z_1)\, q(z_2 \mid z_1) \cdots q(z_m \mid z_1, z_2, \ldots, z_{m-1})$$
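As a quick sanity check of the chain rule, here is a minimal sketch on a toy joint over two binary variables. The table `joint` and the helper names are made up for illustration; they are not from Blei et al.

```python
# Toy joint distribution q(z1, z2) over two binary variables.
# The probabilities are invented for illustration only.
joint = {
    (0, 0): 0.30, (0, 1): 0.10,
    (1, 0): 0.15, (1, 1): 0.45,
}

def marginal_z1(dist):
    """q(z1), obtained by summing the joint over z2."""
    out = {}
    for (z1, _z2), p in dist.items():
        out[z1] = out.get(z1, 0.0) + p
    return out

def conditional_z2_given_z1(dist):
    """q(z2 | z1) = q(z1, z2) / q(z1), stored under the key (z1, z2)."""
    q1 = marginal_z1(dist)
    return {(z1, z2): p / q1[z1] for (z1, z2), p in dist.items()}

q1 = marginal_z1(joint)
q2_given_1 = conditional_z2_given_z1(joint)

# Chain rule: q(z1, z2) = q(z1) * q(z2 | z1) for every configuration.
chain_rule_product = {
    key: q1[key[0]] * q2_given_1[key] for key in joint
}

for key in joint:
    print(key, joint[key], chain_rule_product[key])
```

The two printed columns agree up to floating-point rounding. Nothing has been approximated yet; the chain rule is an exact identity for any joint.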
Now, for this joint distribution to be in the mean-field family, we make a simplifying assumption: all of the $z_i$ are mutually independent. I'll note here that this independence holds only under the variational distribution $q$; the true joint $p(z_1, \ldots, z_m)$ is almost certainly going to have some dependence among the variables. In this sense, we are trading off accuracy (throwing away all dependence between the variables, including their covariances) for some computational benefits.
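To make that trade-off concrete, here is a small sketch that replaces a dependent toy joint with a fully factorized distribution and compares covariances. The table `joint` is invented, and the factorized distribution used here is simply the product of the joint's marginals. That is one convenient member of the mean-field family, not necessarily the member a variational inference algorithm would select.

```python
# Toy "true" joint over two binary variables (numbers are made up).
joint = {
    (0, 0): 0.30, (0, 1): 0.10,
    (1, 0): 0.15, (1, 1): 0.45,
}

def marginals(dist):
    """Return the two single-variable marginals of a joint over (z1, z2)."""
    q1, q2 = {}, {}
    for (z1, z2), p in dist.items():
        q1[z1] = q1.get(z1, 0.0) + p
        q2[z2] = q2.get(z2, 0.0) + p
    return q1, q2

def covariance(dist):
    """Cov(z1, z2) = E[z1 z2] - E[z1] E[z2] under the given distribution."""
    e1 = sum(z1 * p for (z1, _), p in dist.items())
    e2 = sum(z2 * p for (_, z2), p in dist.items())
    e12 = sum(z1 * z2 * p for (z1, z2), p in dist.items())
    return e12 - e1 * e2

q1, q2 = marginals(joint)

# A factorized distribution: q(z1, z2) = q1(z1) * q2(z2).
mean_field = {(z1, z2): q1[z1] * q2[z2] for z1 in q1 for z2 in q2}

print("sums to one:                ", sum(mean_field.values()))
print("covariance under joint:     ", covariance(joint))
print("covariance under mean field:", covariance(mean_field))
```

The factorized distribution is still a proper distribution over $(z_1, z_2)$ with the same marginals, but its covariance is zero (up to rounding), whereas the toy joint has a covariance of about 0.12.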
Now, if we make that independence assumption, every conditional $q(z_i \mid z_1, \ldots, z_{i-1})$ in the chain rule equals $q(z_i)$, so the joint reduces to
$$q(z) = q(z_1)\, q(z_2) \cdots q(z_m) = \prod_{i=1}^{m} q(z_i)$$
This is the form that the mean-field family takes. As for your question about how this won't reduce to a constant, I'm not entirely sure what you mean. Each $z_i$ is a random variable and each factor $q(z_i)$ is a density over it, so the product is a function of $z$, not a constant.
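A worked example may help here. Take $m = 2$ and, purely as an illustrative assumption, let both factors be standard normal densities. Then

$$q(z) = q(z_1)\, q(z_2) = \frac{1}{\sqrt{2\pi}} e^{-z_1^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-z_2^2/2} = \frac{1}{2\pi} \exp\left(-\frac{z_1^2 + z_2^2}{2}\right)$$

This is a function of $z = (z_1, z_2)$: it equals $1/(2\pi) \approx 0.159$ at $z = (0, 0)$ and $e^{-1}/(2\pi) \approx 0.059$ at $z = (1, 1)$, and it integrates to one over the plane. The product would be constant only if every factor were itself constant in its argument, for example uniform factors on bounded intervals.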
answered Feb 10 at 20:06 by snickerdoodles777
This is really helpful and has clarified things immensely. What was catching me out was where all the marginal probabilities went; explaining that this is an approximation that trades off accuracy for computability over the joint distribution makes it much more intuitive. Thanks indeed!
– Lodore66, Feb 10 at 21:16