Based on the ideas of Parameter Estimation and Fitting Probability Distributions, what stops us from making any function be a PDF(PMF)?
$begingroup$
I am currently working through an introduction to parameter estimation and fitting probability distributions to data sets. In brief, my understanding of the whole process is the following:
1) We collect a large amount of raw data, which comes from an underlying probability distribution. We then "graph" the data (perhaps in the form of a bar chart or something similar, at least in the 2D and 3D cases).
2) Observing this visual presentation, we go through our list of existing probability distributions and form an opinion on which distribution appears to fit the data most precisely.
3) We then take a large sample from this data and attempt to estimate the parameters of our chosen probability distribution by using the array of techniques at our disposal.
I have a few questions:
i) Is the outline above the procedure used to get parameter estimates?
ii) (more important) What stops us from making any function a probability distribution? What I mean is: we have this visual representation of the data, and perhaps none of the known probability distributions align with it. What stops us from just saying "this continuous function will now be a distribution, as long as it satisfies the necessary axioms"? Is there something more rigorous to this? (Perhaps I just haven't arrived there yet in my studies.)
distributions fitting estimators theory
$endgroup$
$begingroup$
To your last question: A probability distribution does not need to be one of the "named" distributions. Any nonnegative function that integrates to 1 can be used as a probability density. Maybe you should look into nonparametric methods.
$endgroup$
– kjetil b halvorsen
Feb 3 at 21:36
asked Feb 3 at 19:59
dc3rd
1 Answer
$begingroup$
> 1) We collect a large amount of raw data, which comes from an underlying probability distribution. We then "graph" the data (perhaps in the form of a bar chart or something similar, at least in the 2D and 3D cases).

If the data has more than two dimensions (usually the case), then you cannot graph it in full. You can graph only marginal distributions (of one or two variables at a time), not the joint distribution.
> 2) Observing this visual presentation, we go through our list of existing probability distributions and form an opinion on which distribution appears to fit the data most precisely.

No. First of all, as stated above, graphs don't tell you the whole story. Second, many distributions can look very similar. Third, there is no such thing as a "list of existing distributions". You can go through a list of popular distributions, but the list of all possible distributions is infinite: you can come up with your own distribution, and you can define mixtures of any number of any distributions (this alone makes the list infinite).
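To illustrate the point about mixtures, here is a minimal sketch in standard-library Python. The weights, the component parameters, and the helper names `normal_pdf`, `mixture_pdf`, and `integrate` are all made up for this illustration. It builds a two-component normal mixture and checks numerically that the result is nonnegative and integrates to about 1, so it is itself a valid density.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, components):
    """Weighted sum of normal densities; weights are nonnegative and sum to 1."""
    return sum(w * normal_pdf(x, mu, sigma)
               for w, (mu, sigma) in zip(weights, components))

def integrate(f, lo, hi, n=20000):
    """Trapezoidal rule on [lo, hi] with n equal sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Hypothetical mixture: 30% N(-2, 0.5^2) and 70% N(1, 1.5^2).
weights = [0.3, 0.7]
components = [(-2.0, 0.5), (1.0, 1.5)]

def density(x):
    return mixture_pdf(x, weights, components)

# The interval [-20, 20] covers essentially all of the probability mass.
area = integrate(density, -20.0, 20.0)
min_value = min(density(-20.0 + 0.01 * i) for i in range(4001))

print("area under the mixture density:", round(area, 6))
print("smallest value on the grid:", min_value)
```

Every different choice of weights or components gives another valid distribution, which is one way to see why no finite list can be complete.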
Usually, based on what you know about the data (plots, summary statistics, knowledge of what the data represents and how it was collected), you choose one distribution, or a few distributions, that make sense for this data. For example, if the data is a count of independent binary outcomes in a fixed number of trials, then you will most likely use the binomial distribution. To better understand when each distribution makes sense, you can check the Statistics 110 lectures by Joe Blitzstein.
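As a small illustration of the binomial case, the sketch below uses standard-library Python. The values `n_trials` and `p_true` and the simulated data are assumptions chosen for the example, not anything from the question. It simulates counts of successes in a fixed number of independent binary trials, estimates the success probability, and prints the empirical frequencies next to the fitted binomial PMF.

```python
import math
import random

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

random.seed(1)
n_trials = 10   # fixed number of binary trials per observation (assumed)
p_true = 0.3    # success probability, used only to simulate data (assumed)

# Each observation is the number of successes in n_trials independent trials.
counts = [sum(random.random() < p_true for _ in range(n_trials))
          for _ in range(2000)]

# Estimate p as the overall fraction of successes.
p_hat = sum(counts) / (len(counts) * n_trials)

# Fitted PMF over all possible counts 0..n_trials.
pmf = [binomial_pmf(k, n_trials, p_hat) for k in range(n_trials + 1)]

print("estimated p:", round(p_hat, 3))
for k in range(n_trials + 1):
    empirical = counts.count(k) / len(counts)
    print(k, round(empirical, 3), round(pmf[k], 3))
```

Here the distribution was chosen from how the data was generated (counts in a fixed number of trials), not from the shape of a plot.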
Moreover, even if you tried several different distributions, you would not choose among them based on how the data looks, but rather based on model fit statistics (see questions tagged model-selection).
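One commonly used fit statistic is the Akaike information criterion, AIC = 2k - 2 log L, where k is the number of estimated parameters and L is the maximized likelihood. The sketch below is only an illustration under assumptions: the data is simulated, the two candidate models (exponential and normal) are picked arbitrarily, and AIC is just one of several possible criteria.

```python
import math
import random

random.seed(0)
data = [random.expovariate(1.0) for _ in range(500)]  # simulated, assumed data
n = len(data)
mean = sum(data) / n

# Exponential model: one parameter (rate); its MLE is 1 / sample mean.
rate = 1.0 / mean
loglik_exp = n * math.log(rate) - rate * sum(data)

# Normal model: two parameters; the MLEs are the sample mean and the
# (biased) sample variance, which gives this closed-form maximized log-likelihood.
var = sum((x - mean) ** 2 for x in data) / n
loglik_norm = -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def aic(loglik, k):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * k - 2 * loglik

aic_exp = aic(loglik_exp, 1)
aic_norm = aic(loglik_norm, 2)

print("AIC exponential:", round(aic_exp, 1))
print("AIC normal:     ", round(aic_norm, 1))
```

The model with the lower AIC is preferred; the comparison uses the likelihood of the data under each fitted model rather than a visual impression.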
> 3) We then take a large sample from this data and attempt to estimate the parameters of our chosen probability distribution by using the array of techniques at our disposal.

Generally yes, if possible.
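As one example of such a technique, here is a hedged sketch of maximum likelihood estimation in standard-library Python. The exponential model, the simulated data, and the search bounds are assumptions for the illustration. It maximizes the log-likelihood numerically and compares the result with the known closed-form estimate (number of observations divided by their sum).

```python
import math
import random

random.seed(2)
data = [random.expovariate(2.0) for _ in range(200)]  # simulated, assumed data
n, total = len(data), sum(data)

def log_likelihood(rate):
    """Log-likelihood of an exponential model with the given rate."""
    return n * math.log(rate) - rate * total

# Ternary search for the maximum. This works here because the exponential
# log-likelihood is concave in the rate; the bounds are assumed to bracket it.
lo, hi = 1e-6, 50.0
for _ in range(200):
    m1 = lo + (hi - lo) / 3.0
    m2 = hi - (hi - lo) / 3.0
    if log_likelihood(m1) < log_likelihood(m2):
        lo = m1
    else:
        hi = m2

rate_numeric = 0.5 * (lo + hi)
rate_closed_form = n / total

print("numerical MLE:  ", round(rate_numeric, 4))
print("closed-form MLE:", round(rate_closed_form, 4))
```

The "if possible" matters: for many models there is no closed form, and the numerical search may be harder than in this one-parameter case.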
> ii) (more important) What stops us from making any function a probability distribution? What I mean is: we have this visual representation of the data, and perhaps none of the known probability distributions align with it. What stops us from just saying "this continuous function will now be a distribution, as long as it satisfies the necessary axioms"? Is there something more rigorous to this? (Perhaps I just haven't arrived there yet in my studies.)

If the function satisfies the mathematical definition of a probability density function or a probability mass function, then it is one. Usually, though, the goal is not to find the distribution that looks exactly like your data. If you wanted that, you would use the empirical distribution function or something like a kernel density estimate, which would look exactly like your data. In most cases we choose simpler distributions that look approximately like the empirical distribution. We use distributions to build simplified models of reality that can be extended beyond the data you collected. In other words, you don't want the distribution to overfit your data.
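To make the first point concrete, here is a minimal sketch in standard-library Python. The shape `g` and the helper names are invented for the example. It takes an arbitrary nonnegative function with a finite integral, divides it by that integral, and obtains a valid density from which probabilities can be computed.

```python
def g(x):
    """An arbitrary nonnegative shape on [0, 1], zero elsewhere (made up)."""
    return x - x**3 if 0.0 <= x <= 1.0 else 0.0

def integrate(f, lo, hi, n=10000):
    """Trapezoidal rule on [lo, hi] with n equal sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Normalizing constant; the exact integral of x - x^3 over [0, 1] is 1/4.
z = integrate(g, 0.0, 1.0)

def pdf(x):
    """g rescaled so that it integrates to 1."""
    return g(x) / z

def cdf(x):
    """P(X <= x) for x in [0, 1], by numerical integration of the density."""
    return integrate(pdf, 0.0, x)

print("normalizing constant:", round(z, 6))
print("P(X <= 0.5):", round(cdf(0.5), 6))
```

Being a valid density says nothing about whether it is a useful model. A function flexible enough to trace every bump in the sample will be a valid density too, but it will tend to overfit.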
For an example, see: What is meant by using a probability distribution to model the output data for a regression problem?
$endgroup$
$begingroup$
Thank you very much for this explanation. It really does provide me with the clarity that I was looking for.
$endgroup$
– dc3rd
Feb 3 at 21:45
answered Feb 3 at 21:38
Tim♦