How to know for sure whether we can learn from a given dataset or not?
4 votes
Given a set of data and a target, how can we know for sure whether we can learn from that data to make inferences?
machine-learning neural-network deep-learning data learning
edited Sep 3 at 10:14
Media
asked Sep 3 at 8:27
Pranav Pandey
4 Answers
6 votes, accepted
"how can we know for sure"
We can't.
A toy example shows why even humans cannot do this for sure:
Suppose you are given the numbers 2, 4, 8, 16, 32, ?? and want to extrapolate to the next number ??. A natural continuation of the series would be 64, but we cannot take this for granted. The next number could just as well be 0. You cannot be sure.
Given only the data, and without additional assumptions about what you expect to see, you cannot learn a correct model per se. You always have to be critical about your data.
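To make the toy example concrete, here is a minimal Python sketch. It is not part of the original answer, and the function names `doubling` and `contrarian` are made up for illustration. Both functions reproduce 2, 4, 8, 16, 32 exactly at positions 1 to 5, yet they disagree about position 6.

```python
from fractions import Fraction
from math import prod

# The observed values at positions n = 1..5.
observed = [2, 4, 8, 16, 32]

def doubling(n):
    """The 'natural' rule: each term is twice the previous one."""
    return 2 ** n

def contrarian(n):
    """A rule that matches the same five values but continues with 0.

    The extra term is zero at n = 1..5 (one factor vanishes) and equals
    120 at n = 6, so scaling it by 64/120 cancels the 64 there.
    """
    bump = prod(n - k for k in range(1, 6))
    return 2 ** n - Fraction(64, 120) * bump

print([doubling(n) for n in range(1, 6)])         # [2, 4, 8, 16, 32]
print([int(contrarian(n)) for n in range(1, 6)])  # [2, 4, 8, 16, 32]
print(doubling(6))                                # 64
print(contrarian(6))                              # 0
```

Both rules fit the observed data perfectly, so the data alone cannot tell you which one is right. Only an extra assumption, such as "prefer the simpler rule", breaks the tie.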
answered Sep 3 at 8:54
André
I like to troll puzzles with such "guess the next number" questions by answering with [unreadable symbol] and providing a function which really does that.
– vsz
Sep 3 at 20:47
3 votes
In addition to the other answers, I want to add one more explanation. Basically, what ML approaches do is approximate a mapping from inputs to outputs. This function usually needs to be well-behaved; otherwise you need a very large amount of data for your model to learn it in the current feature space. More specifically, you should examine the distribution of your training data in your current feature space. For classification tasks, this lets you investigate how much the distributions of your different labels overlap. From that, you can figure out the best performance that any ML approach can reach: the distribution of your data in the current feature space determines the Bayes error, i.e. the lowest error rate any classifier can achieve on it.
If you find that the Bayes error is large, then you can be sure that your target cannot be learned well in the current feature space, and you have to change the features.
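As a rough illustration of the overlap idea, assume (purely for the sake of this sketch) a single one-dimensional feature, two equally likely classes, and Gaussian class distributions with the same standard deviation. Under those assumptions the Bayes error has a closed form, and a simulation of the optimal threshold rule should land close to it. The function names are hypothetical. Real data rarely comes with known distributions, so in practice the Bayes error can only be estimated.

```python
import math
import random

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bayes_error_two_gaussians(mu0, mu1, sigma):
    """Bayes error for two equally likely classes N(mu0, sigma) and N(mu1, sigma).

    The optimal rule thresholds at the midpoint of the means, so each class
    is misclassified with probability Phi(-|mu1 - mu0| / (2 * sigma)).
    """
    return normal_cdf(-abs(mu1 - mu0) / (2.0 * sigma))

def simulated_error(mu0, mu1, sigma, n=20000, seed=0):
    """Error rate of the midpoint rule on simulated data from the same model."""
    rng = random.Random(seed)
    midpoint = (mu0 + mu1) / 2.0
    errors = 0
    for _ in range(n):
        label = rng.randrange(2)
        x = rng.gauss(mu1 if label else mu0, sigma)
        # Predict the class whose mean lies on the same side of the midpoint as x.
        predicted = int((x > midpoint) == (mu1 > mu0))
        errors += predicted != label
    return errors / n

for gap in (0.0, 1.0, 2.0, 4.0):
    exact = bayes_error_two_gaussians(0.0, gap, 1.0)
    approx = simulated_error(0.0, gap, 1.0)
    print(f"gap={gap}: Bayes error={exact:.3f}, simulated={approx:.3f}")
```

With a gap of 0 the classes overlap completely and no classifier can beat coin flipping on this feature. As the gap grows, the overlap and the achievable error both shrink.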
answered Sep 3 at 10:13
Media
2 votes
The current standard is essentially:
Given this input data, can any other system or approach classify it or estimate a quantity of interest? If so, then a machine learning approach may be able to achieve the same.
This is basically how machine learning challenges in computer perception can be treated as tractable. We have humans and other animals as working models, and we assume that the process can be automated. A similar approach can be applied to any machine learning system which attempts to re-create the behaviour of an expert: provided we use the exact same input data, and enough of it, the ML system can learn what the expert does through statistical approximation.
The "expert" can be a statistician or data scientist looking at the data, using any tool. Exploratory plots of features and measures of correlation are a good way to assess whether a data set might be amenable to training an ML model for prediction. If you can visually separate classes on a scatter plot using some combination of features, then it is likely that a suitable ML model will be able to separate those classes too.
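A minimal sketch of the "measures of correlation" step, using plain Python and made-up toy numbers (the data and the helper name `pearson` are illustrative assumptions, not anything from a real data set). It also shows one limit of this check: a linear correlation of zero does not prove that there is nothing to learn.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)

# Toy case 1: the target is a linear function of the feature.
feature = [1, 2, 3, 4, 5]
target = [2 * x for x in feature]
print(pearson(feature, target))      # 1.0

# Toy case 2: the target is fully determined by the feature (y = x^2),
# but the relationship is symmetric, so the linear correlation vanishes.
symmetric = [-2, -1, 0, 1, 2]
squared = [x * x for x in symmetric]
print(pearson(symmetric, squared))   # 0.0
```

A strong correlation is encouraging evidence. A weak one only rules out a linear relationship, which is why scatter plots and non-linear models are still worth a look.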
There are hard cases, where on the surface there seems to be no pattern. Perhaps a relationship could be teased out and shown to exist with statistical analysis, but you could skip that and directly throw some non-linear ML model at the problem in the hope that it finds the relationship for you with the correct hyper-parameters. Of course you don't know in advance whether that is a worthwhile approach, and this carries some risks. But it is not that expensive to do once you have some data: just throw a fairly robust non-linear model at the problem, like XGBoost, and see what happens.
Of course, ML is not magic. If there is nothing to find, it will tend to find nothing. Worse than that, it can find spurious correlations, or patterns due to prejudice inherent in the data collection or labelling. Those issues are a problem regardless of any evidence on whether it was theoretically possible to achieve a result at all. Even so, the kind of thinking that drives "let's throw some neural networks at this" has led to some published works which are quite terrifying and wrong on many levels. An example of such a system was an NN which classified a person as criminal or not according to a picture of their face. Luckily, flaws were pointed out in the data collection on that one, but the original story made headline news in many places, despite essentially being a modern re-birth of phrenology.
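One way to put "see what happens" on firmer ground is a permutation test: score the model on the real labels, then on many shuffled copies of the labels, and check how often the shuffled score is at least as good. The sketch below is only an illustration under stated assumptions. A tiny 1-nearest-neighbour classifier stands in for XGBoost so that the code has no dependencies, the data is synthetic, and all function names are hypothetical.

```python
import random

def loo_accuracy(xs, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (1-D feature)."""
    hits = 0
    for i, x in enumerate(xs):
        others = (j for j in range(len(xs)) if j != i)
        nearest = min(others, key=lambda j: abs(xs[j] - x))
        hits += labels[nearest] == labels[i]
    return hits / len(xs)

def permutation_test(xs, labels, n_permutations=200, seed=0):
    """Return (observed accuracy, p-value) against shuffled-label baselines."""
    rng = random.Random(seed)
    observed = loo_accuracy(xs, labels)
    shuffled = list(labels)
    as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if loo_accuracy(xs, shuffled) >= observed:
            as_good += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (as_good + 1) / (n_permutations + 1)

def make_data(separation, n_per_class=20, seed=1):
    """Two Gaussian classes whose means differ by `separation`."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
    xs += [rng.gauss(separation, 1.0) for _ in range(n_per_class)]
    labels = [0] * n_per_class + [1] * n_per_class
    return xs, labels

for name, separation in [("signal", 4.0), ("no signal", 0.0)]:
    xs, labels = make_data(separation)
    accuracy, p_value = permutation_test(xs, labels)
    print(f"{name}: accuracy={accuracy:.2f}, p-value={p_value:.3f}")
```

A small p-value says the score is unlikely to be a fluke of this particular sample. It does not protect against the other problem raised above: if the labels themselves encode a collection bias, the model will happily and "significantly" learn that bias.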
answered Sep 3 at 9:55
Neil Slater
Thanks, Neil, for your detailed explanation. :)
– Pranav Pandey
Sep 3 at 11:22
2 votes
One cannot evaluate ML models using a deterministic approach. ML models do not simply follow if-else statements, where one can verify whether each branch produces the correct outcome or not. The majority of ML algorithms take a probabilistic approach and predict the most probable, or nearest, class.
In addition, the boundary between different classes is not always simple and linear; in the majority of cases the boundary that separates the data points is a higher-order, non-linear function.
Often, noisy data makes the separating boundary more complex and degrades the performance of the model. The bias-variance trade-off is an important concept to understand in order to make the model work as intended.
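The bias-variance trade-off can be seen in a small experiment. The sketch below uses k-nearest-neighbour regression on noisy samples of sin(x); the choice of model, target function, noise level, and values of k are all assumptions made for illustration. A very small k chases the noise (high variance), a very large k averages everything away (high bias), and something in between usually does best on unseen data.

```python
import math
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, points, k):
    """Mean squared error of the k-NN predictions on `points`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)

def sample(n, rng, noise=0.3):
    """Noisy observations of sin(x) on [0, 2*pi]."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0.0, noise)) for x in xs]

rng = random.Random(0)
train = sample(200, rng)
test = sample(500, rng)

for k in (1, 15, 200):
    train_error = mse(train, train, k)
    test_error = mse(train, test, k)
    print(f"k={k:3d}  train MSE={train_error:.3f}  test MSE={test_error:.3f}")
```

With k=1 the training error is zero because every training point is its own nearest neighbour, which says nothing about how the model behaves on new data. Comparing training and held-out error is the practical way to see where a model sits on this trade-off.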
answered Sep 18 at 13:49
Nirav Gandhi