How can we know for sure whether we can learn from a given dataset?

Given a dataset and a target, how can we know for sure whether it is possible to learn from that data and make inferences about the target?










Tags: machine-learning, neural-network, deep-learning, data, learning






asked Sep 3 at 8:27 by Pranav Pandey
edited Sep 3 at 10:14 by Media


4 Answers


"how can we know for sure"

We can't.
A toy example shows why even humans cannot do this with certainty:



Suppose you are given the numbers 2, 4, 8, 16, 32, ?? and want to extrapolate the next number ??. A natural continuation of the series would be 64, but we cannot take this for granted. The next number could just as well be 0: the five observed values alone do not rule that out.



Given only the data, without additional assumptions about what you expect to see, you cannot learn a provably correct model. You always have to be critical about your data.
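
To make the toy example concrete, here is a minimal Python sketch. It assumes the terms are indexed 1 to 5 and uses Lagrange interpolation (my choice for illustration, not something the series itself dictates) to build a second rule that matches every observed term and then continues with 0. The names `lagrange`, `doubling` and `surprise` are hypothetical.

```python
from fractions import Fraction


def lagrange(points):
    """Return p(x), the lowest-degree polynomial passing through all points."""
    def p(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(points):
                if j != i:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return p


def doubling(n):
    """The 'natural' rule: the n-th term is 2**n."""
    return 2 ** n


observed = [(1, 2), (2, 4), (3, 8), (4, 16), (5, 32)]

# A second rule that agrees on every observed term but continues with 0.
surprise = lagrange(observed + [(6, 0)])

for n, y in observed:
    assert doubling(n) == y
    assert surprise(n) == y

print(doubling(6), surprise(6))
```

Both rules fit the observed data perfectly, so the data alone cannot tell them apart. The same construction works for any sixth value you care to pick.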







answered Sep 3 at 8:54 by André (accepted answer)

I like to troll such "guess the next number" puzzles by answering π and providing a function that really produces it.
– vsz, Sep 3 at 20:47












In addition to the other answers, I want to offer a further explanation. Essentially, ML approaches approximate a mapping from inputs to outputs. This function should usually be well-behaved; otherwise you need a very large amount of data for your model to learn it in the current feature space. More specifically, you should examine the distribution of your training data in the current feature space. For classification tasks, this shows how much the distributions of the different labels overlap, which tells you the best performance any ML approach can reach on those features. In other words, the distribution of your data in the current feature space lets you estimate the Bayes error, the lowest error rate any classifier can achieve on those features.



If the estimated Bayes error turns out to be large, then you can be confident that the target cannot be learned well in the current feature space, and you have to change the features.
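
As a rough illustration of how class overlap sets the Bayes error, here is a sketch for an idealised case: one feature, two equally likely classes, each Gaussian with the same spread. Under those assumptions (which real data rarely satisfies, and where the distributions would have to be estimated rather than known) the Bayes error has the closed form Φ(-|μ1 - μ0| / 2σ). The function names are hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_error(mu0, mu1, sigma):
    """Bayes error for two equally likely classes N(mu0, sigma) and N(mu1, sigma)."""
    return normal_cdf(-abs(mu1 - mu0) / (2.0 * sigma))


def simulated_error(mu0, mu1, sigma, n=20000, seed=0):
    """Error of the optimal midpoint-threshold rule on simulated data."""
    rng = random.Random(seed)
    lo, hi = min(mu0, mu1), max(mu0, mu1)
    threshold = (lo + hi) / 2.0
    mistakes = 0
    for _ in range(n):
        from_hi = rng.random() < 0.5
        x = rng.gauss(hi if from_hi else lo, sigma)
        if (x > threshold) != from_hi:
            mistakes += 1
    return mistakes / n


# Wider gaps between the class means give less overlap and a lower Bayes error.
for gap in (0.0, 1.0, 3.0, 6.0):
    print(gap, round(bayes_error(0.0, gap, 1.0), 4), simulated_error(0.0, gap, 1.0))
```

With a gap of zero the classes overlap completely and no classifier beats a coin flip on this feature; as the gap grows, the error floor drops towards zero.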







answered Sep 3 at 10:13 by Media


The current standard is essentially:

"Given this input data, can any other system or approach classify it or estimate a quantity of interest? If so, then a machine learning approach may be able to achieve the same."




This is basically how machine learning challenges in computer perception can be treated as tractable. We have humans and other animals as working models, and we assume that the process can be automated. A similar approach can be taken with any machine learning system that attempts to re-create the behaviour of an expert: provided we use exactly the same input data, and enough of it, the ML system can learn what the expert does through statistical approximation.



The "expert" can be a statistician or data scientist looking at the data with any tool. Exploratory plots of features and measures of correlation are a good way to assess whether a dataset might be amenable to training an ML model for prediction. If you can visually separate classes on a scatter plot using some combination of features, then it is likely that a suitable ML model will be able to separate those classes too.
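
As a small sketch of the "measures of correlation" idea, the snippet below computes the Pearson correlation between each feature and a binary label. The data is synthetic and the feature names (`informative`, `unrelated`) are made up for illustration; with real data you would load your own columns instead.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(500)]

# Hypothetical features: one shifts with the class, the other ignores it.
informative = [2.0 * y + rng.gauss(0.0, 1.0) for y in labels]
unrelated = [rng.gauss(0.0, 1.0) for _ in labels]

r_informative = pearson(informative, labels)
r_unrelated = pearson(unrelated, labels)
print(round(r_informative, 2), round(r_unrelated, 2))
```

A clearly non-zero value is encouraging. A value near zero only rules out a linear relationship, so it is not proof that a feature is useless.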



There are hard cases where, on the surface, there seems to be no pattern. Perhaps a relationship could be teased out and shown to exist with statistical analysis, but you could skip that and directly throw a non-linear ML model at the problem, in the hope that it finds the relationship for you given suitable hyper-parameters. Of course, you don't know in advance whether that is a worthwhile approach, and it carries some risks. But it is not that expensive to do once you have some data: just throw a fairly robust non-linear model, such as XGBoost, at the problem and see what happens.
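
Here is one way the "see what happens" experiment might look. To keep the sketch dependency-free I use a tiny k-nearest-neighbours classifier as a stand-in for XGBoost, and a made-up XOR-style dataset in which neither feature is linearly related to the label. The important part is the comparison against a trivial baseline on held-out data, not the particular model.

```python
import random
from collections import Counter


def make_data(n, rng):
    """Hypothetical data: label is 1 when the two features share a sign."""
    rows = []
    for _ in range(n):
        a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append(((a, b), int(a * b > 0)))
    return rows


def knn_predict(train, point, k=5):
    """Majority vote among the k training points closest to `point`."""
    nearest = sorted(
        train,
        key=lambda row: (row[0][0] - point[0]) ** 2 + (row[0][1] - point[1]) ** 2,
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(predictions, rows):
    """Fraction of predictions that match the true labels in `rows`."""
    return sum(p == label for p, (_, label) in zip(predictions, rows)) / len(rows)


rng = random.Random(0)
train, test = make_data(300, rng), make_data(200, rng)

# Baseline: always predict the most frequent training label.
majority = Counter(label for _, label in train).most_common(1)[0][0]
baseline_acc = accuracy([majority] * len(test), test)

# Non-linear model, scored only on points it has not seen.
knn_acc = accuracy([knn_predict(train, x) for x, _ in test], test)

print(round(baseline_acc, 2), round(knn_acc, 2))
```

If the model clearly beats the baseline on unseen data, there is something learnable. If it does not, that is weak evidence of the opposite, since a different model or different features might still succeed.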



Of course, ML is not magic. If there is nothing to find, it will tend to find nothing. Worse than that, it can find spurious correlations, or patterns that stem from prejudice inherent in the data collection or labelling. Those issues are a problem regardless of any evidence on whether it was theoretically possible to achieve a result at all. However, the kind of thinking that drives "let's throw some neural networks at this" has led to some published works that are quite terrifying and wrong on many levels. One example was an NN that classified a person as criminal or not from a picture of their face. Luckily, flaws in the data collection were pointed out for that one, but the original story made headline news in many places, despite essentially being a modern rebirth of phrenology.
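
The spurious-correlation risk is easy to reproduce. In this sketch everything is pure noise by construction (an assumption of the example, chosen so we know there is nothing to find), yet picking the best of many random features on a small sample still produces an impressive-looking correlation.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


rng = random.Random(0)
n_samples, n_features = 20, 1000

target = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
features = [
    [rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_features)
]

# Pick whichever random feature happens to correlate best with the target.
best = max(features, key=lambda f: abs(pearson(f, target)))
r_in_sample = pearson(best, target)

# Fresh draws from the same independent process: the "pattern" is gone.
fresh_target = [rng.gauss(0.0, 1.0) for _ in range(2000)]
fresh_feature = [rng.gauss(0.0, 1.0) for _ in range(2000)]
r_fresh = pearson(fresh_feature, fresh_target)

print(round(r_in_sample, 2), round(r_fresh, 2))
```

This is why a result should be checked on data that played no part in selecting the features or the model.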







answered Sep 3 at 9:55 by Neil Slater

Thanks, Neil, for your detailed explanation. :)
– Pranav Pandey, Sep 3 at 11:22
















One cannot evaluate ML models with a purely deterministic approach. ML models do not simply follow if-else statements for which one can verify whether the model predicts the outcome correctly. The majority of ML algorithms take a probabilistic approach and predict the most probable, or nearest, class.
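
In practice this means a model is judged statistically, for example by its accuracy on held-out examples together with an uncertainty estimate. The sketch below uses the Wilson score interval, which is one common choice rather than the only one; the 80-out-of-100 result is a made-up figure for illustration.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for an accuracy estimate."""
    p = correct / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical result: 80 of 100 held-out examples classified correctly.
low, high = wilson_interval(80, 100)
print(round(low, 3), round(high, 3))

# The same accuracy measured on ten times as many examples is far less uncertain.
low_big, high_big = wilson_interval(800, 1000)
print(round(low_big, 3), round(high_big, 3))
```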



                      In addition to this, distinguishing boundary between different classes are not simple and linear always and in majority of the cases the class boundary that separates the data points follows the higher order of differential functions.



                      Many a times, noise data leads the separating boundary more complex and leads to deteriorate the performance of the model. Bias- variance trade off is the important concept one should learn in order to make the model work as intended.
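As a rough illustration of how label noise interacts with the bias-variance trade-off, here is a small standard-library sketch. It is not taken from the answer: the k-nearest-neighbour classifier, the helper names (`make_data`, `knn_predict`, `accuracy`), the 20% noise rate, and the linear true boundary `x + y > 1` are all assumptions chosen to keep the example short.

```python
import random

def make_data(n, noise, rng):
    """Points in the unit square; the true class is 1 when x + y > 1.
    A fraction `noise` of the labels is flipped at random."""
    data = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        label = int(x + y > 1)
        if rng.random() < noise:
            label = 1 - label  # simulate a labelling error
        data.append(((x, y), label))
    return data

def knn_predict(train, point, k):
    """Majority vote among the k nearest training points (use odd k)."""
    def sq_dist(item):
        (x, y), _ = item
        return (x - point[0]) ** 2 + (y - point[1]) ** 2
    nearest = sorted(train, key=sq_dist)[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)

def accuracy(train, data, k):
    hits = sum(knn_predict(train, point, k) == label for point, label in data)
    return hits / len(data)

rng = random.Random(0)
train = make_data(300, noise=0.2, rng=rng)  # noisy training labels
test = make_data(300, noise=0.0, rng=rng)   # clean labels to measure the true boundary

for k in (1, 15):
    print(f"k={k:2d}  train accuracy: {accuracy(train, train, k):.2f}  "
          f"test accuracy: {accuracy(train, test, k):.2f}")
```

With `k=1` the model memorises every training point, including the flipped labels, so it scores perfectly on its own training set while carving out a needlessly complex boundary (high variance). A larger `k` averages over neighbours, which smooths the boundary and tends to generalise better here, at the cost of some bias near the true boundary.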






                          answered Sep 18 at 13:49
                          Nirav Gandhi