What are the best books to study Neural Networks from a purely mathematical perspective?













19












I am looking for a book that covers the mathematical aspects of neural networks, from the forward pass of a multilayer perceptron in matrix form and the differentiation of activation functions to backpropagation in CNNs and RNNs (to mention some of the topics).

Do you know of any book that goes into this theory in depth? I have looked at a couple (such as Pattern Recognition and Machine Learning by Bishop) but have not yet found a rigorous one (exercises would be a plus). Do you have any suggestions?
















book-recommendation machine-learning mathematical-modeling neural-networks

asked Mar 13 at 3:03 by Eli
edited Mar 18 at 19:20 by Alexander Gruber




















          5 Answers
          8












I'd recommend Deep Learning by Goodfellow, Bengio and Courville. I don't know if I'd call it "purely mathematical", but it covers a good amount of math background in the first few chapters. No exercises, though.








• (4) Thank you - I've actually had a look at that one too, but while it is good at introducing the main mathematical tools needed for NNs, I found it a bit lacking when it came to properly developing the model mathematically. – Eli, Mar 13 at 3:17


















          6












For MLPs, there is a rigorous derivation in the optimization textbook by Edwin Chong and Stanislaw Zak, An Introduction to Optimization. It is notation-heavy, though, as all things related to neural networks must be.

This book is for some reason freely available online. See page 219 of https://eng.uok.ac.ir/mfathi/Courses/Advanced%20Eng%20Math/An%20Introduction%20to%20Optimization-%20E.%20Chong,%20S.%20Zak.pdf

I think there is essentially no good mathematical textbook on convolutional neural networks or RNNs in existence. People essentially just base their intuition on MLPs. But it is not hard to write down a mathematically rigorous derivation of forward and backward propagation for a CNN or an RNN.
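To illustrate the kind of derivation meant here, the following is a rough sketch of forward and backward propagation for a plain MLP in matrix form. It is not taken from Chong and Zak; the notation ($W^{(l)}$, $b^{(l)}$, $\sigma$, $\delta^{(l)}$, loss $\mathcal{L}$, depth $L$) is assumed for illustration, with $\sigma$ applied elementwise and $\odot$ the elementwise product.

```latex
% Forward pass, for l = 1, ..., L, starting from the input a^{(0)} = x
\begin{align*}
z^{(l)} &= W^{(l)} a^{(l-1)} + b^{(l)}, &
a^{(l)} &= \sigma\bigl(z^{(l)}\bigr).
\end{align*}

% Backward pass: define \delta^{(l)} = \partial \mathcal{L} / \partial z^{(l)}
\begin{align*}
\delta^{(L)} &= \nabla_{a^{(L)}} \mathcal{L} \odot \sigma'\bigl(z^{(L)}\bigr), \\
\delta^{(l)} &= \Bigl( \bigl(W^{(l+1)}\bigr)^{\top} \delta^{(l+1)} \Bigr) \odot \sigma'\bigl(z^{(l)}\bigr),
  \qquad l = L-1, \dots, 1, \\
\frac{\partial \mathcal{L}}{\partial W^{(l)}} &= \delta^{(l)} \bigl(a^{(l-1)}\bigr)^{\top}, \qquad
\frac{\partial \mathcal{L}}{\partial b^{(l)}} = \delta^{(l)}.
\end{align*}
```

Each backward line is the chain rule applied to the corresponding forward line. A CNN or RNN derivation would presumably follow the same pattern, with weights shared across positions or time steps so that their gradients are summed over those positions.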








• (1) "This book is for some reason freely available online." That is probably a copyright violation by the webpage owner eng.uok.ac.ir/mfathi. But I won't tell anyone if you won't ;) – Rahul, Mar 13 at 12:11



















          5












Gilbert Strang (of MIT OCW Linear Algebra lectures and Introduction to Linear Algebra fame) has a new textbook on linear algebra for deep learning, Linear Algebra and Learning from Data.

It offers a decent course in linear algebra, some statistics and optimization, and the calculus needed for stochastic gradient descent, and then applies them all to neural network models.
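As a small illustration of how that calculus feeds into a neural network model, here is a minimal sketch in Python with NumPy. It is not taken from Strang's book. It assumes a one-hidden-layer MLP with a sigmoid activation and squared-error loss, and all function and variable names are hypothetical. It computes backpropagation gradients in matrix form, compares one entry against a finite-difference estimate, and takes a few one-sample gradient steps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, x):
    """Forward pass: a1 = sigmoid(W1 x + b1), yhat = W2 a1 + b2."""
    W1, b1, W2, b2 = params
    a1 = sigmoid(W1 @ x + b1)
    yhat = W2 @ a1 + b2
    return a1, yhat

def loss(params, x, y):
    """Squared-error loss 0.5 * ||yhat - y||^2 for one training pair."""
    _, yhat = forward(params, x)
    return 0.5 * float(np.sum((yhat - y) ** 2))

def backward(params, x, y):
    """Backpropagation: gradients in the order [dW1, db1, dW2, db2]."""
    W1, b1, W2, b2 = params
    a1, yhat = forward(params, x)
    delta2 = yhat - y                           # dL/dyhat (linear output layer)
    delta1 = (W2.T @ delta2) * a1 * (1.0 - a1)  # sigmoid'(z) = s(z) * (1 - s(z))
    return [np.outer(delta1, x), delta1, np.outer(delta2, a1), delta2]

def sgd_step(params, x, y, lr=0.1):
    """One gradient step on a single sample (lr is an assumed step size)."""
    return [p - lr * g for p, g in zip(params, backward(params, x, y))]

def numerical_grad_W1(params, x, y, i, j, eps=1e-6):
    """Central finite-difference estimate of dL/dW1[i, j]."""
    plus = [p.copy() for p in params]
    minus = [p.copy() for p in params]
    plus[0][i, j] += eps
    minus[0][i, j] -= eps
    return (loss(plus, x, y) - loss(minus, x, y)) / (2.0 * eps)

# Toy data and random initial weights (assumed sizes: 2 inputs, 3 hidden, 1 output).
rng = np.random.default_rng(0)
params = [rng.normal(size=(3, 2)), rng.normal(size=3),
          rng.normal(size=(1, 3)), rng.normal(size=1)]
x = np.array([0.5, -1.0])
y = np.array([0.25])

analytic = backward(params, x, y)[0][1, 0]
numeric = numerical_grad_W1(params, x, y, 1, 0)
print("gradient check:", analytic, numeric)

trained = params
for _ in range(50):
    trained = sgd_step(trained, x, y)
print("loss before/after:", loss(params, x, y), loss(trained, x, y))
```

The finite-difference comparison is only a sanity check on the hand-derived gradients. With a real dataset, each step would draw a random sample or mini-batch rather than reuse one pair.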




















            1












One of my favorite books on theoretical aspects of neural networks is Anthony and Bartlett's book "Neural Network Learning: Theoretical Foundations".

This book studies neural networks in the context of statistical learning theory. You will find loads of estimates of VC dimensions of sets of networks and all that fun stuff.

I should say that this book does not go into detail on CNNs and RNNs.




















              0












Not a book, but maybe of some interest for a current perspective:

"Backprop as Functor: A compositional perspective on supervised learning" by Brendan Fong, David I. Spivak and Remy Tuyeras (2018) gives a category-theoretic structural framework based on the algorithm:

https://arxiv.org/pdf/1711.10455.pdf

This is discussed further by David Spivak (2019) via:

https://www.reddit.com/r/math/comments/ahrar7/lectures_in_applied_category_theory_mit_2019/













answered Mar 13 at 3:11 by Jair Taylor (answer scored 8)







answered Mar 13 at 3:33 by Shamisen Expert, edited Mar 13 at 3:46 (answer scored 6)







answered Mar 14 at 19:43 by Josef Knecht (answer scored 5)





















answered Mar 14 at 9:56 by pcp, edited Mar 15 at 8:20 (answer scored 1)





















answered Mar 24 at 11:41 by Jim Stuttard (answer scored 0)


























