Synopsis · · · · · ·
http://neuralnetworksanddeeplearning.com/
Contents · · · · · ·
What this book is about
On the exercises and problems
Using neural nets to recognize handwritten digits
How the backpropagation algorithm works
Improving the way neural networks learn
A visual proof that neural nets can compute any function
Why are deep neural networks hard to train?
Deep learning
Appendix: Is there a simple algorithm for intelligence?
Acknowledgements
Frequently Asked Questions
Reviews of Neural Networks and Deep Learning · · · · · · (3 in total)

Conficlown (self-knowledge, self-confidence, self-reliance, self-discipline, self-redemption)
Dissection and explanation of the CNN code; improvements are left to the reader as exercises. Prospects of NN and deep learning: some philosophical discussions and a rather optimistic view of the development of machine learning as a whole. Many good reference links that are worth revisiting.
2020-08-03 18:38
Conficlown (self-knowledge, self-confidence, self-reliance, self-discipline, self-redemption)
Problems can arise when a NN has more layers: the unstable gradient problem, i.e. vanishing or exploding gradients. I very much appreciate the way the problem is explained: always starting from the simplest structure as an example, and then showing that the same reasoning carries over to the general case.
The sigmoid function can induce such issues; in an MIT lecture it is briefly mentioned that ReLU can avoid this.
More papers to read in spare time. Many of the authors of the referenced papers are big names in the field.
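The vanishing-gradient mechanism can be checked with a few lines of numpy; the pre-activation z = 0 and weight w = 1 below are assumed toy values, not from the book:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# The gradient reaching an early layer picks up one factor of
# w * sigma'(z) per layer it passes through.  sigma'(z) peaks at 0.25
# (at z = 0), so with moderate weights the product shrinks geometrically.
z, w = 0.0, 1.0                  # toy values: the best case for sigmoid
factor = w * sigmoid_prime(z)    # 0.25, the maximum possible
print([factor ** L for L in (1, 5, 10)])   # 0.25, ~1e-3, ~1e-6

# ReLU's derivative is exactly 1 for positive inputs, so the same
# product of per-layer factors can stay at 1 instead of decaying.
relu_prime = 1.0
print(relu_prime ** 10)   # 1.0
```

With ten sigmoid layers, even this best-case gradient factor has shrunk by six orders of magnitude, which is the instability the note refers to.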
2020-07-31 14:34
Conficlown (self-knowledge, self-confidence, self-reliance, self-discipline, self-redemption)
This chapter is a theoretical digression into the subject of "universality": that is, neural networks can approximate any continuous function.
Personally, I'm intrigued to see that some notions from functional analysis are mentioned in one of the referenced papers, which are presumably used for the rigorous proof.
The chapter itself, however, delivers an intuitive illustration of how universality could be proved.
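That intuitive illustration can be sketched in a few lines: pairs of steep sigmoids form "bumps", and summing bumps approximates the target. The target sin(3x), the 20-interval grid, and the steepness constant below are illustrative assumptions, not taken from the chapter:

```python
import numpy as np

def sigmoid(z):
    # clip to avoid overflow warnings at extreme arguments
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500.0, 500.0)))

# Two very steep sigmoids in a hidden layer make an approximate step up
# at x = a and back down at x = b: a "bump" of height h on [a, b].
def bump(x, a, b, h, steep=1000.0):
    return h * (sigmoid(steep * (x - a)) - sigmoid(steep * (x - b)))

target = lambda x: np.sin(3.0 * x)   # assumed example target function
edges = np.linspace(0.0, 1.0, 21)    # 20 bumps tiling [0, 1]

def net(x):
    # one bump per interval, with height sampled at the interval midpoint
    return sum(bump(x, a, b, target((a + b) / 2.0))
               for a, b in zip(edges[:-1], edges[1:]))

xs = np.linspace(0.01, 0.99, 200)
err = np.max(np.abs(net(xs) - target(xs)))
print(err)   # already small, and it shrinks as the number of bumps grows
```

This is the chapter's argument in miniature: a single hidden layer of paired sigmoids builds a piecewise-constant approximation, and refining the grid drives the error down.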
2020-07-31 00:00
Conficlown (self-knowledge, self-confidence, self-reliance, self-discipline, self-redemption)
How to improve a NN. Slowness of learning when the output saturates at the wrong value; the solutions discussed:
- the cross-entropy cost function
- softmax on the output layer
The rationale behind cross-entropy seems closely tied to an intrinsic property of the sigmoid function. From the MIT Linear Algebra course, it seems that ReLU can achieve the same without the complexity introduced here by the sigmoid.
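The slow-learning point can be made concrete for a single sigmoid neuron; the pre-activation z = 5 is an assumed toy value that puts the neuron confidently, and wrongly, near output 1:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = 5.0                  # toy value: neuron saturated near a = 1...
a, y = sigmoid(z), 0.0   # ...while the target output is 0
sp = a * (1.0 - a)       # sigma'(z), tiny at saturation

# Gradient of the cost with respect to z for the two cost functions:
grad_quadratic = (a - y) * sp   # quadratic cost C = (a - y)^2 / 2
grad_xent = a - y               # cross-entropy: the sigma'(z) factor cancels

print(grad_quadratic)   # ~0.0066: learning crawls despite a large error
print(grad_xent)        # ~0.9933: learning stays fast when the error is large
```

The cancellation of sigma'(z) is the whole trick: with cross-entropy the gradient is proportional to the error a - y, so the more wrong the neuron is, the faster it learns.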
The problem of overfitting and how to reduce it:
- Regularization (L1 or L2). The rationale for preferring the "simpler" solution has an analogue in "Occam's Razor" reasoning, but as the chapter points out, it remains deeply empirical and there is no silver bullet.
- Dropout. The rationale behind this is averaging over different sets of networks (https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf). *The anecdote about Freeman Dyson and Enrico Fermi is fascinating.
- Artificially expanding the training data.
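A minimal sketch of the dropout idea (the "inverted dropout" variant, with an assumed drop probability p = 0.5; not the book's own implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p=0.5):
    # Zero each hidden activation with probability p, and rescale the
    # survivors by 1/(1-p) so the expected activation is unchanged;
    # at test time the full (averaged) network is then used as-is.
    mask = (rng.random(activations.shape) >= p).astype(activations.dtype)
    return activations * mask / (1.0 - p)

h = np.ones(100_000)   # toy layer of constant activations
m = dropout(h).mean()
print(m)               # close to 1.0: the expectation is preserved
```

Each mini-batch effectively trains a different thinned network, and using all units at test time approximates averaging over that ensemble, which is the network-averaging rationale the note mentions.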
Weight initialization. The concern is that if all weights are initialized from N(0,1), then one layer on, the weighted inputs can have a much larger standard deviation, which can lead to saturation at the wrong value.
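That spread can be measured directly; the fan-in of 1000 and the all-ones input below are assumed toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in = 1000           # toy fan-in of one hidden neuron
x = np.ones(n_in)     # toy input with every input unit active

# Sample 2000 independent neurons per scheme and look at z = w . x.
w_naive = rng.normal(0.0, 1.0, (2000, n_in))                   # all N(0, 1)
w_scaled = rng.normal(0.0, 1.0 / np.sqrt(n_in), (2000, n_in))  # std 1/sqrt(n_in)

s_naive = np.std(w_naive @ x)     # ~sqrt(1000) ~ 31.6: deep in sigmoid's flat tails
s_scaled = np.std(w_scaled @ x)   # ~1: stays where the sigmoid is responsive
print(s_naive, s_scaled)
```

With N(0,1) weights, |z| is typically in the tens, so sigma(z) sits at 0 or 1 and sigma'(z) is essentially zero; shrinking the initial weights to std 1/sqrt(n_in) keeps z of order 1.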
Experiences and heuristics for tuning hyperparameters:
- start by simplifying the scenario;
- accelerate the feedback loop: reduce the training data, get faster feedback from each epoch, etc.
2020-07-28 22:59

From machine learning to deep neural networks… "The end result is a network which breaks down a very complicated question – does this image show a face or not – into very simple questions answerable at the level of single pixels. It does this through a series of many layers, with early layers answering very simple and specific questions about the input image, and later layers building up a hierarchy of ever more ..."
2017-08-12 16:07

miao (as simple as possible)
Neural networks are one of the most beautiful programming paradigms ever invented. In the conventional approach to programming, we tell the computer what to do, breaking big problems up into many small, precisely defined tasks that the computer can easily perform. By contrast, in a neural network we don't tell the computer how to solve our problem. Instead, it learns from observational data, fi...
2019-06-22 15:47

Other editions of this book · · · · · · (2 in total)

人民邮电出版社 (2020) · rated 8.3 · 13 have read it
Recommended in these book lists · · · · · ·
- Idle-time reading (IV) (鹿小羽)
- Data science and artificial intelligence (lyb)
- T (dhcn)
- The 7 deep learning books most worth studying (飞鸟)
- Book list @ machines can learn too (蒙奇)
Short comments · · · · · ·
0 useful · 雄爷 · 2017-11-22
Neural networks, explained simply and in depth.
0 useful · Qamber · 2020-05-22
I read his book for quantum information; unexpectedly, the one I'm reading for ML is his too...
2 useful · 小桥已久姚岳麓 · 2018-10-12
I have no machine, and I don't want to learn either 😂
4 useful · 江嚣 · 2019-04-19
With its interactive graphics and animations, it fully demonstrates the huge advantage web-based books have over traditional ones.
0 useful · High · 2017-10-30
An introductory read on deep learning, very well written, especially chapter 2 on how the backpropagation algorithm works. Reading it alongside the Machine Learning open course on Coursera, cross-referencing the two, is very helpful.
0 useful · Yukio · 2021-06-14
I can't understand it, but I am deeply impressed.
0 useful · ahun · 2021-03-09
The chapter on how the choice of activation function lets a neural network approximate arbitrary functions resolved a doubt I'd had for years, and it comes with interactive animations; I highly recommend that chapter. The rest is just okay.
0 useful · Cangarav · 2021-02-14
It really is explained very simply; that's just how this field is...
0 useful · ev · 2020-12-28
Wow, the intuition here is incredible.
0 useful · AnoI · 2020-11-29
The book's way of teaching suits me particularly well. For now I'll just say it's awesome; when I'm in the mood I'll come back and praise the author properly.