Excerpts from "Neural Networks and Deep Learning"

  • Here is a fairly plausible reason why 10 outputs are used rather than 4: if we had 4 outputs, the first neuron would be trying to decide what the most significant bit of the digit is, and the most significant bit is not easily related to the shape of the digit.
    miao 2020-11-14 06:52:27
    —— Quoted from section 1.4: A simple network to classify handwritten digits, p. 13
  • With that said, this idea of preferring simpler explanations should make you nervous. People sometimes refer to this idea as "Occam's Razor", and will zealously apply it as though it has the status of some general scientific principle. But, of course, it's not a general scientific principle. There is no a priori logical reason to prefer simple explanations over more complex explanations. Indeed, sometimes the more complex explanation turns out to be correct.
    以地之名 2017-01-06 20:54:38
    —— Quoted from chapter: Improving the way neural networks learn
  • The end result is a network which breaks down a very complicated question - does this image show a face or not - into very simple questions answerable at the level of single pixels. It does this through a series of many layers, with early layers answering very simple and specific questions about the input image, and later layers building up a hierarchy of ever more complex and abstract concepts. Networks with this kind of many-layer structure - two or more hidden layers - are called deep neural networks. Of course, I haven't said how to do this recursive decomposition into sub-networks. It certainly isn't practical to hand-design the weights and biases in the network. Instead, we'd like to use learning algorithms so that the network can automatically learn the weights and biases - and thus...
    M·贺六浑 2017-08-12 16:07:01
    —— Quoted from page 1
  • Neural networks are one of the most beautiful programming paradigms ever invented. In the conventional approach to programming, we tell the computer what to do, breaking big problems up into many small, precisely defined tasks that the computer can easily perform. By contrast, in a neural network we don't tell the computer how to solve our problem. Instead, it learns from observational data, figuring out its own solution to the problem at hand.
    miao 2019-06-22 15:47:55
    —— Quoted from page 1
  • Where does the "softmax" name come from? Suppose we change the softmax function so the output activations are given by aLj=eczLj∑keczLk,(83) where c is a positive constant. Note that c=1 corresponds to the standard softmax function. But if we use a different value of c we get a different function, which is nonetheless qualitatively rather similar to the softmax. In particular, show that the output activations form a probability distribution, just as for the usual softmax. Suppose we allow c to become large, i.e., c→∞. What is the limiting value for the output activations aLj? After solving this problem it should be clear to you why we think of the c=1 function as a "softened" version of the maximum function. This is the origin of the term "softmax". (查看原文)
    越锋利 2023-04-14 23:00:45
    —— Quoted from section 3.1.4: softmax, p. 74
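A quick numerical sketch of this exercise (my own illustration, not the book's code; it assumes numpy and uses made-up input values): it implements the generalized softmax with the constant c and shows that the outputs always form a probability distribution and, as c grows, concentrate on the largest z_k.

```python
import numpy as np

def softmax_c(z, c=1.0):
    """Generalized softmax a_j = exp(c*z_j) / sum_k exp(c*z_k); c=1 is the standard softmax."""
    e = np.exp(c * (z - np.max(z)))  # subtracting max(z) avoids overflow; the result is unchanged
    return e / e.sum()

z = np.array([1.0, 2.0, 3.0, 2.5])   # illustrative weighted inputs z^L_k
for c in (1.0, 5.0, 50.0):
    a = softmax_c(z, c)
    print(f"c={c:5.1f}  a={np.round(a, 4)}  sum={a.sum():.4f}")

# The activations are non-negative and sum to 1 for every c. As c -> infinity they
# approach a one-hot vector on the largest z_k, i.e. a "hard" max, which is why the
# c=1 case reads as a "softened" version of the maximum function.
```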
  • Suppose the network mistakenly classifies an image of a "9" as an "8". We can work out how to change the weights and biases so that the network classifies the image as a "9". Then we repeat this, changing the weights and biases over and over to produce better and better output. That is how a neural network learns.
    7086 2023-09-01 15:21:45
    —— Quoted from section 1.2: Sigmoid neurons, p. 7
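A minimal sketch of that update step (my own illustration, not the book's code): it uses a single linear layer and a quadratic cost so the gradient is easy to write down; the shapes, learning rate, and random data are purely illustrative.

```python
import numpy as np

def quadratic_cost_grad(a, y):
    """Gradient of the quadratic cost 0.5*||a - y||^2 with respect to the output a."""
    return a - y

eta = 0.1                         # learning rate
w = np.random.randn(10, 784)      # weights (hypothetical shapes for a flattened 28x28 image)
b = np.random.randn(10)           # biases
x = np.random.rand(784)           # one input image, flattened
y = np.zeros(10); y[9] = 1.0      # desired output: the digit 9

a = w @ x + b                     # a bare linear "network", to keep the sketch short
grad_a = quadratic_cost_grad(a, y)
w -= eta * np.outer(grad_a, x)    # dC/dw for this linear model
b -= eta * grad_a                 # dC/db

# Repeating this small nudge of the weights and biases over many examples is the
# learning loop the excerpt describes.
```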
  • "Hidden" sounds a little mysterious. The first time I heard the term I thought it must carry some deep philosophical or mathematical meaning, but it really means nothing more than "neither an input nor an output".
    7086 2023-09-01 15:21:45
    —— Quoted from section 1.3: The architecture of neural networks, p. 11
  • Compared with designing a network's input and output layers, designing the hidden layers is something of an art; in particular, the design process for the hidden layers cannot be summed up in a few simple rules of thumb. That said, neural network researchers have developed many design heuristics for the hidden layers, which help get the behaviour we want out of a network, for example by helping to weigh the number of hidden layers against the time required to train the network. We will meet several such heuristics later.
    7086 2023-09-01 15:21:45
    —— Quoted from section 1.3: The architecture of neural networks, p. 11
  • Why introduce the quadratic cost at all? After all, isn't what we are really interested in the number of images correctly classified? Why not try to maximize that number directly, instead of minimizing an indirect measure like the quadratic cost? The reason is that, in a neural network, the number of correctly classified images is not a smooth function of the weights and biases. Most of the time, small changes to the weights and biases cause no change at all in the number of images classified correctly, which makes it very hard to improve performance by adjusting them. A smooth cost function such as the quadratic cost lets small adjustments to the weights and biases yield small improvements, and that is why we focus first on minimizing the quadratic cost, and only afterwards examine classification accuracy.
    7086 2023-09-01 15:21:45
    —— Quoted from section 1.5: Learning with gradient descent, p. 17
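A small numerical illustration of that point (my own sketch, not from the book; the output vectors are made up): the quadratic cost responds smoothly to tiny nudges of the output, while the count of correctly classified images stays flat until the argmax flips.

```python
import numpy as np

def quadratic_cost(a, y):
    """Quadratic cost 0.5*||a - y||^2 for one output vector: smooth in the outputs."""
    return 0.5 * np.sum((a - y) ** 2)

def num_correct(a, y):
    """Classification count for one example: 1 if the argmax matches, else 0 - a step function."""
    return int(np.argmax(a) == np.argmax(y))

y = np.zeros(10); y[9] = 1.0          # target: the digit 9
a = np.full(10, 0.1); a[8] = 0.3      # the network currently prefers "8"

for nudge in (0.0, 0.05, 0.15, 0.25):
    a2 = a.copy(); a2[9] += nudge     # small "improvements" toward the right answer
    print(f"nudge={nudge:.2f}  cost={quadratic_cost(a2, y):.4f}  correct={num_correct(a2, y)}")

# The quadratic cost falls smoothly with every nudge, while the correct-classification
# count stays at 0 until the nudge is large enough to flip the argmax - which is why
# the smooth cost is the quantity we minimize during training.
```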