Excerpts from 《机器学习实战》 (Machine Learning in Action)

  • Pros: High accuracy, insensitive to outliers, no assumptions about data
    Cons: Computationally expensive, requires a lot of memory
    Works with: Numeric values, nominal values
    The first machine-learning algorithm we’ll look at is k-Nearest Neighbors (kNN). It works like this: we have an existing set of example data, our training set. We have labels for all of this data—we know what class each piece of the data should fall into. When we’re given a new piece of data without a label, we compare that new piece of data to the existing data, every piece of existing data. We then take the most similar pieces of data (the nearest neighbors) and look at their labels. We look at the top k most similar pieces of data from our known dataset; this is where the k comes from. (k is an integer and it’s usua...
    传统保守拉拉桑 2012-12-23 15:19:40
    —— Quoted from page 36
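The excerpt above describes the whole kNN procedure: measure the distance from the new point to every training point, take the k closest, and vote on their labels. Below is a minimal NumPy sketch of that procedure; the function name knn_classify, the toy data, and the choice of Euclidean distance are illustrative assumptions, not the book's own listing.

    # Minimal kNN sketch (illustrative; not the book's listing)
    import numpy as np

    def knn_classify(in_x, data_set, labels, k):
        """Return the majority label among the k training points closest to in_x."""
        # Euclidean distance from in_x to every row of the training set
        dists = np.sqrt(((data_set - in_x) ** 2).sum(axis=1))
        # Indices of the k nearest neighbors
        nearest = dists.argsort()[:k]
        # Vote: count how often each label appears among the neighbors
        votes = {}
        for i in nearest:
            votes[labels[i]] = votes.get(labels[i], 0) + 1
        return max(votes, key=votes.get)

    # Toy usage: two small clusters labeled 'A' and 'B'
    group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    print(knn_classify(np.array([0.2, 0.1]), group, labels, 3))  # -> 'B'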
  • Pros: Computationally cheap to use, easy for humans to understand learned results, missing values OK, can deal with irrelevant features
    Cons: Prone to overfitting
    Works with: Numeric values, nominal values
    传统保守拉拉桑 2013-01-09 09:59:52
    —— Quoted from page 37
  • General approach to decision trees
    1. Collect: Any method.
    2. Prepare: This tree-building algorithm works only on nominal values, so any continuous values will need to be quantized.
    3. Analyze: Any method. You should visually inspect the tree after it is built.
    4. Train: Construct a tree data structure.
    5. Test: Calculate the error rate with the learned tree.
    6. Use: This can be used in any supervised learning task. Often, trees are used to better understand the data.
    传统保守拉拉桑 2013-01-09 09:59:52
    —— Quoted from page 37
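Step 2 of the outline above (quantizing continuous values into nominal ones) is the only preprocessing this tree-building approach requires. A minimal sketch of one way to do it with NumPy follows; the feature values, bin edges, and bin names are made-up illustrations, not taken from the book.

    # Quantize a continuous feature into nominal bins before building the tree
    import numpy as np

    ages = np.array([3.0, 17.5, 22.0, 41.0, 68.0])   # continuous values (made up)
    edges = np.array([18.0, 40.0, 65.0])              # chosen cut points (made up)
    names = ['child', 'young', 'middle', 'senior']    # one nominal value per bin

    bins = np.digitize(ages, edges)                   # bin index for each value
    nominal = [names[b] for b in bins]
    print(nominal)  # ['child', 'child', 'young', 'middle', 'senior']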
  • Logistic regression
    Pros: Computationally inexpensive, easy to implement, knowledge representation easy to interpret
    Cons: Prone to underfitting, may have low accuracy
    Works with: Numeric values, nominal values
    传统保守拉拉桑 2013-01-10 11:15:21
    —— Quoted from page 83
  • The clear syntax of Python has earned it the name executable pseudo-code.
    ibillxia 2013-05-09 21:33:44
    —— Quoted from page 13
  • With Python, you can program in any style you’re familiar with: object-oriented, procedural, functional, and so on.
    ibillxia 2013-05-09 21:33:44
    —— Quoted from page 13
  • With Python it’s easy to process and manipulate text.
    ibillxia 2013-05-09 21:33:44
    —— Quoted from page 13
  • Python is popular in the scientific and financial communities as well. A number of scientific libraries such as SciPy and NumPy allow you to do vector and matrix operations.
    ibillxia 2013-05-09 21:33:44
    —— Quoted from page 13
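As a tiny illustration of the vector and matrix operations the excerpt mentions (the values are arbitrary; only standard NumPy calls are used):

    import numpy as np

    a = np.array([[1.0, 2.0], [3.0, 4.0]])   # 2x2 matrix
    v = np.array([1.0, -1.0])                # vector

    print(a @ v)              # matrix-vector product -> [-1. -1.]
    print(a.T)                # transpose
    print(np.linalg.inv(a))   # matrix inverse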
  • The scientific tools in Python work well with a plotting tool called Matplotlib.
    ibillxia 2013-05-09 21:33:44
    —— Quoted from page 13
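For instance, NumPy arrays plug straight into Matplotlib. A minimal sketch using standard pyplot calls (the data is arbitrary):

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.arange(0.0, 10.0, 0.1)
    plt.plot(x, np.sin(x))                   # line plot of sin(x)
    plt.scatter(x[::10], np.sin(x[::10]))    # mark every tenth point
    plt.xlabel('x')
    plt.ylabel('sin(x)')
    plt.show()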
  • The only real drawback of Python is that it’s not as fast as Java or C. You can, however, call C-compiled programs from Python.
    ibillxia 2013-05-09 21:33:44
    —— Quoted from page 13
  • All of us learn to write in the second grade. Most of us go on to greater things. —Bobby Knight
    ibillxia 2013-05-09 21:33:44
    —— Quoted from page 13
  • k-Nearest Neighbors
    Pros: High accuracy, insensitive to outliers, no assumptions about data
    Cons: Computationally expensive, requires a lot of memory
    Works with: Numeric values, nominal values
    Andy 2013-09-25 10:05:13
    —— Quoted from page 19
  • Decision trees
    Pros: Computationally cheap to use, easy for humans to understand learned results, missing values OK, can deal with irrelevant features
    Cons: Prone to overfitting
    Works with: Numeric values, nominal values
    Andy 2013-09-25 22:29:40
    —— Quoted from page 39
  • Regression as we know it today was invented by Francis Galton, a cousin of Charles Darwin. Galton made the first regression prediction in 1877, with the goal of predicting the size of the next generation of sweet-pea seeds (the children) from the size of the previous generation (the parents). Galton applied regression analysis to a great many subjects, even including human height. He noticed that if parents were taller than average, their children also tended to be taller than average, but not as tall as their parents; the children's heights regressed toward the mean. Galton observed this phenomenon in many of his studies, and so even though the English word has nothing to do with numeric prediction, the method is still called regression.
    LC 2016-01-04 22:53:01
    —— Quoted from page 137
  • Linear algebra is just a way to simplify performing the same numeric operations on different data points. By representing the data in matrix form, you only need simple matrix operations instead of complicated loops.
    junior 2016-07-21 11:46:00
    —— Quoted from page 12
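A quick sketch of the point being made: the same per-point computation written first as a Python loop and then as a single matrix-vector product (the data and weights are made up for illustration):

    import numpy as np

    data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # one row per data point
    w = np.array([0.5, -0.25])                              # same weights for every point

    # Looping over data points one at a time
    scores_loop = [sum(wi * xi for wi, xi in zip(w, row)) for row in data]

    # The same computation as one matrix-vector product
    scores_matrix = data @ w

    print(scores_loop)    # [0.0, 0.5, 1.0]
    print(scores_matrix)  # [0.  0.5 1. ]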
  • 4.5.2 Training the algorithm: calculating probabilities from word vectors
    wwww_wu 2017-06-11 09:17:19
    —— Quoted from page 60
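The heading refers to naive Bayes training: summing word counts per class over the word vectors and turning the sums into conditional probabilities. The sketch below is one common way to do that, not the book's listing; add-one smoothing and log probabilities are assumptions chosen to avoid zero counts and numeric underflow.

    import numpy as np

    def train_nb(train_matrix, class_labels):
        """train_matrix: one word-count vector per document;
        class_labels: 0/1 label per document.
        Returns log P(word|class 0), log P(word|class 1), and P(class = 1)."""
        x = np.asarray(train_matrix, dtype=float)
        y = np.asarray(class_labels)
        p_class1 = y.mean()                         # prior P(class = 1)
        # Add-one smoothing so unseen words do not zero out the product
        counts0 = x[y == 0].sum(axis=0) + 1.0
        counts1 = x[y == 1].sum(axis=0) + 1.0
        p0 = np.log(counts0 / counts0.sum())        # log P(word | class 0)
        p1 = np.log(counts1 / counts1.sum())        # log P(word | class 1)
        return p0, p1, p_class1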
  • One final note: you may find the first two lines of formula ② unfamiliar. A simple mathematical derivation has been omitted here; I leave it to the interested reader.
    wwww_wu 2017-06-13 10:03:09
    —— Quoted from page 78