Table of Contents

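Chapter 1 What is machine learning? It is common sense, except done by a computer ····· 1

1.1 Do I need a heavy math and coding background to understand machine learning? ····· 2

1.2 What exactly is machine learning? ····· 3

1.3 How do we get machines to make decisions with data? The remember-formulate-predict framework ····· 6

1.4 Summary ····· 12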

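Chapter 2 Types of machine learning ····· 15

2.1 The difference between labeled and unlabeled data ····· 17

2.2 Supervised learning: The branch of machine learning that works with labeled data ····· 18

2.3 Unsupervised learning: The branch of machine learning that works with unlabeled data ····· 21

2.4 What is reinforcement learning? ····· 28

2.5 Summary ····· 30

2.6 Exercises ····· 31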

Chapter 3 Drawing a line close to our points: Linear regression ····· 33

3.1 The problem: Predicting the price of a house ····· 35

3.2 The solution: Building a regression model for housing prices ····· 35

3.3 How to get the computer to draw this line: The linear regression algorithm ····· 41

3.4 How do we measure our results? The error function ····· 54

3.5 Real-life application: Using Turi Create to predict housing prices ····· 61

3.6 What if the data is not in a line? Polynomial regression ····· 63

3.7 Parameters and hyperparameters ····· 64

3.8 Applications of regression ····· 65

3.9 Summary ····· 66

3.10 Exercises ····· 66

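Chapter 4 Optimizing the training process: Underfitting, overfitting, testing, and regularization ····· 69

4.1 An example of underfitting and overfitting using polynomial regression ····· 71

4.2 How do we get the computer to pick the right model? By testing ····· 73

4.3 Where did we break the golden rule, and how do we fix it? The validation set ····· 75

4.4 A numerical way to decide model complexity: The model complexity graph ····· 76

4.5 Another alternative for avoiding overfitting: Regularization ····· 77

4.6 Polynomial regression, testing, and regularization with Turi Create ····· 85

4.7 Summary ····· 89

4.8 Exercises ····· 90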

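Chapter 5 Using lines to split our points: The perceptron algorithm ····· 93

5.1 The problem: We are on an alien planet, and we don't understand the aliens' language ····· 95

5.2 How do we determine whether a classifier is good or bad? The error function ····· 108

5.3 How do we find a good classifier? The perceptron algorithm ····· 115

5.4 Coding the perceptron algorithm ····· 123

5.5 Applications of the perceptron algorithm ····· 128

5.6 Summary ····· 129

5.7 Exercises ····· 130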

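Chapter 6 A continuous approach to splitting points: Logistic classifiers ····· 133

6.1 Logistic classifiers: A continuous version of perceptron classifiers ····· 134

6.2 How do we find a good logistic classifier? The logistic regression algorithm ····· 144

6.3 Coding the logistic regression algorithm ····· 150

6.4 Real-life application: Classifying IMDB reviews with Turi Create ····· 154

6.5 Classifying into multiple classes: The softmax function ····· 156

6.6 Summary ····· 157

6.7 Exercises ····· 158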

Chapter 7 How do you measure classification models? Accuracy and related concepts ····· 159

7.1 Accuracy: How often is the model correct? ····· 160

7.2 How do we fix the accuracy problem? Defining different types of errors and how to measure them ····· 161

7.3 A useful tool for evaluating models: The ROC curve ····· 170

7.4 Summary ····· 179

7.5 Exercises ····· 181

Chapter 8 Using probability to its maximum: The naive Bayes model ····· 183

8.1 Sick or healthy? A story with Bayes' theorem as the hero ····· 184

8.2 Use case: Spam-detection model ····· 188

8.3 Building a spam-detection model with real data ····· 201

8.4 Summary ····· 204

8.5 Exercises ····· 205

Chapter 9 Splitting data by asking questions: Decision trees ····· 207

9.1 The problem: We need to recommend apps to users according to what they are likely to download ····· 213

9.2 The solution: Building an app-recommendation system ····· 214

9.3 Beyond questions like yes/no ····· 228

9.4 The graphical boundary of decision trees ····· 231

9.5 Real-life application: Building a student admissions model with Scikit-Learn ····· 234

9.6 Decision trees for regression ····· 238

9.7 Applications ····· 241

9.8 Summary ····· 242

9.9 Exercises ····· 242

Chapter 10 Combining building blocks to gain more power: Neural networks ····· 245

10.1 Neural networks with an example: A more complicated alien planet ····· 247

10.2 Training neural networks ····· 258

10.3 Coding neural networks in Keras ····· 264

10.4 Neural networks for regression ····· 272

10.5 Other architectures for more complex datasets ····· 273

10.6 Summary ····· 275

10.7 Exercises ····· 276

Chapter 11 Finding boundaries with style: Support vector machines and the kernel method ····· 279

11.1 Using a new error function to build better classifiers ····· 281

11.2 Coding support vector machines in Scikit-Learn ····· 287

11.3 Training SVMs with nonlinear boundaries: The kernel method ····· 289

11.4 Summary ····· 308

11.5 Exercises ····· 309

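Chapter 12 Combining models to maximize results: Ensemble learning ····· 311

12.1 Getting help from our friends ····· 312

12.2 Bagging: Joining weak learners randomly to build a strong learner ····· 314

12.3 AdaBoost: Joining weak learners in a clever way to build a strong learner ····· 319

12.4 Gradient boosting: Using decision trees to build strong learners ····· 327

12.5 XGBoost: An extreme way to do gradient boosting ····· 332

12.6 Applications of ensemble methods ····· 340

12.7 Summary ····· 341

12.8 Exercises ····· 341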

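Chapter 13 Putting theory into practice: A real-life example of data engineering and machine learning ····· 343

13.1 The Titanic dataset ····· 344

13.2 Cleaning up the dataset: Missing values and how to deal with them ····· 348

13.3 Feature engineering: Transforming the features in the dataset before training the models ····· 350

13.4 Training the models ····· 355

13.5 Tuning the hyperparameters to find the best model: Grid search ····· 359

13.6 Using k-fold cross-validation to reuse data for training and validation ····· 362

13.7 Summary ····· 363

13.8 Exercises ····· 364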

The following content can be downloaded by scanning the QR code on the back cover

Appendix A Solutions to the exercises ····· 365

Appendix B The math behind gradient descent: Coming down a mountain using derivatives and slopes ····· 398

Appendix C References ····· 416