Contents
Chapter 1 Introduction .......... 1
Chapter 2 Fundamentals of Deep Learning .......... 5
2.1 Supervised Learning .......... 5
2.2 Single-Layer Neural Networks .......... 6
2.2.1 Basic Model .......... 6
2.2.2 Activation Functions .......... 7
2.3 Feedforward Deep Neural Networks .......... 10
2.3.1 Backpropagation Algorithm .......... 11
2.3.2 Regularization .......... 15
2.4 Recurrent Neural Networks .......... 17
2.4.1 Fundamentals of Recurrent Neural Networks .......... 17
2.4.2 Long Short-Term Memory Networks .......... 20
2.4.3 Gated Recurrent Neural Networks .......... 22
2.4.4 Deep RNN Architectures .......... 23
2.4.5 RNN Modeling Frameworks for Sequential Data .......... 25
2.5 Convolutional Neural Networks .......... 26
2.5.1 Fundamentals of Convolutional Neural Networks .......... 27
2.5.2 Other Forms of Convolution .......... 31
2.5.3 Residual Neural Networks .......... 35
2.5.4 Temporal Convolutional Networks .......... 37
2.6 Normalization in Neural Networks .......... 39
2.6.1 Batch Normalization .......... 39
2.6.2 Layer Normalization .......... 41
2.7 Attention Mechanisms in Neural Networks .......... 42
2.7.1 Encoder-Decoder Framework .......... 42
2.7.2 Encoder-Attention-Decoder Framework .......... 44
2.7.3 Monotonic Attention Mechanism .......... 46
2.7.4 Transformer .......... 47
2.8 Generative Adversarial Networks .......... 48
2.8.1 Basic Structure .......... 49
2.8.2 Model Training .......... 51
2.9 Chapter Summary .......... 52
Chapter 3 Speech Detection .......... 53
3.1 Introduction .......... 53
3.2 Preliminaries .......... 54
3.2.1 Signal Model .......... 54
3.2.2 Evaluation Metrics .......... 55
3.3 Speech Detection Models .......... 57
3.3.1 Basic Framework of Speech Detection Models .......... 57
3.3.2 Speech Detection Based on Deep Belief Networks .......... 58
3.3.3 Speech Detection Based on Denoising Deep Neural Networks .......... 61
3.3.4 Speech Detection Model Framework Based on Multi-Resolution Stacking .......... 63
3.4 Loss Functions for Speech Detection Models .......... 65
3.4.1 Minimizing Cross-Entropy .......... 66
3.4.2 Minimum Mean Squared Error .......... 66
3.4.3 Maximizing the Area Under the ROC Curve .......... 66
3.5 Acoustic Features for Speech Detection .......... 69
3.5.1 Frequency Band Selection for the Short-Time Fourier Transform .......... 69
3.5.2 Multi-Resolution Cochleagram Features .......... 70
3.6 Generalization Ability of the Models .......... 72
3.7 Chapter Summary .......... 73
Chapter 4 Single-Channel Speech Enhancement .......... 75
4.1 Introduction .......... 75
4.2 Preliminaries .......... 77
4.2.1 Signal Model .......... 77
4.2.2 Evaluation Metrics .......... 79
4.3 Frequency-Domain Speech Enhancement .......... 81
4.3.1 Algorithm Framework .......... 81
4.3.2 Training Targets .......... 82
4.3.3 Speech Enhancement Models .......... 89
4.3.4 Speech Dereverberation Models .......... 93
4.4 Time-Domain Speech Enhancement .......... 100
4.4.1 Key Issues .......... 101
4.4.2 Convolutional Models .......... 102
4.4.3 Loss Functions .......... 104
4.5 Chapter Summary .......... 106
Chapter 5 Multichannel Speech Enhancement .......... 107
5.1 Introduction .......... 107
5.2 Signal Model .......... 108
5.3 Spatial Feature Extraction Methods .......... 109
5.3.1 Spatial Features .......... 109
5.3.2 Deep Models .......... 111
5.4 Beamforming Methods .......... 113
5.4.1 Adaptive Beamformers .......... 114
5.4.2 Noise Estimation .......... 116
5.4.3 Neural-Network-Based Beamforming Methods .......... 117
5.5 Ad Hoc Microphone Array Methods .......... 121
5.5.1 Deep Ad Hoc Beamforming .......... 123
5.5.2 Channel Weight Estimation .......... 124
5.5.3 Channel Selection Algorithms .......... 125
5.6 Chapter Summary .......... 131
Chapter 6 Multi-Speaker Speech Separation .......... 133
6.1 Introduction .......... 133
6.2 Signal Model .......... 134
6.3 Speaker-Dependent Speech Separation Methods .......... 134
6.3.1 Model Matching Methods .......... 134
6.3.2 Voiceprint Feature Methods .......... 139
6.4 Speaker-Independent Speech Separation .......... 142
6.4.1 Deep Clustering Algorithm .......... 143
6.4.2 Permutation Invariant Training Algorithm .......... 146
6.4.3 End-to-End Speech Separation Based on Time-Domain Convolution .......... 148
6.5 Chapter Summary .......... 151
Chapter 7 Speaker Recognition .......... 153
7.1 Introduction .......... 153
7.2 Speaker Verification .......... 155
7.2.1 Fundamentals of Speaker Verification .......... 155
7.2.2 Deep Embedding Speaker Verification Algorithms Based on Classification Loss .......... 159
7.2.3 End-to-End Speaker Verification Algorithms Based on Verification Loss .......... 168
7.3 Speaker Diarization .......... 173
7.3.1 Fundamentals of Speaker Diarization .......... 174
7.3.2 Stage-Wise Speaker Diarization .......... 176
7.3.3 End-to-End Speaker Diarization Algorithms .......... 180
7.4 Robust Speaker Recognition .......... 183
7.4.1 Noise-Robust Speaker Recognition with an Enhancement Front End .......... 183
7.4.2 Robust Speaker Recognition Based on Unsupervised Domain Adaptation .......... 185
7.5 Chapter Summary .......... 188
Chapter 8 Speech Recognition .......... 191
8.1 Introduction .......... 191
8.2 Fundamentals of Speech Recognition .......... 193
8.2.1 Signal Model .......... 193
8.2.2 Evaluation Metrics .......... 193
8.3 End-to-End Speech Recognition .......... 194
8.3.1 Connectionist Temporal Classification Models .......... 194
8.3.2 Attention-Based Models .......... 203
8.4 Noise-Robust Methods for Speech Recognition .......... 206
8.5 Speaker Adaptation .......... 210
8.5.1 Speaker Adaptive Training .......... 210
8.5.2 Test-Time Adaptation .......... 214
8.6 Chapter Summary .......... 220
References .......... 221