Table of Contents

Chapter 1 Introduction ..... 1

Chapter 2 Deep Learning Fundamentals ..... 5

2.1 Supervised Learning ..... 5

2.2 Single-Layer Neural Networks ..... 6

2.2.1 Basic Model ..... 6

2.2.2 Activation Functions ..... 7

2.3 Feedforward Deep Neural Networks ..... 10

2.3.1 Backpropagation Algorithm ..... 11

2.3.2 Regularization ..... 15

2.4 Recurrent Neural Networks ..... 17

2.4.1 Fundamentals of Recurrent Neural Networks ..... 17

2.4.2 Long Short-Term Memory Networks ..... 20

2.4.3 Gated Recurrent Neural Networks ..... 22

2.4.4 Deep RNN Architectures ..... 23

2.4.5 RNN Modeling Frameworks for Sequence Data ..... 25

2.5 Convolutional Neural Networks ..... 26

2.5.1 Fundamentals of Convolutional Neural Networks ..... 27

2.5.2 Other Forms of Convolution ..... 31

2.5.3 Residual Neural Networks ..... 35

2.5.4 Temporal Convolutional Networks ..... 37

2.6 Normalization in Neural Networks ..... 39

2.6.1 Batch Normalization ..... 39

2.6.2 Layer Normalization ..... 41

2.7 Attention Mechanisms in Neural Networks ..... 42

2.7.1 Encoder-Decoder Framework ..... 42

2.7.2 Encoder-Attention-Decoder Framework ..... 44

2.7.3 Monotonic Attention Mechanism ..... 46

2.7.4 Transformer ..... 47

2.8 Generative Adversarial Networks ..... 48

2.8.1 Basic Structure ..... 49

2.8.2 Model Training ..... 51

2.9 Chapter Summary ..... 52

Chapter 3 Voice Activity Detection ..... 53

3.1 Introduction ..... 53

3.2 Preliminaries ..... 54

3.2.1 Signal Model ..... 54

3.2.2 Evaluation Metrics ..... 55

3.3 Voice Activity Detection Models ..... 57

3.3.1 Basic Framework of Voice Activity Detection Models ..... 57

3.3.2 Voice Activity Detection Based on Deep Belief Networks ..... 58

3.3.3 Voice Activity Detection Based on Denoising Deep Neural Networks ..... 61

3.3.4 Voice Activity Detection Model Framework Based on Multi-Resolution Stacking ..... 63

3.4 Loss Functions for Voice Activity Detection Models ..... 65

3.4.1 Minimizing Cross-Entropy ..... 66

3.4.2 Minimum Mean Squared Error ..... 66

3.4.3 Maximizing the Area Under the ROC Curve ..... 66

3.5 Acoustic Features for Voice Activity Detection ..... 69

3.5.1 Frequency Band Selection for the Short-Time Fourier Transform ..... 69

3.5.2 Multi-Resolution Cochleagram Features ..... 70

3.6 Generalization Ability of the Model ..... 72

3.7 Chapter Summary ..... 73

Chapter 4 Single-Channel Speech Enhancement ..... 75

4.1 Introduction ..... 75

4.2 Preliminaries ..... 77

4.2.1 Signal Model ..... 77

4.2.2 Evaluation Metrics ..... 79

4.3 Frequency-Domain Speech Enhancement ..... 81

4.3.1 Algorithm Framework ..... 81

4.3.2 Training Targets ..... 82

4.3.3 Speech Enhancement Models ..... 89

4.3.4 Speech Dereverberation Models ..... 93

4.4 Time-Domain Speech Enhancement ..... 100

4.4.1 Key Issues ..... 101

4.4.2 Convolutional Models ..... 102

4.4.3 Loss Functions ..... 104

4.5 Chapter Summary ..... 106

Chapter 5 Multichannel Speech Enhancement ..... 107

5.1 Introduction ..... 107

5.2 Signal Model ..... 108

5.3 Spatial Feature Extraction Methods ..... 109

5.3.1 Spatial Features ..... 109

5.3.2 Deep Models ..... 111

5.4 Beamforming Methods ..... 113

5.4.1 Adaptive Beamformers ..... 114

5.4.2 Noise Estimation ..... 116

5.4.3 Neural-Network-Based Beamforming Methods ..... 117

5.5 Ad-Hoc Microphone Array Methods ..... 121

5.5.1 Deep Ad-Hoc Beamforming ..... 123

5.5.2 Channel Weight Estimation ..... 124

5.5.3 Channel Selection Algorithms ..... 125

5.6 Chapter Summary ..... 131

Chapter 6 Multi-Speaker Speech Separation ..... 133

6.1 Introduction ..... 133

6.2 Signal Model ..... 134

6.3 Speaker-Dependent Speech Separation Methods ..... 134

6.3.1 Model Matching Methods ..... 134

6.3.2 Voiceprint Feature Methods ..... 139

6.4 Speaker-Independent Speech Separation ..... 142

6.4.1 Deep Clustering Algorithm ..... 143

6.4.2 Permutation Invariant Training Algorithm ..... 146

6.4.3 End-to-End Speech Separation Algorithms Based on Time-Domain Convolution ..... 148

6.5 Chapter Summary ..... 151

Chapter 7 Speaker Recognition ..... 153

7.1 Introduction ..... 153

7.2 Speaker Verification ..... 155

7.2.1 Fundamentals of Speaker Verification ..... 155

7.2.2 Deep Embedding Speaker Verification Algorithms Based on Classification Loss ..... 159

7.2.3 End-to-End Speaker Verification Algorithms Based on Verification Loss ..... 168

7.3 Speaker Diarization ..... 173

7.3.1 Fundamentals of Speaker Diarization ..... 174

7.3.2 Multi-Stage Speaker Diarization ..... 176

7.3.3 End-to-End Speaker Diarization Algorithms ..... 180

7.4 Robust Speaker Recognition ..... 183

7.4.1 Noise-Robust Speaker Recognition with an Enhancement Front End ..... 183

7.4.2 Robust Speaker Recognition Based on Unsupervised Domain Adaptation ..... 185

7.5 Chapter Summary ..... 188

Chapter 8 Speech Recognition ..... 191

8.1 Introduction ..... 191

8.2 Fundamentals of Speech Recognition ..... 193

8.2.1 Signal Model ..... 193

8.2.2 Evaluation Metrics ..... 193

8.3 End-to-End Speech Recognition ..... 194

8.3.1 Connectionist Temporal Classification Models ..... 194

8.3.2 Attention-Based Models ..... 203

8.4 Noise-Robust Methods for Speech Recognition ..... 206

8.5 Speaker Adaptation ..... 210

8.5.1 Speaker Adaptive Training ..... 210

8.5.2 Test-Time Adaptation ..... 214

8.6 Chapter Summary ..... 220

References ..... 221