
 2022-02-11 06:02


Abstract





Research on Emotional Speaker Recognition Technology


Speaker recognition is an identification technology that uses features of the speech signal. With the rapid development of biometric identification such as fingerprint recognition, speaker recognition has broad application prospects. Traditional speaker recognition assumes neutral utterances in both training and testing. In practice, however, speakers are easily affected by external factors, and the resulting emotional states alter their speech and degrade system performance. Speaker recognition under these conditions is called emotional speaker recognition.

After reviewing the common techniques of traditional speaker recognition, we examine the definition and classification of emotion, both of which remain disputed in academic circles, and adopt a definition of “emotion” suited to this thesis. Building on these two pieces of work, we investigate how emotion in speech affects the fundamental frequency, the spectrum, and the pattern recognition of speech signals. Experiments show how emotion specifically affects a speaker recognition system.

Finally, we propose a score selection method to mitigate the influence of emotion. Based on the idea of emotion shielding, the method suits cases where the test utterance mixes neutral and emotional speech. It shields as much of the more emotional part of the test utterance as possible, lowering the emotional ratio and thereby improving the performance of the emotional speaker recognition system.
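The idea of score selection can be illustrated with a minimal sketch. It is not the thesis's actual procedure: it assumes that the system produces one score per frame of the test utterance, that frames carrying strong emotion tend to score lower against a model trained on neutral speech, and that a fixed threshold decides which frames are kept. The function name `select_scores` and the example values are hypothetical.

```python
from statistics import mean


def select_scores(frame_scores, threshold):
    """Average only the frame scores at or above `threshold`.

    Assumption: low-scoring frames are the "more emotional" ones,
    so discarding them lowers the emotional ratio of the utterance.
    """
    kept = [s for s in frame_scores if s >= threshold]
    # Fall back to all frames if the threshold rejects everything,
    # so an utterance-level score can always be computed.
    if not kept:
        kept = list(frame_scores)
    return mean(kept)


# Hypothetical per-frame scores: two neutral-like, two emotional-like.
scores = [1.0, 0.5, -2.0, -3.0]
baseline = mean(scores)                          # plain average over all frames
selected = select_scores(scores, threshold=0.0)  # average over kept frames only
```

In this toy case the two low-scoring frames are shielded, so the utterance-level score is computed from the neutral-like frames alone. How the threshold is chosen would have to come from experiments on real data.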

Keywords: Speaker Recognition, Emotional Utterance, Emotional Pattern, Pattern Recognition, Utterance Base, Emotion Shielding

Contents

Abstract (Chinese)

Abstract (English)

Chapter 1 Introduction

1.1 Foreword

1.2 Overview of Emotional Speaker Recognition

1.3 Common Approaches to Emotional Speaker Recognition

1.4 Main Research Content of This Thesis

Chapter 2 Speaker Recognition Systems

2.1 Overview

2.2 Mel-Frequency Cepstral Coefficients

2.3 Gaussian Mixture Model (GMM)

2.3.1 Gaussian Mixture Model

2.3.2 Speaker Recognition with the GMM-UBM Structure

2.4 Performance Evaluation Criteria for Speaker Recognition

2.4.1 False Acceptance Rate and False Rejection Rate

2.4.2 Equal Error Rate and DET Plot

2.5 Chapter Summary

Chapter 3 Emotional Factors in Speech Signals

3.1 Definition of Emotion

3.2 Classification of Emotion

3.3 Building an Emotional Speech Database

3.4 Influence of Emotional Factors on Speaker Features

3.4.1 Influence of Emotional Factors on Fundamental Frequency

3.4.2 Influence of Emotional Factors on the Spectrum

3.4.3 Influence of Emotional Factors on System Performance

3.5 Chapter Summary

Chapter 4 Emotional Speaker Recognition Based on the Score Selection Method

4.1 Mixing Neutral and Emotional Speech

4.2 Relationship Between Emotional Ratio and System Performance

4.2.1 Relationship Between Emotional Ratio and Speaker Recognition Accuracy

4.2.2 Distribution of Computed Scores

4.2.3 Experimental Summary

4.3 Introduction to the Score Selection Method

4.3.1 Designing the Score Selection Steps

4.3.2 Effect of the Threshold on Performance Improvement

4.4 Chapter Summary

Chapter 5 Summary and Outlook

5.1 Summary

5.2 Outlook

References

Chapter 1 Introduction


