Citation: LUO Qing (罗庆), BAO Ya-ping (包亚萍), YU Qiang (俞强). Voice activity detection based on improved speech features and extreme learning machine[J]. Microelectronics & Computer (微电子学与计算机), 2020, 37(3): 37-41.

Voice activity detection based on improved speech features and extreme learning machine

  • Abstract: Voice activity detection (VAD) determines whether speech is present in a given frame of a speech signal. Robust VAD helps improve the automation efficiency of speech applications such as speech enhancement, speaker recognition, and hearing aids. To improve the accuracy and efficiency of VAD at low signal-to-noise ratios (SNR), a new speech feature, low-frequency de-noising energy (LFDE), is proposed and applied to VAD. LFDE is combined with existing acoustic features (Mel-frequency cepstral coefficients and formant frequencies) to train an extreme learning machine (ELM) classifier. Simulation experiments show that both the accuracy and the efficiency of endpoint detection improve.
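As a rough illustration of the classifier stage only, the sketch below trains a generic single-hidden-layer ELM (random, fixed hidden weights; output weights solved by regularised least squares) on per-frame feature vectors. The abstract does not give the LFDE definition, the feature dimensions, or the ELM configuration, so everything here is an assumption: the function names `train_elm` and `predict_elm`, the hidden-layer size, the `tanh` activation, the ridge term, and the synthetic Gaussian data that stands in for concatenated LFDE, MFCC, and formant features.

```python
import numpy as np


def train_elm(X, y, n_hidden=64, reg=1e-3, seed=0):
    """Fit a basic ELM: random fixed hidden layer, ridge-solved output weights."""
    rng = np.random.default_rng(seed)
    # Random input weights and biases; these are drawn once and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden)) / np.sqrt(X.shape[1])
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer activations, shape (n_frames, n_hidden)
    # Output weights from regularised least squares: (H^T H + reg*I) beta = H^T y
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta


def predict_elm(model, X):
    """Label each frame 1 (speech) or 0 (non-speech) by thresholding the output."""
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta > 0.5).astype(int)


# Synthetic stand-in for per-frame feature vectors (NOT real LFDE/MFCC/formant data).
rng = np.random.default_rng(1)
n_frames, n_features = 400, 16  # hypothetical sizes
noise_frames = rng.normal(0.0, 1.0, size=(n_frames, n_features))
speech_frames = rng.normal(1.5, 1.0, size=(n_frames, n_features))
X = np.vstack([noise_frames, speech_frames])
y = np.concatenate([np.zeros(n_frames), np.ones(n_frames)])

model = train_elm(X, y)
accuracy = (predict_elm(model, X) == y).mean()
print(f"training accuracy on synthetic frames: {accuracy:.3f}")
```

Because only the output weights are fitted, training reduces to one linear solve, which is the usual reason ELMs are considered fast to train; whether that accounts for the efficiency gain reported in the paper is not stated in the abstract.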

     
