ZHU Xi-xiang, LIU Feng-shan, ZHANG Chao, LV Zhao, WU Xiao-pei. Research on in-car Speech Recognition Based on One Dimensional Convolutional Neural Networks[J]. Microelectronics & Computer, 2017, 34(11): 21-25.

Research on In-Car Speech Recognition Based on One-Dimensional Convolutional Neural Networks

  • Abstract: Conventional convolutional neural networks (CNNs) have a two-dimensional (2D) architecture, which does not reflect the one-dimensional nature of speech signals well. A one-dimensional (1D) model is therefore proposed for speech recognition. By sliding the convolution kernel along the time axis, the model retains the correlation between frequency bands while better accommodating the time-varying nature of speech, which in turn improves recognition performance. Comparative in-car speech recognition experiments show that the 1D CNN raises the recognition rate by about 10% to 20% over the 2D CNN, and that it also generalizes markedly better in noisy environments.
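The idea of convolving only along the time axis can be illustrated with a minimal sketch. It assumes the input is a sequence of frames, each holding one value per frequency band, and that each kernel covers all bands at once and slides over frames only. The abstract does not give the paper's layer sizes or feature type, so the function name `conv1d_time`, the toy numbers, and the kernel shape are hypothetical and serve only as an illustration.

```python
def conv1d_time(features, kernel):
    """Valid 1D convolution (cross-correlation) along the time axis.

    features: list of T frames, each a list of F band values.
    kernel:   list of K rows, each a list of F weights, so the kernel
              spans every frequency band (an assumption of this sketch).
    Returns a list of T - K + 1 scalar activations, one per time position.
    """
    num_frames = len(features)
    width = len(kernel)
    outputs = []
    for t in range(num_frames - width + 1):
        total = 0.0
        for k in range(width):
            frame = features[t + k]
            weights = kernel[k]
            # All bands of one frame are combined in a single step,
            # so the kernel never slides along the frequency axis.
            total += sum(x * w for x, w in zip(frame, weights))
        outputs.append(total)
    return outputs


# Toy input: 5 frames x 3 frequency bands (made-up numbers).
features = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 0.0],
    [2.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 2.0, 1.0],
]
# Hypothetical kernel covering 2 frames and all 3 bands.
kernel = [
    [1.0, 1.0, 1.0],
    [0.5, 0.5, 0.5],
]
activations = conv1d_time(features, kernel)
print(activations)  # one activation per time position
```

A 2D kernel would, by contrast, cover only a few neighbouring bands and slide along both axes; in this sketch the output varies with time alone while each activation still mixes information from every band.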

     
