Research on In-Car Speech Recognition Based on One-Dimensional Convolutional Neural Networks
Abstract: Conventional convolutional neural networks (CNNs) have a two-dimensional (2D) architecture, which does not reflect the one-dimensional nature of speech signals well. A one-dimensional (1D) model is therefore proposed for speech recognition. By moving the convolution kernel along the time axis, it retains the correlation across frequency bands while better accommodating the time-varying nature of speech, which in turn improves recognition performance. In-car speech recognition comparison experiments show that the recognition rate of the 1D CNN is about 10% to 20% higher than that of the 2D CNN, and that its generalization performance in noisy environments is also clearly better.
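The difference between the two architectures can be illustrated with a minimal NumPy sketch. It is not the network from the paper: the abstract gives no layer sizes, so the function names, the 6 x 4 feature matrix, and the kernel shapes below are all assumptions chosen for illustration. The 1D kernel covers every frequency band at once and slides only along time, while the 2D kernel covers a patch of bands and slides along both axes.

```python
import numpy as np


def conv1d_over_time(features, kernel):
    """Slide a kernel along the time axis only (hypothetical helper).

    features: (num_frames, num_bands) array, e.g. filter-bank energies.
    kernel:   (kernel_frames, num_bands) array spanning all bands.
    Returns one value per time position: shape (num_frames - kernel_frames + 1,).
    Like most deep learning layers, this is cross-correlation (no kernel flip).
    """
    num_frames, num_bands = features.shape
    kernel_frames, kernel_bands = kernel.shape
    if kernel_bands != num_bands:
        raise ValueError("the 1D kernel must span every frequency band")
    out_len = num_frames - kernel_frames + 1
    out = np.empty(out_len)
    for t in range(out_len):
        # Every band contributes to each output, so band correlation is kept.
        out[t] = np.sum(features[t:t + kernel_frames, :] * kernel)
    return out


def conv2d_time_freq(features, kernel):
    """Slide a kernel along both time and frequency, for comparison."""
    num_frames, num_bands = features.shape
    k_frames, k_bands = kernel.shape
    out = np.empty((num_frames - k_frames + 1, num_bands - k_bands + 1))
    for t in range(out.shape[0]):
        for f in range(out.shape[1]):
            out[t, f] = np.sum(features[t:t + k_frames, f:f + k_bands] * kernel)
    return out


# Toy input: 6 frames x 4 frequency bands (assumed sizes, not from the paper).
features = np.arange(24, dtype=float).reshape(6, 4)
out_1d = conv1d_over_time(features, np.ones((3, 4)))  # shape (4,)
out_2d = conv2d_time_freq(features, np.ones((3, 2)))  # shape (4, 3)
```

In this sketch the 1D output is a pure time sequence, whereas the 2D output keeps a frequency axis whose positions share the same kernel weights.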