DUAN Y,SHAO Y B,LIU J,et al. A language identification method based on normalization of pitch frequency[J]. Microelectronics & Computer,2023,40(5):20-28. doi: 10.19304/J.ISSN1000-7180.2022.0398

A language identification method based on normalization of pitch frequency

  • Speaker-specific pronunciation characteristics interfere with language identification and degrade recognition performance. To address this problem, a method for normalizing the fundamental frequency (pitch) of speech is proposed. First, endpoint detection separates speech segments from non-speech segments; the fundamental frequency is extracted from the speech segments and normalized to generate glottal pulses. Next, the vocal tract response is extracted, and the pitch-normalized speech is reconstructed by passing the pulses through an all-pole filter. Finally, low-level acoustic features are extracted from the reconstructed speech and passed to a ResNet back end for language identification. Experimental results show that the proposed method reduces the influence of speaker pronunciation characteristics on language-discriminative features and improves language identification performance. With gray-scale speech spectrograms it reaches a recognition rate of 94.3%. For traditional low-level acoustic features such as MFCC and GFCC, as well as the improved time-domain GF features, it raises the recognition rate by 3% to 4%.
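
The front-end steps described in the abstract (endpoint decision, pitch extraction, pitch normalization, pulse generation, all-pole reconstruction) can be illustrated with a minimal per-frame sketch. The abstract does not specify the individual algorithms, so everything concrete here is an assumption chosen for illustration: an energy threshold for the endpoint decision, an autocorrelation pitch estimator, LPC via the Levinson-Durbin recursion as the all-pole model, and a fixed target pitch of 120 Hz. All function names are hypothetical and this is not the authors' implementation.

```python
import math

def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a list of samples."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def is_speech(frame, threshold=1e-4):
    """Crude energy-based endpoint decision (assumed; not the paper's detector)."""
    return sum(s * s for s in frame) / len(frame) > threshold

def estimate_f0(frame, fs, fmin=60.0, fmax=400.0):
    """Pick the autocorrelation peak inside the plausible pitch-lag range."""
    lo, hi = int(fs / fmax), int(fs / fmin)
    r = autocorr(frame, hi)
    best_lag = max(range(lo, hi + 1), key=lambda k: r[k])
    return fs / best_lag

def levinson(r, order):
    """Levinson-Durbin recursion; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # remaining prediction error
    return a, err

def pulse_train(n, fs, f0):
    """Unit impulses spaced one pitch period apart."""
    period = max(1, round(fs / f0))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def all_pole_filter(a, excitation, gain=1.0):
    """y[n] = gain * e[n] - sum_k a[k] * y[n - k], for k = 1..p."""
    y = []
    for n, e in enumerate(excitation):
        acc = gain * e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def normalize_frame(frame, fs, target_f0=120.0, order=4):
    """Resynthesize a speech frame at a fixed pitch; pass non-speech through."""
    if not is_speech(frame):
        return list(frame)
    a, err = levinson(autocorr(frame, order), order)   # vocal tract estimate
    gain = math.sqrt(max(err, 0.0))
    excitation = pulse_train(len(frame), fs, target_f0)
    return all_pole_filter(a, excitation, gain)
```

For example, a frame synthesized at 200 Hz can be passed through `normalize_frame(frame, 8000)` and the result checked with `estimate_f0`, which should then report a pitch close to the 120 Hz target. A fixed target is only one possible normalization; a real system would also need overlapping windowed frames and separate handling of unvoiced speech, which this sketch omits.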
