Pronunciation Variation Modeling for Dutch Automatic Speech Recognition


This article describes how the performance of a Dutch continuous speech recognizer was improved by modeling pronunciation variation. We propose a general procedure for modeling pronunciation variation. In short, it consists of adding pronunciation variants to the lexicon, retraining phone models and using language models to which the pronunciation variants have been added. First, within-word pronunciation variants were generated by applying a set of five optional phonological rules to the words in the baseline lexicon. Next, a limited number of cross-word processes were modeled, using two different methods. In the first approach, cross-word processes were modeled by directly adding the cross-word variants to the lexicon, and in the second approach this was done by using multi-words. Finally, the combination of the within-word method with the two cross-word methods was tested. The word error rate (WER) measured for the baseline system was 12.75%. Compared to the baseline, a small but statistically significant improvement of 0.68% in WER was measured for the within-word method, whereas both cross-word methods in isolation led to small, non-significant improvements. The combination of the within-word method and cross-word method 2 led to the best result: an absolute improvement of 1.12% in WER was found compared to the baseline, which is a relative improvement of 8.8% in WER. © 1999 Elsevier Science B.V. All rights reserved.
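
The relation between the absolute and relative WER improvements reported above can be checked with a short calculation. The following is a minimal sketch, not part of the original article: the function name `relative_improvement` and the variable names are illustrative assumptions, and only the three input figures (12.75%, 0.68%, 1.12%) are taken from the abstract. The resulting WER values and the relative figure for the within-word method are derived here, not reported in the abstract.

```python
def relative_improvement(baseline_wer: float, absolute_gain: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are expressed in percentage points of WER.
    """
    return 100.0 * absolute_gain / baseline_wer


# Figures taken from the abstract (all in % WER).
baseline_wer = 12.75       # baseline system
within_word_gain = 0.68    # absolute improvement, within-word method
combined_gain = 1.12       # absolute improvement, within-word + cross-word method 2

# Derived values (assumption: improvement = baseline WER minus new WER).
within_word_wer = baseline_wer - within_word_gain
combined_wer = baseline_wer - combined_gain
within_word_rel = relative_improvement(baseline_wer, within_word_gain)
combined_rel = relative_improvement(baseline_wer, combined_gain)

print(f"within-word: {within_word_wer:.2f}% WER, {within_word_rel:.1f}% relative")
print(f"combined:    {combined_wer:.2f}% WER, {combined_rel:.1f}% relative")
```

Under these assumptions, the combined method corresponds to a WER of 11.63%, and dividing 1.12 by 12.75 reproduces the 8.8% relative improvement stated in the abstract.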


