Integrating time alignment and neural networks for high performance continuous speech recognition

Patrick Haffner; M. Franzini; Alexander Waibel


Abstract: The authors describe two systems in which neural network classifiers are merged with dynamic programming (DP) time alignment methods to produce high-performance continuous speech recognizers. One system uses the connectionist Viterbi-training (CVT) procedure, in which a neural network with frame-level outputs is trained using guidance from a time alignment procedure. The other system uses multi-state time-delay neural networks (MS-TDNNs), in which embedded DP time alignment allows network training with only word-level external supervision. The CVT results on the TI Digits are 99.1% word accuracy and 98.0% string accuracy. The MS-TDNNs are described in detail, with attention focused on their architecture, the training procedure, and results of applying the MS-TDNNs to continuous speaker-dependent alphabet recognition: on two speakers, word accuracy is 97.5% and 89.7%, respectively.
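The DP time alignment step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic Viterbi forced alignment that assumes per-frame log scores are already available (for example, from a frame-level network) and that the target is a strict left-to-right state sequence with only "stay" and "advance" transitions. The function name `viterbi_align` and its arguments are hypothetical.

```python
import math


def viterbi_align(log_scores, state_seq):
    """Align T frames to a left-to-right sequence of S states.

    log_scores: list of T rows, each a list of per-class log scores
                (assumed to come from a frame-level classifier).
    state_seq:  list of S class indices that must be visited in order.
    Returns (path, total), where path[t] is the index into state_seq
    assigned to frame t and total is the summed log score of that path.
    """
    T, S = len(log_scores), len(state_seq)
    if T < S:
        raise ValueError("need at least one frame per state")

    neg_inf = -math.inf
    best = [[neg_inf] * S for _ in range(T)]  # best[t][s]: best score ending in s at t
    back = [[0] * S for _ in range(T)]        # back[t][s]: predecessor state index

    # Every path must start in the first state.
    best[0][0] = log_scores[0][state_seq[0]]

    for t in range(1, T):
        for s in range(S):
            stay = best[t - 1][s]
            move = best[t - 1][s - 1] if s > 0 else neg_inf
            if move > stay:
                prev, prev_score = s - 1, move
            else:
                prev, prev_score = s, stay
            best[t][s] = prev_score + log_scores[t][state_seq[s]]
            back[t][s] = prev

    # Every path must end in the last state; trace it back to frame 0.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[T - 1][S - 1]
```

In a Viterbi-style training loop of the kind the abstract describes, the frame labels `[state_seq[s] for s in path]` could serve as frame-level targets for the next round of network training; that usage is an assumption of this sketch rather than a detail given in the record.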

Year of publication: 1991

Authors: Patrick Haffner, M. Franzini, Alexander Waibel

Keywords: Speech Recognition and Synthesis, Speech and Audio Processing, Neural Networks and Applications

