
adaptive-speech-recognizer

nfreear · MIT · 2.2.0

Adaptive dictation-mode speech recognizer ponyfill compatible with WebChat that gives the user time to think and stutter/stammer.

Keywords: microsoft, cognitiveservices, cognitive services, speech

readme


adaptive-speech-recognizer

An adaptive dictation-mode speech recognizer ponyfill compatible with WebChat that gives the user time to think and stutter (stammer)!

Mastering 'endSilenceTimeoutMs' in Microsoft Speech SDK dictation mode!

(08-Oct-2020)

Basic usage

import 'microsoft-cognitiveservices-speech-sdk';
import createAdaptiveRecognizerPonyfill from 'adaptive-speech-recognizer';

const ponyfill = createAdaptiveRecognizerPonyfill({
  subscriptionKey,
  region,
  endSilenceTimeoutMs
});

const recognizer = new ponyfill.SpeechRecognition();
recognizer.start();
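The recognizer is aligned with the Web Speech API `SpeechRecognition` class (see changelog #1), so results should arrive via the standard `result` event. A minimal sketch, assuming the standard `SpeechRecognitionEvent` shape — the `latestTranscript` helper is illustrative, not part of the package:

```javascript
// Illustrative helper (not part of the package API): extract the most
// recent transcript from a Web Speech API style 'result' event.
function latestTranscript (event) {
  const result = event.results[ event.results.length - 1 ];
  return {
    transcript: result[ 0 ].transcript,  // Best (first) alternative.
    isFinal: Boolean(result.isFinal)     // True once the end-silence timeout has elapsed.
  };
}

// Usage, assuming the ponyfill mirrors the Web Speech API events:
// recognizer.onresult = (event) => console.log(latestTranscript(event));
```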

Ponyfill

See Integrating with Cognitive Services Speech Services.

import { createAdaptiveRecognizerPonyfill } from 'adaptive-speech-recognizer';

const asrPonyfill = await createAdaptiveRecognizerPonyfill({ region, subscriptionKey });

// ... Combine speech synthesis from default
// 'createCognitiveServicesSpeechServicesPonyfillFactory()' ...

renderWebChat(
  {
    directLine: createDirectLine({ ... }),
    // ...
    webSpeechPonyfillFactory: await createCustomHybridPonyfill({ ... })
  },
  document.getElementById('webchat')
);
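Note that `createCustomHybridPonyfill` is not exported by this package. One plausible shape — a sketch with illustrative names — merges `SpeechRecognition` from this package's ponyfill with `SpeechSynthesis` from the stock `web-speech-cognitive-services` factory:

```javascript
// Illustrative sketch (not part of the package API): combine recognition
// from this package with synthesis from the default Cognitive Services
// ponyfill factory. Both arguments are assumed to be created beforehand,
// e.g. via 'createAdaptiveRecognizerPonyfill({ region, subscriptionKey })'
// and 'createCognitiveServicesSpeechServicesPonyfillFactory({ ... })'.
function createHybridPonyfillFactory ({ asrPonyfill, ttsPonyfillFactory }) {
  return (webChatArgs) => ({
    ...ttsPonyfillFactory(webChatArgs),               // SpeechSynthesis, SpeechSynthesisUtterance, ...
    SpeechRecognition: asrPonyfill.SpeechRecognition  // Override with the adaptive recognizer.
  });
}
```

The spread order matters: listing `SpeechRecognition` last ensures the adaptive recognizer wins over the stock one returned by the synthesis factory.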

Dictation mode

The key lines in createCognitiveRecognizer that force dictation mode and enable setting initialSilenceTimeoutMs and endSilenceTimeoutMs:

const initialSilenceTimeoutMs = 5 * 1000;
const endSilenceTimeoutMs = 5 * 1000;
// Scroll to right! → →
const url = `wss://${region}.stt.speech.microsoft.com/speech/recognition/dictation/cognitiveservices/v1?initialSilenceTimeoutMs=${initialSilenceTimeoutMs || ''}&endSilenceTimeoutMs=${endSilenceTimeoutMs}&`;
const urlObj = new URL(url);

const speechConfig = SpeechConfig.fromEndpoint(urlObj, subscriptionKey);

speechConfig.enableDictation();

// ...

const recognizer = new SpeechRecognizer(speechConfig, audioConfig);
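The endpoint construction above can be factored into a small, testable helper. A sketch — the helper name is illustrative, not part of the package:

```javascript
// Illustrative helper: build the dictation-mode WebSocket endpoint with
// the silence timeouts as query parameters (mirrors the snippet above).
function buildDictationEndpoint (region, initialSilenceTimeoutMs, endSilenceTimeoutMs) {
  const url = new URL(
    `wss://${region}.stt.speech.microsoft.com/speech/recognition/dictation/cognitiveservices/v1`
  );
  if (initialSilenceTimeoutMs) {
    url.searchParams.set('initialSilenceTimeoutMs', String(initialSilenceTimeoutMs));
  }
  url.searchParams.set('endSilenceTimeoutMs', String(endSilenceTimeoutMs));
  return url; // Pass to SpeechConfig.fromEndpoint(url, subscriptionKey).
}
```

Using the WHATWG `URL` and `searchParams` avoids the stray trailing `&` that manual string concatenation leaves behind.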

Usage

npm install
npm start
npm test

Useful links

Credit

Developed in IET at The Open University for the ADMINS project, funded by Microsoft.


"ADMINS in IET: Assistants to the Disclosure and Management of Information about Needs and Support"

"Microsoft 'AI for Accessibility' projects, including ADMINS"

"PAS 901:2025 Vocal accessibility in system design. Code of practice"

"DOI: 10.3403/30458829"

"'createCognitiveRecognizer()' function, lines 527-540"

"Minimum/Maximum values for InitialSilence and EndSilence timeouts for java SDK (#502) (2020)"

changelog

Release Notes

Unreleased

  • (Date: ~ February-2021)
  • Adaptive recognition - personalize the endSilenceTimeoutMs mid-session enhancement (#13)
  • Fix: upgrade Speech SDK from v1.13.1 to v1.15.1 (#14)

Version 2.0.0-Beta

  • Date: 23-November-2020.
  • Adaptive recognition — optionally, set a dictionary of "stop phrases" via a DOM event (#3)
  • Adaptive recognition — optionally, set an initial dictionary of "stop phrases" (#3)

Version 1.1.0-Beta

  • Date: 21-November-2020.
  • Fix support for Safari — getAudioConfig (#10)
  • Fix: downgrade to Speech SDK v1.13.1 — bug in v1.14.0 (#8)
  • Fix text normalization — remove full-stop (#11)

Version 1.0.0-Beta

Dictation-based speech recognizer ponyfill, compatible with WebChat.js

  • Date: 23-October-2020.
  • Version 1.0.0-Beta, as integrated in 'ADMINS-Bot' (23-October-2020)
  • Refactor, influenced by compulim/web-speech-cognitive-services ~ createSpeechRecognitionPonyfill (#2).
  • Align DictationRecognizer class with Web API SpeechRecognition class (#1).