Environment
- Identify 456 unique words from the latest list of Amazon Alexa commands [link]
- Identify 441 unique words from the latest list of Google Assistant commands [link]
- Google TTS [link]
- Amazon Alexa [link]
- Google STT [link]
Experiment Procedure
- Step 1. Randomly select 50 CVC-contained and 50 CVC-free words
- Step 2. Generate audio files at different playback speeds (1.0x – 3.0x) using Google TTS. Each audio file is named [word]_waveNetM_[speed].mp3, e.g., everyone_waveNetM_2.0.mp3
- Step 3. Transmit the corresponding audio files over the wire to the target ASR (i.e., Amazon Alexa or Google STT)
- Step 4. Compute pass rate = number of non-empty ASR recognition results / total number of tested words
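
The file naming in Step 2 and the pass rate in Step 4 can be sketched as follows. This is a minimal illustration, not the code used in the experiment: the function names `audio_filename` and `pass_rate` are hypothetical, and treating a whitespace-only transcript as empty is an assumption.

```python
def audio_filename(word: str, speed: float, voice: str = "waveNetM") -> str:
    """Build a file name in the [word]_waveNetM_[speed].mp3 format from Step 2."""
    # One decimal place matches the example everyone_waveNetM_2.0.mp3.
    return f"{word}_{voice}_{speed:.1f}.mp3"


def pass_rate(transcripts: list[str]) -> float:
    """Fraction of tested words for which the ASR returned a non-empty result.

    `transcripts` holds one ASR output per tested word. Assumption: a
    transcript that is empty or only whitespace counts as a failed recognition.
    """
    if not transcripts:
        raise ValueError("at least one tested word is required")
    non_empty = sum(1 for text in transcripts if text and text.strip())
    return non_empty / len(transcripts)


if __name__ == "__main__":
    print(audio_filename("everyone", 2.0))
    # Four tested words, two of which produced a non-empty recognition.
    print(pass_rate(["everyone", "", "  ", "play"]))
```

Note that the pass rate only checks whether the ASR returned anything, not whether the returned text matches the spoken word.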
Code Used for the Experiment
- Translate words into their phonetic representation
- Identify CVC-contained and CVC-free words via phoneme syllabification
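
As a rough illustration of the second item, the sketch below labels a word as CVC-contained or CVC-free once it has already been converted to phonemes and split into syllables. It rests on several assumptions that this page does not confirm: phonemes are written as ARPAbet symbols with optional stress digits, "CVC" means a consonant-vowel-consonant sequence inside a single syllable, and the helper names are hypothetical. The syllabified inputs in the usage example are hand-written for illustration and are not the output of the project's scripts.

```python
# ARPAbet vowel symbols (stress digits removed). Anything else is treated
# as a consonant in this sketch.
ARPABET_VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}


def cv_pattern(syllable: list[str]) -> str:
    """Map a syllable's phonemes to a string of 'C' and 'V' characters."""
    pattern = []
    for phoneme in syllable:
        base = phoneme.rstrip("012")  # drop ARPAbet stress markers, e.g. AH2 -> AH
        pattern.append("V" if base in ARPABET_VOWELS else "C")
    return "".join(pattern)


def contains_cvc(syllables: list[list[str]]) -> bool:
    """Return True if any syllable has a consonant-vowel-consonant run.

    Assumption: the CVC pattern must occur within one syllable, not
    across a syllable boundary.
    """
    return any("CVC" in cv_pattern(syllable) for syllable in syllables)


if __name__ == "__main__":
    # Hand-syllabified examples (illustrative only).
    everyone = [["EH1", "V"], ["R", "IY0"], ["W", "AH2", "N"]]
    audio = [["AO1"], ["D", "IY0"], ["OW2"]]
    print(contains_cvc(everyone))
    print(contains_cvc(audio))
```

If the paper defines CVC differently (for example, allowing the pattern to span syllable boundaries), only `contains_cvc` would need to change.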
Result
- Figure 3 in the paper presents the pass rates for CVC-contained and CVC-free words