Microsoft's FastSpeech Vastly Improves Text-to-Speech Technology
Current text-to-speech models, such as the one used in Cortana, can only create snippets of humanlike voice. They also have limitations, such as skipping words in synthesized speech. That's because current models are slow to generate mel-spectrograms. If you're unfamiliar with the term, a mel-spectrogram is a representation of a sound's power across frequencies over time, with the frequencies spaced on the perceptual mel scale. Microsoft's FastSpeech aims to improve mel-spectrogram generation performance. Described in a paper, “FastSpeech: Fast, Robust and Controllable Text to Speech”, the technology boasts a specialized architecture....
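For readers who want to see what a mel-spectrogram is in concrete terms, here is a minimal sketch in Python using only NumPy. It is not FastSpeech's code and says nothing about how FastSpeech generates its output. The function names and the frame size, hop size, and number of mel bands are illustrative assumptions, since the article gives no such details.

```python
import numpy as np

def hz_to_mel(hz):
    # A common formula for converting hertz to the mel scale.
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sample_rate, n_fft, n_mels):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mel_spectrogram(signal, sample_rate, n_fft=1024, hop=256, n_mels=80):
    # Illustrative defaults (assumptions, not values from the article).
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Power spectrum of each windowed frame: shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Pool the power into mel bands: shape (n_mels, n_frames).
    return mel_filterbank(sample_rate, n_fft, n_mels) @ power.T

# Usage: one second of a 440 Hz tone sampled at 16 kHz.
tone = np.sin(2 * np.pi * 440.0 * np.arange(16000) / 16000.0)
mel = mel_spectrogram(tone, 16000)
print(mel.shape)
```

Each column of the result describes one short slice of time and each row one mel band, so a speech model that predicts this grid is in effect predicting how loud each band should be at each moment.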