The openness of the web allows it to embrace and extend technologies that were previously only available in specialized commercial applications. One of the best examples of this is the new support of speech in browsers, in the form of both speech recognition and speech generation as native APIs; in this article, I’ll concentrate on the latter.
The Speech Synthesis API is supported in Chrome and Safari (both desktop and mobile), Opera and Firefox (with the correct flag enabled); it is currently listed as Under Consideration for Microsoft Edge.
His Master’s Voice
The Speech Synthesis API can be demonstrated with just two lines of JavaScript:
var dialogue = new SpeechSynthesisUtterance("Dave, I can see you’re upset");
window.speechSynthesis.speak(dialogue);
Of course, that’s not a terribly practical example. But if we have some text on a page in an <article> tag:
<article>
<p>The next time you see the full moon high in the south, look carefully
at its right-hand edge and let your eye travel upward along the curve
of the disk…</p>
</article>
We can get a reference to this easily with JavaScript, stripping markup from the text using textContent:
var sampleText = document.getElementsByTagName("article")[0].textContent;
Since the Speech Synthesis API is not yet available on every platform, it would be wise to check for support before attempting to use it. We can do so by checking for its presence in the window object:
if ('speechSynthesis' in window) {
    var bodyText = new SpeechSynthesisUtterance(sampleText);
    window.speechSynthesis.speak(bodyText);
}
Uses and Variants
Because the Speech Synthesis API may use a web-based service by default (local services are also a possibility for offline use), it can take a while for it to recognize a change, meaning that the browser may continue to babble even after the tab has been closed. (This is one of the reasons I’ve limited recitation to the first paragraph of this introductory article.)
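One way to reduce the chance of the browser talking after the reader has moved on is to cancel any queued speech when the page is hidden or unloaded. What follows is a minimal sketch rather than a guaranteed fix: the stopSpeechOnExit helper and its scope parameter are names I've made up for illustration, while speechSynthesis.cancel() and the pagehide event are standard browser features.

```javascript
// Hypothetical helper: cancel queued and in-progress speech when the page goes away.
// `scope` defaults to the global object (window in a browser); passing it in
// explicitly is an assumption made here so the function can be tried with a stub.
function stopSpeechOnExit(scope = globalThis) {
  if (!("speechSynthesis" in scope)) {
    return false; // API not available, so there is nothing to register
  }
  scope.addEventListener("pagehide", function () {
    // cancel() empties the utterance queue and stops the utterance being spoken
    scope.speechSynthesis.cancel();
  });
  return true;
}

// In a browser, call once after the page loads:
// stopSpeechOnExit();
```

The function returns true when the listener was registered and false when the API is missing, so it can sit alongside the same feature check used earlier.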
Naturally, the Speech Synthesis API provides the ability to pause recitation, together with a whole host of options, including the ability to pronounce content in different kinds of voices, at different rates and pitches, and with the pronunciation rules of a different language. (The API does not translate text itself; the language setting only affects how the text is read aloud.)
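As a rough sketch of those options: an utterance exposes rate, pitch, lang and voice properties, and speechSynthesis offers pause(), resume() and getVoices(). The helper names buildUtterance and togglePause, the options object, and the injected scope parameter are all hypothetical, chosen for illustration; only the properties and methods just listed come from the API. Note that lang selects how text is pronounced and does not translate it.

```javascript
// Hypothetical helper: create an utterance and apply optional settings.
// `scope` defaults to the global object (window in a browser).
function buildUtterance(text, options = {}, scope = globalThis) {
  const utterance = new scope.SpeechSynthesisUtterance(text);
  if (options.rate !== undefined) utterance.rate = options.rate;    // 1 is normal speed
  if (options.pitch !== undefined) utterance.pitch = options.pitch; // 1 is normal pitch
  if (options.lang) {
    utterance.lang = options.lang; // BCP 47 language tag such as "en-GB"
    // Use the first installed voice whose language matches exactly, if there is one.
    // getVoices() can return an empty list until the browser has loaded its voices.
    const match = scope.speechSynthesis.getVoices().find(function (voice) {
      return voice.lang === options.lang;
    });
    if (match) utterance.voice = match;
  }
  return utterance;
}

// Hypothetical helper: resume if paused, otherwise pause if something is being spoken.
// `paused` is checked first because `speaking` stays true while speech is paused.
function togglePause(synth) {
  if (synth.paused) {
    synth.resume();
  } else if (synth.speaking) {
    synth.pause();
  }
}

// Browser usage, for example from a button's click handler:
// var slowRead = buildUtterance(sampleText, { lang: "en-GB", rate: 0.9 });
// window.speechSynthesis.speak(slowRead);
// togglePause(window.speechSynthesis);
```

Which voices are available depends on the browser and operating system, so the voice lookup is written to fall back silently to the default voice when nothing matches.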
Conclusion
While users who require accessibility affordances will likely have text-to-speech software, the existence of the API in modern browsers provides many more options for your site visitors. I’ll be exploring some of those possibilities in the next article.
Enjoy this piece? I invite you to follow me at twitter.com/dudleystorey to learn more.