Use the chrome.experimental.tts module to play synthesized text-to-speech (TTS) from your extension or packaged app, or to register as a speech provider for other extensions and packaged apps that want to speak.
Give us feedback: If you have suggestions, especially changes that should be made before stabilizing the first version of this API, please send your ideas to the chromium-extensions group.
To enable this experimental API, visit chrome://flags and enable Experimental Extension APIs.
Chrome provides native support for speech on Windows (using SAPI 5), Mac OS X, and Chrome OS, using speech synthesis capabilities provided by the operating system. On all platforms, the user can install extensions that register themselves as alternative speech synthesis providers.
Call speak() from your extension or packaged app to speak. For example:
chrome.experimental.tts.speak('Hello, world.');
You can provide options that control various properties of the speech, such as its rate, pitch, and more. For example:
chrome.experimental.tts.speak('Hello, world.', {'rate': 0.8});
It's also a good idea to specify the locale so that a synthesizer supporting that language (and regional dialect, if applicable) is chosen.
chrome.experimental.tts.speak(
    'Hello, world.',
    {
      'locale': 'en-US',
      'rate': 0.8
    });
Not all speech engines will support all options.
You can also pass a callback function that will be called when the speech has finished. For example, suppose we have an image on our page displaying a picture of a face with a closed mouth. We could open the mouth while speaking, and close it when done.
faceImage.src = 'open_mouth.png';
chrome.experimental.tts.speak(
    'Hello, world.', null, function() {
      faceImage.src = 'closed_mouth.png';
    });
To stop speaking immediately, call stop(). Call isSpeaking() to find out if a TTS engine is currently speaking.
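The two calls above can be combined into a simple toggle. The following is a minimal sketch, not part of the API: the helper name toggleSpeech and its tts parameter (standing in for chrome.experimental.tts) are hypothetical, and the sketch assumes that isSpeaking() reports its result to a callback as a boolean.

```javascript
// Hypothetical helper: stops speech if something is being spoken,
// otherwise speaks `text`. `tts` stands in for chrome.experimental.tts.
// Assumes isSpeaking() passes a boolean to its callback.
function toggleSpeech(tts, text) {
  tts.isSpeaking(function(speaking) {
    if (speaking) {
      tts.stop();
    } else {
      tts.speak(text);
    }
  });
}
```

In an extension this might be called as toggleSpeech(chrome.experimental.tts, 'Hello, world.') from a button's click handler.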
You can check whether an error occurred by checking chrome.extension.lastError inside the callback function.
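As a rough sketch of that error check, the helper below wraps speak() and inspects lastError inside the callback. The helper name speakAndReport and the injected api parameter (standing in for the global chrome object) are hypothetical, and the shape of lastError, an object with a message property, is an assumption.

```javascript
// Hypothetical helper: speaks `text` and reports the outcome through `done`.
// `api` stands in for the global `chrome` object so that the sketch can be
// exercised outside the browser with a stub.
function speakAndReport(api, text, options, done) {
  api.experimental.tts.speak(text, options, function() {
    // lastError is read here, inside the callback, as described above.
    // Assumption: when set, it is an object with a `message` property.
    var error = api.extension.lastError;
    if (error) {
      done(new Error(error.message || 'Speech failed'));
    } else {
      done(null);
    }
  });
}
```

An extension could call speakAndReport(chrome, 'Hello, world.', {'rate': 0.8}, onDone), where onDone receives either null or an Error.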
Utterances used in this API may include markup using the Speech Synthesis Markup Language (SSML). For example:
chrome.experimental.tts.speak('The <emphasis>second</emphasis> word of this sentence was emphasized.');
Not all speech engines will support all SSML tags, and some may not support SSML at all, but all engines are expected to ignore any SSML they don't support and still speak the underlying text.
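Because utterances may contain markup, text that comes from users or web pages may need escaping before it is mixed with SSML tags. The sketch below shows one way to do that; both helper names are hypothetical, and whether a given engine actually parses such characters as markup is an assumption.

```javascript
// Hypothetical helper: escapes characters that have special meaning in XML
// so that plain text is not mistaken for SSML markup.
function escapeForSsml(text) {
  return String(text)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;');
}

// Hypothetical helper: wraps one word in an SSML <emphasis> element.
function emphasize(word) {
  return '<emphasis>' + escapeForSsml(word) + '</emphasis>';
}
```

For example, 'The ' + emphasize('second') + ' word' yields the same kind of utterance as the example above.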
An extension can register itself as a speech provider. By doing so, it can intercept some or all calls to functions such as speak() and stop() and provide an alternate implementation. Extensions are free to use any available web technology to provide speech, including streaming audio from a server, HTML5 audio, Native Client, or Flash. An extension could even do something different with the utterances, like display closed captions in a pop-up window or send them as log messages to a remote server.
To provide TTS, an extension must first declare all voices it provides in the extension manifest, like this:
{
  "name": "My TTS Provider",
  "version": "1.0",
  "permissions": ["experimental"],
  "tts": {
    "voices": [
      {
        "voiceName": "Alice",
        "locale": "en-US",
        "gender": "female"
      },
      {
        "voiceName": "Pat",
        "locale": "en-US"
      }
    ]
  },
  "background_page": "background.html"
}
An extension can specify any number of voices. The three parameters (voiceName, locale, and gender) are all optional. If they are all unspecified, the extension will handle all speech from all clients. If any of them are specified, they can be used to filter speech requests. For example, if a voice only supports French, it should set the locale to 'fr' (or something more specific like 'fr-FR') so that only utterances in that locale are routed to that extension.
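One way to picture this filtering is as a predicate over a declared voice and an incoming request. The sketch below is an illustrative model only, not Chrome's actual routing code: the function name voiceAcceptsRequest is hypothetical, and the rule that a voice declared as 'fr' also accepts 'fr-FR' is an assumption.

```javascript
// Illustrative model only; this is not Chrome's actual routing code.
// `voice` is one entry from the manifest's "voices" list, and `request`
// holds the voiceName, locale, and gender asked for by the caller.
function voiceAcceptsRequest(voice, request) {
  // An unspecified field on either side places no constraint.
  function fieldMatches(declared, requested) {
    return !declared || !requested || declared === requested;
  }
  // Assumed rule: a voice declared as 'fr' also accepts 'fr-FR'.
  function localeMatches(declared, requested) {
    if (!declared || !requested) return true;
    return requested === declared ||
        requested.indexOf(declared + '-') === 0;
  }
  return fieldMatches(voice.voiceName, request.voiceName) &&
      fieldMatches(voice.gender, request.gender) &&
      localeMatches(voice.locale, request.locale);
}
```

Under this model, a voice with no parameters accepts every request, while a voice declared with locale 'fr' rejects a request for 'en-US'.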
To handle speech calls, the extension should register listeners for onSpeak and onStop, like this:
var speakListener = function(utterance, options, callback) {
  ...
  callback();
};
var stopListener = function() {
  ...
};
chrome.experimental.tts.onSpeak.addListener(speakListener);
chrome.experimental.tts.onStop.addListener(stopListener);
Important: Don't forget to call the callback function from your speak listener!
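One way to make the callback hard to forget is to build the speak listener from a wrapper that always calls it. The following is a minimal sketch under stated assumptions: the factory name makeSpeakListener and the synthesize function are hypothetical, and synthesize is assumed to be synchronous. An asynchronous engine would instead call the callback from its own completion handler.

```javascript
// Hypothetical factory: wraps an engine-specific, synchronous `synthesize`
// function in a speak listener that always calls `callback`, even when
// synthesis throws.
function makeSpeakListener(synthesize) {
  return function(utterance, options, callback) {
    try {
      synthesize(utterance, options || {});
    } catch (e) {
      // Log and continue so that the caller is still notified below.
      console.error('Synthesis failed: ' + e.message);
    }
    callback();
  };
}
```

The result can be passed to chrome.experimental.tts.onSpeak.addListener() in place of a hand-written listener.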
If an extension does not register listeners for both onSpeak and onStop, it will not intercept any speech calls, regardless of what is in the manifest.
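Since both listeners are required, it can help to register them through a single helper that refuses to register only one. This is a sketch rather than part of the API: the helper name registerProvider and its tts parameter (standing in for chrome.experimental.tts) are hypothetical.

```javascript
// Hypothetical helper: registers both listeners together, because an
// extension that registers only one of them will not receive speech calls.
// `tts` stands in for chrome.experimental.tts.
function registerProvider(tts, speakListener, stopListener) {
  if (typeof speakListener !== 'function' ||
      typeof stopListener !== 'function') {
    throw new TypeError(
        'Both a speak listener and a stop listener are required.');
  }
  tts.onSpeak.addListener(speakListener);
  tts.onStop.addListener(stopListener);
}
```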
The decision of whether or not to send a given speech request to an extension is based solely on whether the extension supports the given voice parameters in its manifest and has registered listeners for onSpeak and onStop. In other words, there's no way for an extension to receive a speech request and dynamically decide whether to handle it or not.