page.title=Using Text-to-Speech
@jd:body

<p>Starting with Android 1.6 (API Level 4), the Android platform includes a new
Text-to-Speech (TTS) capability. Also known as "speech synthesis", TTS enables
your Android device to "speak" text in different languages.</p>

<p>Before we explain how to use the TTS API itself, let's first review a few
aspects of the engine that will be important to your TTS-enabled application. We
will then show how to make your Android application talk and how to configure
the way it speaks.</p>

<h3>Languages and resources</h3>

<p>The TTS engine that ships with the Android platform supports a number of
languages: English, French, German, Italian and Spanish. Also, depending on
which side of the Atlantic you are on, American and British accents for English
are both supported.</p>

<p>The TTS engine needs to know which language to speak, as a word like "Paris",
for example, is pronounced differently in French and English. So the voice and
dictionary are language-specific resources that need to be loaded before the
engine can start to speak.</p>

<p>Although all Android-powered devices that support the TTS functionality ship
with the engine, some devices have limited storage and may lack the
language-specific resource files. If a user wants to install those resources,
the TTS API enables an application to query the platform for the availability of
language files and to initiate their download and installation.
So upon
creating your activity, a good first step is to check for the presence of the
TTS resources with the corresponding intent:</p>

<pre>Intent checkIntent = new Intent();
checkIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
startActivityForResult(checkIntent, MY_DATA_CHECK_CODE);</pre>

<p>A successful check is marked by a <code>CHECK_VOICE_DATA_PASS</code>
result code, indicating that the device is ready to speak; we can then create
our {@link android.speech.tts.TextToSpeech} object. If not, we need to let the
user know to install the data that's required for the device to become a
multi-lingual talking machine! Downloading and installing the data is
accomplished by firing off the <code>ACTION_INSTALL_TTS_DATA</code> intent,
which will take the user to Android Market and let them initiate the download.
Installation of the data will happen automatically once the download completes.
Here is an example of what your implementation of
<code>onActivityResult()</code> would look like:</p>

<pre>private TextToSpeech mTts;

protected void onActivityResult(
        int requestCode, int resultCode, Intent data) {
    if (requestCode == MY_DATA_CHECK_CODE) {
        if (resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS) {
            // success, create the TTS instance
            mTts = new TextToSpeech(this, this);
        } else {
            // missing data, install it
            Intent installIntent = new Intent();
            installIntent.setAction(
                    TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
            startActivity(installIntent);
        }
    }
}</pre>

<p>In the constructor of the <code>TextToSpeech</code> instance we pass a
reference to the <code>Context</code> to be used (here the current Activity),
and to an <code>OnInitListener</code> (here our Activity as well).
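</p>

<p>The Activity we passed as the <code>OnInitListener</code> receives the
engine status through its <code>onInit()</code> callback. A minimal sketch of
that callback, assuming the Activity declares
<code>implements TextToSpeech.OnInitListener</code> and uses the
<code>mTts</code> field created above, could look like this:</p>

<pre>public void onInit(int status) {
    if (status == TextToSpeech.SUCCESS) {
        // The engine is ready: it is now safe to configure it and speak.
        mTts.setLanguage(Locale.US);
    } else {
        // Initialization failed, for instance if the engine is unavailable.
        Log.e("TTS", "Could not initialize the TextToSpeech engine.");
    }
}</pre>

<p>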
This listener 69 enables our application to be notified when the Text-To-Speech engine is fully 70 loaded, so we can start configuring it and using it.</p> 71 72 <h4>Languages and Locale</h4> 73 74 <p>At Google I/O 2009, we showed an <a title="Google I/O 2009, TTS 75 demonstration" href="http://www.youtube.com/watch?v=uX9nt8Cpdqg#t=6m17s" 76 id="rnfd">example of TTS</a> where it was used to speak the result of a 77 translation from and to one of the 5 languages the Android TTS engine currently 78 supports. Loading a language is as simple as calling for instance:</p> 79 80 <pre>mTts.setLanguage(Locale.US);</pre><p>to load and set the language to 81 English, as spoken in the country "US". A locale is the preferred way to specify 82 a language because it accounts for the fact that the same language can vary from 83 one country to another. To query whether a specific Locale is supported, you can 84 use <code>isLanguageAvailable()</code>, which returns the level of support for 85 the given Locale. For instance the calls:</p> 86 87 <pre>mTts.isLanguageAvailable(Locale.UK)) 88 mTts.isLanguageAvailable(Locale.FRANCE)) 89 mTts.isLanguageAvailable(new Locale("spa", "ESP")))</pre> 90 91 <p>will return TextToSpeech.LANG_COUNTRY_AVAILABLE to indicate that the language 92 AND country as described by the Locale parameter are supported (and the data is 93 correctly installed). But the calls:</p> 94 95 <pre>mTts.isLanguageAvailable(Locale.CANADA_FRENCH)) 96 mTts.isLanguageAvailable(new Locale("spa"))</pre> 97 98 <p>will return <code>TextToSpeech.LANG_AVAILABLE</code>. In the first example, 99 French is supported, but not the given country. 
And in the second, only the
language was specified for the Locale, so that's what the match was made
on.</p>

<p>Note that besides the <code>ACTION_CHECK_TTS_DATA</code> intent to check
the availability of the TTS data, you can also use
<code>isLanguageAvailable()</code> once you have created your
<code>TextToSpeech</code> instance, which will return
<code>TextToSpeech.LANG_MISSING_DATA</code> if the required resources are not
installed for the queried language.</p>

<p>Making the engine speak an Italian string while the engine is set to the
French language will produce some pretty <i>interesting</i> results, but it
will not exactly be something your user would understand. So try to match the
language of your application's content and the language that you loaded in your
<code>TextToSpeech</code> instance. Also, if you are using
<code>Locale.getDefault()</code> to query the current Locale, make sure that at
least the default language is supported.</p>

<h3>Making your application speak</h3>

<p>Now that our <code>TextToSpeech</code> instance is properly initialized and
configured, we can start to make your application speak. The simplest way to do
so is to use the <code>speak()</code> method. Let's iterate on the following
example to make a talking alarm clock:</p>

<pre>String myText1 = "Did you sleep well?";
String myText2 = "I hope so, because it's time to wake up.";
mTts.speak(myText1, TextToSpeech.QUEUE_FLUSH, null);
mTts.speak(myText2, TextToSpeech.QUEUE_ADD, null);</pre>

<p>The TTS engine manages a global queue of all the entries to synthesize,
which are also known as "utterances". Each <code>TextToSpeech</code> instance
can manage its own queue in order to control which utterance will interrupt the
current one and which one is simply queued.
Here the first <code>speak()</code>
request would interrupt whatever was currently being synthesized: the queue is
flushed and the new utterance is queued, which places it at the head of the
queue. The second utterance is queued and will be played after
<code>myText1</code> has completed.</p>

<h4>Using optional parameters to change the playback stream type</h4>

<p>On Android, each audio stream that is played is associated with one stream
type, as defined in
{@link android.media.AudioManager android.media.AudioManager}. For a talking
alarm clock, we would like our text to be played on the
<code>AudioManager.STREAM_ALARM</code> stream type so that it respects the
alarm settings the user has chosen on the device. The last parameter of the
<code>speak()</code> method allows you to pass the TTS engine optional
parameters, specified as key/value pairs in a <code>HashMap</code>. Let's use
that mechanism to change the stream type of our utterances:</p>

<pre>HashMap&lt;String, String&gt; myHashAlarm = new HashMap&lt;String, String&gt;();
myHashAlarm.put(TextToSpeech.Engine.KEY_PARAM_STREAM,
        String.valueOf(AudioManager.STREAM_ALARM));
mTts.speak(myText1, TextToSpeech.QUEUE_FLUSH, myHashAlarm);
mTts.speak(myText2, TextToSpeech.QUEUE_ADD, myHashAlarm);</pre>

<h4>Using optional parameters for playback completion callbacks</h4>

<p>Note that <code>speak()</code> calls are asynchronous, so they will return
well before the text is done being synthesized and played by Android,
regardless of the use of <code>QUEUE_FLUSH</code> or <code>QUEUE_ADD</code>.
But you might need to know when a particular utterance is done playing. For
instance, you might want to start playing annoying music after
<code>myText2</code> has finished synthesizing (remember, we're trying to wake
up the user). We will again use an optional parameter, this time to tag our
utterance as one we want to identify.
We also need to make sure our activity implements the
<code>TextToSpeech.OnUtteranceCompletedListener</code> interface:</p>

<pre>mTts.setOnUtteranceCompletedListener(this);
myHashAlarm.put(TextToSpeech.Engine.KEY_PARAM_STREAM,
        String.valueOf(AudioManager.STREAM_ALARM));
mTts.speak(myText1, TextToSpeech.QUEUE_FLUSH, myHashAlarm);
myHashAlarm.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID,
        "end of wakeup message ID");
// myHashAlarm now contains two optional parameters
mTts.speak(myText2, TextToSpeech.QUEUE_ADD, myHashAlarm);</pre>

<p>And the Activity gets notified of the completion in the implementation
of the listener:</p>

<pre>public void onUtteranceCompleted(String uttId) {
    // compare String contents with equals(), not ==
    if ("end of wakeup message ID".equals(uttId)) {
        playAnnoyingMusic();
    }
}</pre>

<h4>File rendering and playback</h4>

<p>While the <code>speak()</code> method is used to make Android speak the text
right away, there are cases where you would want the result of the synthesis to
be recorded in an audio file instead. This would be the case if, for instance,
there is text your application will speak often; you could avoid the synthesis
CPU overhead by rendering only once to a file, and then playing back that audio
file whenever needed.
Just like for <code>speak()</code>, you can use an
optional utterance identifier to be notified on the completion of the synthesis
to the file:</p>

<pre>HashMap&lt;String, String&gt; myHashRender = new HashMap&lt;String, String&gt;();
String wakeUpText = "Are you up yet?";
String destFileName = "/sdcard/myAppCache/wakeUp.wav";
myHashRender.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, wakeUpText);
mTts.synthesizeToFile(wakeUpText, myHashRender, destFileName);</pre>

<p>Once you are notified of the synthesis completion, you can play the output
file just like any other audio resource with
{@link android.media.MediaPlayer android.media.MediaPlayer}.</p>

<p>But the <code>TextToSpeech</code> class offers other ways of associating
audio resources with speech. So at this point we have a WAV file that contains
the result of the synthesis of "Are you up yet?" in the previously selected
language. We can tell our TTS instance to associate the contents of the string
"Are you up yet?" with an audio resource, which can be accessed through its
path, or through the package it's in and its resource ID, using one of the two
<code>addSpeech()</code> methods:</p>

<pre>mTts.addSpeech(wakeUpText, destFileName);</pre>

<p>This way any call to <code>speak()</code> for the same string content as
<code>wakeUpText</code> will result in the playback of
<code>destFileName</code>. If the file is missing, then <code>speak()</code>
will behave as if the audio file wasn't there, and will synthesize and play the
given string. You can also take advantage of that feature to let the user
customize how "Are you up yet?" sounds, by recording their own version if they
choose to.
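</p>

<p>The second <code>addSpeech()</code> overload is useful for speech audio that
ships inside your application rather than being rendered at runtime: it refers
to a sound resource by the name of the package it lives in and its resource ID.
A sketch, where <code>com.example.talkingalarm</code> and
<code>R.raw.wakeup</code> are hypothetical names for your application's package
and a bundled recording:</p>

<pre>// Associate "Are you up yet?" with a sound file packaged in the app.
// The package name must match the one declared in your manifest.
mTts.addSpeech(wakeUpText, "com.example.talkingalarm", R.raw.wakeup);</pre>

<p>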
Regardless of where that audio file comes from, you can still use the same
line in your Activity code to ask repeatedly "Are you up yet?":</p>

<pre>mTts.speak(wakeUpText, TextToSpeech.QUEUE_ADD, myHashAlarm);</pre>

<h4>When not in use...</h4>

<p>The text-to-speech functionality relies on a dedicated service shared across
all applications that use that feature. When you are done using TTS, be a good
citizen and tell it "you won't be needing its services anymore" by calling
<code>mTts.shutdown()</code>, in your Activity's <code>onDestroy()</code>
method for instance.</p>

<h3>Conclusion</h3>

<p>Android now talks, and so can your apps. Remember that in order for
synthesized speech to be intelligible, you need to match the language you
select to that of the text to synthesize. Text-to-speech can help you push your
app in new directions. Whether you use TTS to help users with disabilities, to
enable the use of your application while looking away from the screen, or
simply to make it cool, we hope you'll enjoy this new feature.</p>