Speech in Silverlight by Anton

Building a rich Internet application (RIA) always brings something new, and reading text aloud to the end user is one such feature. In this article you will learn several ways to do it in Silverlight and find links to further material on the subject. Let's move on…

One of the methods is to use COM-based objects in Silverlight. We can create a dynamic object that is able to speak. Add a reference to Microsoft.CSharp to the project and put the following code where you want the text to be spoken:

dynamic textToSpeech = AutomationFactory.CreateObject("Sapi.SpVoice"); 
textToSpeech.Volume = 100; 
textToSpeech.Speak("Hello");

This method uses Automation.

Automation is a Windows technology that provides access to registered automation servers (Office, shell scripting). In our case, we create a SAPI object. This is a fast but inflexible solution. The application must be trusted and run out-of-browser to get access to the automation API (check “Require elevated trust when running out of browser” in the project settings), so the user has to install it before launching it. You can read more here: MSDN and Tutorial.
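The trust requirement can be checked at run time before the COM object is created. The following is a minimal sketch, assuming a Silverlight 4 project; the class name SapiSpeaker and the method TrySpeak are made up for illustration, while HasElevatedPermissions and AutomationFactory.IsAvailable are the platform's own checks.

```csharp
using System.Runtime.InteropServices.Automation;
using System.Windows;

// Hypothetical helper: speaks through SAPI only when automation is usable.
public static class SapiSpeaker
{
    public static bool TrySpeak(string text)
    {
        // COM automation needs an installed out-of-browser application
        // running with elevated trust on Windows.
        if (!Application.Current.HasElevatedPermissions || !AutomationFactory.IsAvailable)
        {
            return false;
        }

        dynamic voice = AutomationFactory.CreateObject("Sapi.SpVoice");
        voice.Volume = 100;
        voice.Speak(text);
        return true;
    }
}
```

When TrySpeak returns false, the application can fall back to one of the server-based approaches or simply show the text.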

If you don’t want an out-of-browser application and you need it to work on every platform, Microsoft suggests a server-based Text-to-Speech technique. It is a client-server solution: a Silverlight application on the client and a WCF service on the server side.

The main idea is described in these steps:
1. Calling a Web Service to process the text.
2. Using Microsoft’s Text to Speech API to convert the text to WAV.
3. Decoding the byte array with a WAV decoding class.
4. Playing back the WAV stream with Silverlight’s MediaElement.

On the server, the SpeechSynthesizer class renders the speech as a WAV byte array; in Silverlight, WAVMediaStreamSource decodes that array for playback. This class inherits from System.Windows.Media.MediaStreamSource. The complete example with source code is available at MSDN.
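The server side of this approach might look roughly like the sketch below. It is not the MSDN sample itself: the contract ISpeechService and the class SpeechService are hypothetical names, and hosting and configuration of the WCF service are left out. It assumes a Windows server with a reference to System.Speech.

```csharp
using System.IO;
using System.ServiceModel;
using System.Speech.Synthesis;

// Hypothetical service contract for the text-to-speech call.
[ServiceContract]
public interface ISpeechService
{
    [OperationContract]
    byte[] Speak(string text);
}

public class SpeechService : ISpeechService
{
    // Renders the text as WAV audio and returns the raw bytes to the client.
    public byte[] Speak(string text)
    {
        using (var synthesizer = new SpeechSynthesizer())
        using (var wav = new MemoryStream())
        {
            synthesizer.SetOutputToWaveStream(wav);
            synthesizer.Speak(text);
            // Detach the synthesizer from the stream before the stream is disposed.
            synthesizer.SetOutputToNull();
            return wav.ToArray();
        }
    }
}
```

The Silverlight client would call Speak through a generated service proxy and hand the returned bytes to the WAV decoding class.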

Before discussing a more general method, you need to know some pros and cons of the techniques described above.

Text-to-Speech

“+” : 1. Works with any browser that is officially supported by Silverlight
2. Works on operating systems supported by Silverlight

“-” : 1. Text-to-Speech decoding happens on the server-side
2. Web service needs to be implemented
3. Works well for short sentences; longer text needs to be broken up into smaller pieces that are processed one by one
4. Needs a good connection speed to work acceptably
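Breaking longer text into pieces can be as simple as splitting it at sentence-ending punctuation and sending each piece to the service in turn. The helper below is only an illustrative sketch (the names SpeechText and SplitIntoSentences are made up here); it treats every '.', '!' and '?' as a sentence end, so abbreviations and decimal numbers will be split as well.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for cutting long text into sentence-sized pieces.
public static class SpeechText
{
    public static List<string> SplitIntoSentences(string text)
    {
        var pieces = new List<string>();
        int start = 0;
        for (int i = 0; i < text.Length; i++)
        {
            bool sentenceEnd = text[i] == '.' || text[i] == '!' || text[i] == '?';
            bool lastChar = i == text.Length - 1;
            if (sentenceEnd || lastChar)
            {
                // Take everything since the previous cut, including the punctuation.
                string piece = text.Substring(start, i - start + 1).Trim();
                if (piece.Length > 0)
                {
                    pieces.Add(piece);
                }
                start = i + 1;
            }
        }
        return pieces;
    }
}
```

Each returned piece can then be sent as a separate request and played back in order.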

Microsoft Speech API (SAPI)

“+” : 1. Text-to-Speech decoding happens on the client-side
2. No implementation of Web services, thus avoiding a Web server dependency
3. Long sentences can be processed more easily and faster on the client-side

“-” : 1. Works only on the Windows platform
2. Might need to change security settings in Internet Explorer to allow initializing and scripting of ActiveX controls, or work out-of-browser

If you want a platform-independent method, you can use the Bing Translator service, which can also speak text. Just create an account on the Bing Developers page, enter the URL of the website where the application will be deployed, and use the resulting appId in your requests.

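string appId = "myAppId";
string text = "Speak this for me";
string language = "en";
// URL-encode the parameters so that spaces and punctuation survive in the query string.
string speakUri = "http://api.microsofttranslator.com/v2/Http.svc/Speak?appId=" + Uri.EscapeDataString(appId)
    + "&text=" + Uri.EscapeDataString(text)
    + "&language=" + Uri.EscapeDataString(language);
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(speakUri);
// Silverlight offers only the asynchronous request API, so BeginGetResponse replaces GetResponse.
httpWebRequest.BeginGetResponse(asyncResult =>
{
    WebResponse resp = httpWebRequest.EndGetResponse(asyncResult);
    Stream strm = resp.GetResponseStream(); // audio returned by the service
}, null);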

It is a simple HTTP request. Seven languages are available for translation, so you can translate text automatically and have it pronounced. Don’t forget to use WAVMediaStreamSource to decode the WAV response for playback.

Here is a link to the Microsoft Translator platform: http://www.microsofttranslator.com/mix2010/
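Once the response stream is available, playback could be wired up roughly as follows. This is a sketch under assumptions: WAVMediaStreamSource is the decoder class from the MSDN sample and is not defined here, its constructor taking a Stream is assumed, and the helper name SpeechPlayback is made up for illustration.

```csharp
using System.IO;
using System.Windows;
using System.Windows.Controls;

// Hypothetical helper that feeds a WAV response into a MediaElement.
public static class SpeechPlayback
{
    public static void Play(MediaElement player, Stream response)
    {
        // Copy the network stream into memory so the decoder gets a seekable stream.
        var buffer = new MemoryStream();
        var chunk = new byte[4096];
        int read;
        while ((read = response.Read(chunk, 0, chunk.Length)) > 0)
        {
            buffer.Write(chunk, 0, read);
        }
        buffer.Position = 0;

        // Web request callbacks run on a background thread;
        // MediaElement must be accessed on the UI thread.
        Deployment.Current.Dispatcher.BeginInvoke(() =>
        {
            // Assumption: the sample's decoder accepts a Stream in its constructor.
            // With AutoPlay left at its default (true), playback starts once the media opens.
            player.SetSource(new WAVMediaStreamSource(buffer));
        });
    }
}
```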