Binding the Cognitive Services Android Speech SDK - Part 3 - Java 8 fun

In the first part of this series, I showed how to get started binding the Microsoft Cognitive Services speech API. In the second part, I showed how to make the code look more C#-like. In this part, I’ll show how to use the binding, and how to fix a nasty issue that crops up when the Android compiler is given jars built with newer versions of Java.

Using the SDK

To use the SDK, you will need an Android app. Create a new single-view Android app, and reference the SDK binding project. Then build the app and try to run it.

Then marvel, as your app spectacularly fails to compile with a really weird error message.

COMPILETODALVIK : Uncaught translation error : com.android.dx.cf.code.SimException: invalid opcode ba (invokedynamic requires --min-sdk-version >= 26)

WooHoo, invalid opcode ba. Ba indeed! What is this gibberish?

Well, the issue comes down to Java versions. In the past, Android only supported Java code up to version 7. Support for later versions is now being added, but Xamarin doesn’t have this yet, and it is only available on newer versions of Android (API level 26 and above). To make your code work on earlier versions and with Xamarin you have to do a thing called desugaring (yes, really), which rewrites the Java 8 bytecode into an equivalent form that only uses features Java 7 supports.
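For a rough picture of what desugaring does, here is a minimal Java sketch. The class, interface and method names are made up for illustration and are not part of the Speech SDK. A Java 8 compiler turns the lambda in withLambda into an invokedynamic instruction, which is the opcode ba from the error message. The anonymous class in withAnonymousClass does the same job using only Java 7 features, and is roughly the shape of code that desugaring generates for you, although the real tool's output will differ in its details.

```java
// Hypothetical example: none of these names come from the Speech SDK.
class DesugarSketch {

    // A tiny functional interface, declared here so the sketch
    // does not depend on the Java 8 java.util.function package.
    interface Transform {
        int apply(int x);
    }

    // Java 8 style: javac compiles this lambda to an invokedynamic
    // instruction (opcode 0xba), which older Android tooling rejects.
    static Transform withLambda() {
        return x -> x + 1;
    }

    // Java 7 style: an anonymous inner class compiles to an ordinary
    // class with no invokedynamic. Desugaring produces something
    // broadly similar on your behalf.
    static Transform withAnonymousClass() {
        return new Transform() {
            @Override
            public int apply(int x) {
                return x + 1;
            }
        };
    }

    public static void main(String[] args) {
        // Both versions behave identically at run-time.
        System.out.println(withLambda().apply(41));         // prints 42
        System.out.println(withAnonymousClass().apply(41)); // prints 42
    }
}
```

The point is that nothing about the behaviour changes, only the bytecode used to express it, which is why the fix is a build setting rather than a code change.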

At the moment there isn’t a nice IDE way to turn on desugaring. Instead, it has to be set inside the .csproj file of the client application. Open up the .csproj file for your newly created Android app in VSCode (other editors are available, but hey - why would you), or edit the file inside Visual Studio, and add the following to the default PropertyGroup:

<AndroidEnableDesugar>true</AndroidEnableDesugar>

Your app should now build without errors!

I have this working and compiling in the preview versions of Visual Studio on Windows at the time of writing cos that’s how I roll. If you are on stable and get weird errors then try with preview as I know support for this is being actively worked on.

If you do this on VS for Mac then you will get a crash at run-time. The workaround is documented here: https://github.com/xamarin/xamarin-android/pull/1973

Building an app using the SDK

To use the SDK you do need to sign up for the Speech service in Azure. Head to portal.azure.com and add a new Speech resource (at the time of writing this is in preview).

Figure: Searching for the speech resource in Azure

Once you have this, note down the endpoint from the Overview page. It will be a URL, and you will need the bit before .api.cognitive.microsoft.com. For example, if your endpoint is https://northeurope.api.cognitive.microsoft.com/sts/v1.0, then you will need northeurope. You will also need one of the two keys from the Keys page.
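If you would rather not pick the region out by hand, the rule above is easy to automate. This is only a sketch, written in Java to match the SDK's own language (the same few lines translate directly to C#). The SpeechRegion class and its fromEndpoint method are hypothetical helpers, not part of the SDK or the binding.

```java
import java.net.URI;

// Hypothetical helper: not part of the Speech SDK or the binding.
class SpeechRegion {

    // Every Speech endpoint described above ends with this host suffix.
    static final String SUFFIX = ".api.cognitive.microsoft.com";

    // Returns the part of the endpoint's host name before SUFFIX.
    static String fromEndpoint(String endpoint) {
        String host = URI.create(endpoint).getHost();
        if (host == null || !host.endsWith(SUFFIX)) {
            throw new IllegalArgumentException("Unexpected endpoint: " + endpoint);
        }
        return host.substring(0, host.length() - SUFFIX.length());
    }

    public static void main(String[] args) {
        String endpoint = "https://northeurope.api.cognitive.microsoft.com/sts/v1.0";
        System.out.println(fromEndpoint(endpoint)); // prints northeurope
    }
}
```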

You can then create a SpeechFactory using these values, passing one of the keys and the region part of the endpoint (northeurope in the example above):

var factory = SpeechFactory.FromSubscription(<SpeechApiKey>, <endpoint>);

Once you have a speech factory, you can create different recognizers - simple speech, a translator, or an intent recognizer using LUIS. To detect speech, handle the relevant events. You can see an example of using the TranslationRecognizer to convert English to spoken German in an example project in my GitHub repo.

Had a successful day. Created a #Xamarin binding for the @Azure #CognitiveServices Android speech SDK, and built a sample app that translates me voice into spoken German. pic.twitter.com/Bg4XDvhBjv
— Jim Bennett ☁️ (@jimbobbennett) August 31, 2018

In these three posts you have seen how to create a binding library for the Speech SDK aar, make the code more C#-like, and finally use it from a client app, working around a Java bytecode issue. You can check out my implementation and a sample on GitHub. As always, the best source of information, with much more depth, is the Java binding docs on docs.microsoft.com.

Let me know what you build with this SDK - my DMs are always open on Twitter.