Nuance Translator SDK

Nuance Translator SDK is a platform for adding speech recognition (ASR) and speech synthesis (text-to-speech, TTS) capabilities to your mobile applications.

Translator SDK is integrated into the NuanceMessaging UI framework to provide ASR/TTS functionality in the messaging window.

This guide covers the APIs that can be used to add ASR/TTS functionality to the NuanceMessaging UI framework:

  1. Initializing Translator
  2. NinaSetting
  3. SynthesisValues
  4. RecognitionValues
  5. SessionValues
  6. Translator API

You can initialize the Translator SDK by passing an instance of the NinaServerConfiguration class.

NinaServerConfiguration Properties

The following properties let you set up the server configuration.

The NinaSetting class allows the application to set up ASR/TTS configuration parameters.

The following properties are exposed by the NinaSetting class.

Properties

The SynthesisValues class allows the application to update the default parameters used by the TTS engine.

The following properties are exposed by the SynthesisValues class.

Properties
  • NSString *parameterType

    The type of data you provide. If none is provided, the default is taken from your customer config file.

    Default value: text

    Allowed values: "text", "ssml"

  • BOOL statistics

    Whether or not to return statistics on the command's performance (true or false).

    Default value: false

  • NSString *voice

    The voice in which speech synthesis is performed. If none is provided, the default is taken from your customer config file. Note that this parameter works only with the type "text".

    Default value: "Carol"
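
As a rough illustration of the SynthesisValues properties above, the following Swift sketch sets them through a NinaSetting instance, mirroring the sample at the end of this guide. It needs the Translator SDK to compile; the SDK module import is omitted because its name is not stated here, and `vc` stands for whatever object exposes setNinaSettingProperties in your app.

```swift
// Sketch only: requires the Nuance Translator SDK.
let ninaSetting = NinaSetting()

// "text" or "ssml"; "text" is the documented default.
ninaSetting.synthesisValues.parameterType = "text"

// Documented default voice; honored only when parameterType is "text".
ninaSetting.synthesisValues.voice = "Carol"

// Do not request performance statistics (documented default).
ninaSetting.synthesisValues.statistics = false

// `vc` is assumed to be defined elsewhere, as in the sample at the end of this guide.
vc.setNinaSettingProperties(properties: ninaSetting)
```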

The RecognitionValues class allows the application to update the default parameters used by the ASR engine.

The following properties are exposed by the RecognitionValues class.

Properties
  • NSString *speechDetector

    The speech detector to be used for voice activity detection. If none provided, the default will be taken from your customer config file.

    Default value: legacy

    Allowed values: "legacy", "adaptive"

  • int beginNoiseSampleFrames

    The number of frames taken at the start of recognition to assess the environment's noise level. If none provided, the default will be taken from your customer config file.

    Default value: 10

    Size range: 1..2^31-1

  • double voiceThreshold

    The amount of energy required for a frame to be considered 'voiced', as a factor of the background noise level. If none provided, the default will be taken from your customer config file.
    Default value: 4.0
    Size range: -2^63..2^63-1

  • int startOfSpeechVoicedFrames

    Of the most recent startOfSpeechHistoryFrames, the number of 'voiced' frames required to determine that the user has started speaking. If none provided, the default will be taken from your customer config file.
    Default value: 7
    Size range: -2^31..2^31-1

  • int startOfSpeechHistoryFrames

    The number of historical frames to use when assessing that the user has started speaking. If none provided, the default will be taken from your customer config file.
    Default value: 15
    Size range: -2^31..2^31-1

  • int endOfSpeechVoicedFrames

    Of the most recent endOfSpeechHistoryFrames, the number of 'voiced' frames which will indicate the user has not finished speaking. If none provided, the default will be taken from your customer config file.
    Default value: 5
    Size range: -2^31..2^31-1

  • int endOfSpeechHistoryFrames

    The number of historical frames to use when assessing that the user has finished speaking. If none provided, the default will be taken from your customer config file.
    Default value: 50
    Size range: -2^31..2^31-1

  • BOOL considerNegativeRatios

    Whether or not to consider negative ratios when assessing speech activity. If none provided, the default will be taken from your customer config file.
    Default value: false

  • BOOL stopOnEndOfSpeech

    Whether or not to perform end-point detection. If none provided, the default will be taken from your customer config file.
    Default value: true

  • BOOL wordStream

    Whether or not to send intermediate recognition results. If none provided, the default will be taken from your customer config file.
    Default value: true

  • NSArray *activeDynamicVocabularySets

    The list of dynamic vocabularies (identified by their ids) to activate for this speech recognition command. NOTE: Do not upload more than 500 entries, as this may noticeably degrade performance.

  • int startOfSpeechTimeoutSeconds

    How long to wait for start-of-speech before the server cancels a recognition, in seconds. Not applicable if endPointDetection is false. Use 0 (zero) for no limit.
    Default value: 5
    Size range: 0..2^31-1

  • int utteranceMaxTimeSeconds

    The maximum allowed time for an utterance, in seconds. When endPointDetection is true, this time begins when start-of-speech is detected. Use 0 (zero) for no limit.
    Default value: 30
    Size range: 0..2^31-1

  • BOOL statistics

    Whether or not to return statistics on the command's performance.
    Default value: true

  • BOOL isNoiseLevelRequired

    The specific background noise level energy values to use for speech detection
    Default value: false
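
To make the relationship between voiceThreshold, startOfSpeechVoicedFrames and startOfSpeechHistoryFrames more concrete, here is a small self-contained Swift sketch. It is not the SDK's detector, whose internals are not described in this guide; the type name, the method and the per-frame energy input are hypothetical, and only the three parameter meanings and their default values come from the list above.

```swift
/// Illustrative model only; NOT the Nuance speech detector.
struct StartOfSpeechModel {
    let voiceThreshold: Double   // documented default: 4.0
    let voicedFrames: Int        // startOfSpeechVoicedFrames, default 7
    let historyFrames: Int       // startOfSpeechHistoryFrames, default 15

    /// Returns the index of the frame at which enough of the most recent
    /// `historyFrames` frames are voiced, or nil if that never happens.
    func startOfSpeechIndex(frameEnergies: [Double], noiseLevel: Double) -> Int? {
        var window: [Bool] = []
        for (index, energy) in frameEnergies.enumerated() {
            // A frame is "voiced" when its energy reaches threshold x noise level.
            window.append(energy >= voiceThreshold * noiseLevel)
            // Keep only the most recent `historyFrames` decisions.
            if window.count > historyFrames { window.removeFirst() }
            if window.filter({ $0 }).count >= voicedFrames { return index }
        }
        return nil
    }
}

let model = StartOfSpeechModel(voiceThreshold: 4.0, voicedFrames: 7, historyFrames: 15)
let quiet = [Double](repeating: 1.0, count: 20)   // at the noise level
let loud = [Double](repeating: 5.0, count: 10)    // 5x the noise level

// Frames 20...29 are voiced; the seventh voiced frame is frame 26.
let start = model.startOfSpeechIndex(frameEnergies: quiet + loud, noiseLevel: 1.0)
print(start as Any)   // Optional(26)
```

Under this reading, raising startOfSpeechVoicedFrames or voiceThreshold makes the detector slower to trigger but less sensitive to short bursts of noise.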

The SessionValues class allows the application to update the default parameters used to initialize an ASR/TTS session.

The following properties are exposed by the SessionValues class.

Properties

Styling Translator Persona

Properties for customizing the Translator persona are exposed by the NinaCobraProperties class. The application has to pass an instance of this class, with the modified properties, into the framework.

The following properties in NinaCobraProperties let you change the look and feel of the Translator persona.

Name type default description
successChimeSoundData NSData nil

You need to pass the sound data that is played when the SDK has successfully recorded user text.

connectionSuccessChimeSoundData NSData nil

You need to pass the sound data that is played when the SDK has established a session with the ASR/TTS server.

connectionFailedChimeSoundData NSData nil

You need to pass the sound data that is played when the SDK fails to establish a connection with the ASR/TTS server or is unexpectedly disconnected.

errorChimeSoundData NSData nil

You need to pass the sound data that is played when the SDK fails to record user text.

ninaProcessingViewBackgroundColor UIColor UIColor.gray

Lets you configure the background color of processing view.

ninaProcessingViewLeftMargin Int 3

Lets you configure the processing view left margin.

ninaProcessingViewRightMargin Int 0

Lets you configure the processing view right margin.

ninaProcessingViewTopMargin Int 4

Lets you configure the processing view top margin.

ninaProcessingViewBottomMargin Int 2

Lets you configure the processing view bottom margin.

ninaProcessingText String Processing...

Lets you configure the processing view label text.

ninaProcessingTextColor UIColor UIColor.white

Lets you configure the processing view label text color.

ninaProcessingTextBackgroundColor UIColor UIColor.clear

Lets you configure the processing view label background color.

ninaProcessingTextFontStyle UIFont UIFont(name: kFontName, size: 15.0)

Lets you configure the processing view label font style.

ninaProcessingTextAllignment NSTextAlignment NSTextAlignment.center

Lets you configure the processing view label text alignment.

ninaProcessingTextLeftMargin Int 12

Lets you configure the processing view label text left margin.

ninaProcessingTextRightMargin Int 63

Lets you configure the processing view label text right margin.

ninaProcessingTextTopMargin Int 10

Lets you configure the processing view label text top margin.

ninaProcessingTextBottomMargin Int 10

Lets you configure the processing view label text bottom margin.

ninaSleepingAnimationImage UIImage nil

You must pass the name of the image sequence that is played when the translator is sleeping. See the example for sample images.

ninaProcessingAnimationImage UIImage nil

You must pass the name of the image sequence that is played when the translator is processing. See the example for sample images.

ninaAlertAnimationImage UIImage nil

You must pass the name of the image sequence that is played when the translator encounters an error while recording. See the example for sample images.

ninaSucessAnimationImage UIImage nil

You must pass the name of the image sequence that is played when the translator has recognized and parsed the user's voice. See the example for sample images.

ninaListeningAnimationImage UIImage nil

You must pass the name of the image sequence that is played when the translator is recording the user's voice. See the example for sample images.

ninaRecordingEnergyPromptImages UIImage nil

You must pass the name of the image sequence that is played to show the user's speaking energy. See the example for sample images.

ninaAnimCenterBackgroundImage UIImage nil

You can set an image that is displayed at the center of the translator persona, provided the other animation images are transparent. See the example for sample images.

ninaIdleStaticImage UIImage nil

You can set an image that is displayed when the translator is between all of the above states. See the example for sample images.
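
A minimal Swift sketch of customizing a few of the look-and-feel properties listed above. It assumes NinaCobraProperties has a no-argument initializer and that the properties are plain settable members; neither is confirmed by this guide. The asset name is hypothetical, the SDK module import is omitted because its name is not given here, and the call that hands the instance to the framework is left out for the same reason.

```swift
import UIKit

// Sketch only: requires the Nuance Translator SDK.
// The no-argument initializer is an assumption.
let personaProperties = NinaCobraProperties()

// Processing view label: text, color and alignment.
personaProperties.ninaProcessingText = "Working on it..."
personaProperties.ninaProcessingTextColor = UIColor.white
// The identifier is spelled as it appears in the property list.
personaProperties.ninaProcessingTextAllignment = NSTextAlignment.left

// Processing view background.
personaProperties.ninaProcessingViewBackgroundColor = UIColor.darkGray

// "persona_idle" is a hypothetical asset in the app bundle.
personaProperties.ninaIdleStaticImage = UIImage(named: "persona_idle")
```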

The following properties let you change the layout of the Translator persona so that it is centered above the text input.

Name type default description
showNinaAnimInCenter Bool false

Lets you move the persona image to center above text input

ninaAnimCenterHolderHeight Int 40

Lets you set the height of view that holds persona image

ninaAnimCenterHolderColor UIColor UIColor.gray

Lets you set the color of view that holds persona image

ninaCenterViewBackgroundColor UIColor UIColor.gray

Lets you set the background color of container view that holds persona image and holder

The following properties let you change the style of the Translator persona overlay.

Name type default description
ninaNoInternetText String No Internet

Lets you change the string displayed in overlay when there is no internet

ninaOverlayTextColor UIColor UIColor.white

Lets you change the overlay text color

ninaOverlayBackgroundColor UIColor UIColor(red: 85.0/255.0, green: 85.0/255.0, blue: 85.0/255.0, alpha: 0.6)

Lets you change the overlay background color

ninaOverlayOpacity Float .6

Lets you change the overlay opacity

ninaOverlayFontStyle UIFont UIFont(name: kFontName, size: 6.5)!

Lets you change the overlay text font style

ninaNoInternetImage UIImage nil

Lets you set an image instead of text to show no internet

The following properties let you override the default ASR/TTS behaviour.

Name type default description
isAsrTTSRequired Bool false

Lets you turn on ASR/TTS functionality in SDK messaging screen.

enableContinousListening Bool false

Lets you turn on continuous listening mode.

playTTSOnRestore Bool false

Lets you configure how agent messages are read upon restore. By default, no messages are read. The app can be configured to read all messages upon restore.

playTTSOnlyOnASRInput Bool false

Lets you configure how agent messages are read. By default, agent messages are always read. The app can be configured to read them only if speech recognition is used.

playOpener Bool true

Lets you configure whether the opener is played.
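
The behaviour flags above could be set along these lines. This sketch assumes the flags live on NinaCobraProperties like the styling properties (the guide does not name the class for this table) and that the class has a no-argument initializer. It needs the Translator SDK to compile.

```swift
// Sketch only: requires the Nuance Translator SDK.
// Class and initializer are assumptions; the flag names come from the table above.
let behaviourProperties = NinaCobraProperties()

// Turn on ASR/TTS in the messaging screen (off by default).
behaviourProperties.isAsrTTSRequired = true

// Keep continuous listening off (documented default); spelling as listed.
behaviourProperties.enableContinousListening = false

// Do not read messages aloud when a conversation is restored.
behaviourProperties.playTTSOnRestore = false

// Read agent messages aloud only when the user spoke their input.
behaviourProperties.playTTSOnlyOnASRInput = true

// Play the opener (documented default).
behaviourProperties.playOpener = true
```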

Configuring WakeUp Word

The wake-up word functionality lets the customer wake up the translator engine and make it start listening immediately.

Wake-up word functionality is not part of the regular Nuance NDEP SDK release. Upon request, Nuance will release a special build that includes wake-up word functionality. In addition, you will have to include iOSWakeupWord.framework and the language pack files in your workspace.

This dependency is not supported on M1 machines and has been removed from the CocoaPods package since v10.0.7. To use the wake-up word dependency, download v10.0.6; refer to Getting the SDK as a zip from CocoaPods.

The following properties in NinaCobraProperties let you configure the wake-up word functionality.

Name type default description
ninaWakeupWordRequired Bool false

Lets you turn on wake-up word functionality.

ninaWakeupWordAcmodFilename String ""

Pass the name of the acmod .dat file that is added to your application source code.

ninaWakeupWordClcFilename String ""

Pass the name of the clc .dat file that is added to your application source code.

ninaWakeupWordArray [String] null

Lets you configure the wakeup words that engine will listen to.
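
A possible wake-up word configuration, sketched in Swift. The file names and the wake-up phrase are hypothetical placeholders, the no-argument initializer is an assumption, and the guide does not say whether the file names should include the .dat extension. It requires the special SDK build and iOSWakeupWord.framework described above.

```swift
// Sketch only: requires the wake-up word build of the Nuance Translator SDK.
let wakeupProperties = NinaCobraProperties()   // initializer assumed

// Enable wake-up word detection (off by default).
wakeupProperties.ninaWakeupWordRequired = true

// Hypothetical language pack file names bundled with the app.
wakeupProperties.ninaWakeupWordAcmodFilename = "acmod_example.dat"
wakeupProperties.ninaWakeupWordClcFilename = "clc_example.dat"

// Hypothetical phrase the engine should listen for.
wakeupProperties.ninaWakeupWordArray = ["Hello Translator"]
```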

Translator API

The SDK exposes an API that allows the application to programmatically control the state of the Translator.

The following API methods from NinaMobileController.getInstance() let the application register listeners for various Translator events.

The following API methods from NinaMobileController.getInstance() let the application trigger Translator functionality.


	
        // `vc` and `a` are defined elsewhere in the sample app: `vc` receives the
        // configuration and `a` switches between two environments.
        let serverConfiguration = NinaServerConfiguration()
        serverConfiguration.applicationKey = a == 0 ? "companyName_appName" : "NuancePS_PizzaAU"
        serverConfiguration.verificationCode = "5796d9e101d5355f5dbf95a3681f6ca5317fd9e96e5ffacac5e1305361a4eb1c"
        serverConfiguration.gateWayAddress = a == 0 ? "webapi-demo.nuance.mobi" : "n4a-webapi.nuance.mobi" // alternative: cobra-webapi.nuance.mobi
        serverConfiguration.gateWayPath = "webapi-platform/websocket"
        serverConfiguration.gateWayPort = "443"
        serverConfiguration.gateWayScheme = "wss"
        vc.setNinaServerConfigurationProperties(properties: serverConfiguration)

        // ASR/TTS settings: audio codecs, TTS voice and input type, server timeout.
        let ninaSetting = NinaSetting()
        ninaSetting.sessionValues.speechRecognitionCodec = "pcm_16_16k"
        ninaSetting.sessionValues.speechSynthesisCodec = "pcm_16_16k"
        ninaSetting.synthesisValues.voice = a == 0 ? "Tom" : "Nathan"
        ninaSetting.synthesisValues.parameterType = "ssml"
        ninaSetting.serverTimeOut = 1.0

        vc.setNinaSettingProperties(properties: ninaSetting)

Copyright © 2023 Nuance Communications, Inc. All rights reserved.