Nuance Translator SDK

Nuance Translator SDK is a platform for adding speech recognition (ASR) and speech synthesis (text-to-speech, TTS) capabilities to your mobile applications.

The Translator SDK is integrated into the NuanceMessaging UI framework to provide ASR/TTS functionality in the messaging window.

This guide covers the APIs that you can use to add ASR/TTS functionality to the NuanceMessaging UI framework.

Initializing Translator
NinaSetting
SynthesisValues
RecognitionValues
SessionValues
Translator API

You can initialize the Translator SDK by passing an instance of the NinaServerConfiguration class.

NuanceMessagingViewController

Pass the instance to the following method exposed by NuanceMessagingViewController:

public func setNinaServerConfigurationProperties(properties: NinaServerConfiguration)

NinaServerConfiguration Properties

The following properties let you configure the server connection.

NSString *applicationKey

The required name of the application that maps to the server configuration, in the format CompanyName_AppName, for example "JavaBeanz_Orders".

NSString *authenticationType

The type of authentication. Allowed values: "jws", "hashcode"

NSString *authenticationdata

Authentication data based on your authentication method.

NSString *verificationCode

The required verification code if using "verification code" as an authentication method.

NSString *gateWayScheme

Sets the scheme of the server address used for processing the speech and text.
Default value: "wss"

NSString *gateWayAddress

Sets the address of the server used for processing the speech and text.

NSString *gateWayPort

Sets the port of the server used for processing the speech and text.
Default value: "443"

NSString *gateWayPath

Sets the path of the server address used for processing the speech and text.
Default value: "webapi-platform/websocket"
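
The four gateWay properties read like the parts of a WebSocket URL. As a rough illustration only, the sketch below assembles them with Foundation's URLComponents. How the SDK actually combines them is not stated in this guide, so treat the composition as an assumption; the helper name makeGatewayURL is hypothetical and not part of the SDK.

```swift
import Foundation

// Hypothetical helper: combines the four gateway settings into one URL.
// Assumption: the SDK joins them as scheme://address:port/path.
func makeGatewayURL(scheme: String, address: String, port: String, path: String) -> URL? {
    var components = URLComponents()
    components.scheme = scheme
    components.host = address
    components.port = Int(port)   // the SDK property is a string, for example "443"
    // URLComponents expects an absolute path when a host is set.
    components.path = path.hasPrefix("/") ? path : "/" + path
    return components.url
}

let url = makeGatewayURL(scheme: "wss",
                         address: "webapi-demo.nuance.mobi",
                         port: "443",
                         path: "webapi-platform/websocket")
print(url?.absoluteString ?? "invalid gateway settings")
```

Printing the assembled URL before passing the values to NinaServerConfiguration can be a quick way to spot a mistyped address or path.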

The NinaSetting class allows the application to set up ASR/TTS configuration parameters.

NuanceMessagingViewController

Pass the instance to the following method exposed by NuanceMessagingViewController:

public func setNinaSettingProperties(properties: NinaSetting)

The following properties are exposed by the NinaSetting class.

Properties

SynthesisValues *synthesisValues

An instance of the SynthesisValues class.

RecognitionValues *recognitionValues

An instance of the RecognitionValues class.

SessionValues *sessionValues

An instance of the SessionValues class.

The SynthesisValues class allows the application to update the default parameters used by the TTS engine.

The following properties are exposed by the SynthesisValues class.

Properties

NSString *parameterType

The type of data you provide. If none provided, the default will be taken from your customer config file.
Default value: text

Allowed values: "text", "ssml"

BOOL statistics

Whether or not to return statistics on the command's performance. Allowed values: true, false
Default value: false

NSString *voice

The voice in which speech synthesis is to be done. If none provided, the default will be taken from your customer config file. Note that this parameter works only with the type "text".
Default value: "Carol"

The RecognitionValues class allows the application to update the default parameters used by the ASR engine.

The following properties are exposed by the RecognitionValues class.

Properties

NSString *speechDetector

The speech detector to be used for voice activity detection. If none provided, the default will be taken from your customer config file.
Default value: legacy

Allowed values: "legacy", "adaptive"

int beginNoiseSampleFrames

The number of frames taken at the start of recognition to assess the environment's noise level. If none provided, the default will be taken from your customer config file.
Default value: 10

Size range: 1..2^31-1

double voiceThreshold

The amount of energy required for a frame to be considered 'voiced', as a factor of the background noise level. If none provided, the default will be taken from your customer config file.
Default value: 4.0
Size range: -2^63..2^63-1

int startOfSpeechVoicedFrames

Of the most recent startOfSpeechHistoryFrames, the number of 'voiced' frames required to assess that the user has started speaking. If none provided, the default will be taken from your customer config file.
Default value: 7
Size range: -2^31..2^31-1

int startOfSpeechHistoryFrames

The number of historical frames to use when assessing that the user has started speaking. If none provided, the default will be taken from your customer config file.
Default value: 15
Size range: -2^31..2^31-1

int endOfSpeechVoicedFrames

Of the most recent endOfSpeechHistoryFrames, the number of 'voiced' frames which will indicate the user has not finished speaking. If none provided, the default will be taken from your customer config file.
Default value: 5
Size range: -2^31..2^31-1

int endOfSpeechHistoryFrames

The number of historical frames to use when assessing that the user has finished speaking. If none provided, the default will be taken from your customer config file.
Default value: 50
Size range: -2^31..2^31-1

BOOL considerNegativeRatios

Whether or not to consider negative ratios when assessing speech activity. If none provided, the default will be taken from your customer config file.
Default value: false

BOOL stopOnEndOfSpeech

Whether or not to perform end-point detection. If none provided, the default will be taken from your customer config file.
Default value: true

BOOL wordStream

Whether or not to send intermediate recognition results. If none provided, the default will be taken from your customer config file.
Default value: true

NSArray *activeDynamicVocabularySets

The list of dynamic vocabulary sets (identified by their ids) to activate for this speech recognition command. NOTE: Do not upload more than 500 entries, as this may noticeably degrade performance.

int startOfSpeechTimeoutSeconds

How long to wait for start-of-speech before the server cancels a recognition, in seconds. Not applicable if endPointDetection is false. Use 0 (zero) for no limit.
Default value: 5
Size range: 0..2^31-1

int utteranceMaxTimeSeconds

The maximum allowed time for an utterance, in seconds. When endPointDetection is true, this time begins when start-of-speech is detected. Use 0 (zero) for no limit.
Default value: 30
Size range: 0..2^31-1

BOOL statistics

Whether or not to return statistics on the command's performance.
Default value: true

BOOL isNoiseLevelRequired

The specific background noise level energy values to use for speech detection
Default value: false
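
To make the interplay of beginNoiseSampleFrames, voiceThreshold, startOfSpeechVoicedFrames and startOfSpeechHistoryFrames more concrete, here is a simplified, self-contained sketch of a sliding-window start-of-speech check. It is one reading of the parameter descriptions above, not the server's actual detector. The names DetectorSettings and startOfSpeechIndex are hypothetical, and estimating the noise level as the mean of the first frames is an assumption.

```swift
// Illustrative only: a simplified reading of the documented detector parameters.
struct DetectorSettings {
    var beginNoiseSampleFrames = 10      // documented default
    var voiceThreshold = 4.0             // documented default
    var startOfSpeechVoicedFrames = 7    // documented default
    var startOfSpeechHistoryFrames = 15  // documented default
}

/// Returns the index of the frame at which start-of-speech would be declared,
/// or nil if it never is. `energies` holds one energy value per audio frame.
func startOfSpeechIndex(energies: [Double], settings: DetectorSettings) -> Int? {
    let noiseCount = settings.beginNoiseSampleFrames
    guard noiseCount > 0, energies.count > noiseCount else { return nil }

    // Assumption: background noise is the mean energy of the first frames.
    let noise = energies[0..<noiseCount].reduce(0, +) / Double(noiseCount)

    var history: [Bool] = []   // voiced flags for the most recent frames
    for index in noiseCount..<energies.count {
        // A frame counts as voiced when its energy is at least
        // voiceThreshold times the background noise level.
        history.append(energies[index] >= noise * settings.voiceThreshold)
        if history.count > settings.startOfSpeechHistoryFrames {
            history.removeFirst()
        }
        if history.filter({ $0 }).count >= settings.startOfSpeechVoicedFrames {
            return index
        }
    }
    return nil
}

// Thirteen quiet frames (the first ten are used for the noise estimate),
// followed by ten loud frames.
let energies = Array(repeating: 1.0, count: 13) + Array(repeating: 8.0, count: 10)
let detected = startOfSpeechIndex(energies: energies, settings: DetectorSettings())
if let index = detected {
    print("start of speech at frame \(index)")
} else {
    print("no speech detected")
}
```

With the default values, seven voiced frames must accumulate inside the 15-frame window, so raising startOfSpeechVoicedFrames makes the check less sensitive to short bursts of noise.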

The SessionValues class allows the application to update the default parameters used to initialize an ASR/TTS session.

The following properties are exposed by the SessionValues class.

Properties

NSString * speechSynthesisCodec

The codec to be used during speech synthesis. If none provided, the default will be taken from your customer config file.
Default value: pcm_16_8k
Allowed values: "pcm_16_8k", "pcm_16_16k", "opus_nb", "opus_wb", "ulaw"

NSString * speechRecognitionCodec

The codec to be used during speech recognition. If none provided, the default will be taken from your customer config file.
Default value: pcm_16_8k
Allowed values: "pcm_16_8k", "pcm_16_16k", "opus_nb", "opus_wb"

NSString * clientUserId

String defined by the client application to identify itself.
Default value: ""

NSString * clientDeviceId

String defined by the client application to identify the device.
Default value: ""

NSString * clientSessionId

String defined by the client application to identify its session.
Default value: ""

NSString * clientOsName

String defined by the client application to identify the OS it runs on.
Default value: "iOS"

NSString * clientOsVersion

String defined by the client application to identify the OS version it runs on.
Default value: "<device version>"

Styling Translator Persona

Properties to customize the Translator persona are exposed by the NinaCobraProperties class. The application has to pass an instance of this class, with the modified properties, into the framework.

public func setNinaActionSoundProperties(properties: NinaCobraProperties)

The following properties in NinaCobraProperties let you change the look and feel of the Translator persona.

Name type default description

successChimeSoundData NSData nil
Pass the sound data to play when the SDK successfully records user text.

connectionSuccessChimeSoundData NSData nil
Pass the sound data to play when the SDK establishes a session with the ASR/TTS server.

connectionFailedChimeSoundData NSData nil
Pass the sound data to play when the SDK fails to establish a connection with the ASR/TTS server or is unexpectedly disconnected.

errorChimeSoundData NSData nil
Pass the sound data to play when the SDK fails to record user text.

ninaProcessingViewBackgroundColor UIColor UIColor.gray
Lets you configure the background color of processing view.

ninaProcessingViewLeftMargin Int 3
Lets you configure the processing view left margin.

ninaProcessingViewRightMargin Int 0
Lets you configure the processing view right margin.

ninaProcessingViewTopMargin Int 4
Lets you configure the processing view top margin.

ninaProcessingViewBottomMargin Int 2
Lets you configure the processing view bottom margin.

ninaProcessingText String Processing...
Lets you configure the processing view label text.

ninaProcessingTextColor UIColor UIColor.white
Lets you configure the processing view label text color.

ninaProcessingTextBackgroundColor UIColor UIColor.clear
Lets you configure the processing view label background color.

ninaProcessingTextFontStyle UIFont UIFont(name: kFontName, size: 15.0)
Lets you configure the processing view label font style.

ninaProcessingTextAllignment NSTextAlignment NSTextAlignment.center
Lets you configure the processing view label text alignment.

ninaProcessingTextLeftMargin Int 12
Lets you configure the processing view label text left margin.

ninaProcessingTextRightMargin Int 63
Lets you configure the processing view label text right margin.

ninaProcessingTextTopMargin Int 10
Lets you configure the processing view label text top margin.

ninaProcessingTextBottomMargin Int 10
Lets you configure the processing view label text bottom margin.

ninaSleepingAnimationImage UIImage nil
You must pass the name of the image sequence that is played when the translator is sleeping. See the example for sample images.

ninaProcessingAnimationImage UIImage nil
You must pass the name of the image sequence that is played when the translator is processing. See the example for sample images.

ninaAlertAnimationImage UIImage nil
You must pass the name of the image sequence that is played when the translator encounters an error while recording. See the example for sample images.

ninaSucessAnimationImage UIImage nil
You must pass the name of the image sequence that is played when the translator has recognized and parsed the user's voice. See the example for sample images.

ninaListeningAnimationImage UIImage nil
You must pass the name of the image sequence that is played when the translator is recording the user's voice. See the example for sample images.

ninaRecordingEnergyPromptImages UIImage nil
You must pass the name of the image sequence that is played to show the user's speaking energy. See the example for sample images.

ninaAnimCenterBackgroundImage UIImage nil
You can set an image that is displayed at the center of the translator persona, provided the other animation images are transparent. See the example for sample images.

ninaIdleStaticImage UIImage nil
You can set an image that is displayed when the translator is between the states above. See the example for sample images.

The following properties let you change the layout of the Translator persona so that it is centered above the text input.

Name type default description

showNinaAnimInCenter Bool false
Lets you move the persona image to center above text input

ninaAnimCenterHolderHeight Int 40
Lets you set the height of view that holds persona image

ninaAnimCenterHolderColor UIColor UIColor.gray
Lets you set the color of view that holds persona image

ninaCenterViewBackgroundColor UIColor UIColor.gray
Lets you set the background color of container view that holds persona image and holder

The following properties let you change the style of the Translator persona overlay.

Name type default description

ninaNoInternetText String No Internet
Lets you change the string displayed in overlay when there is no internet

ninaOverlayTextColor UIColor UIColor.white
Lets you change the overlay text color

ninaOverlayBackgroundColor UIColor UIColor(red: 85.0/255.0, green: 85.0/255.0, blue: 85.0/255.0, alpha: 0.6)
Lets you change the overlay background color

ninaOverlayOpacity Float .6
Lets you change the overlay opacity

ninaOverlayFontStyle UIFont UIFont(name: kFontName, size: 6.5)!
Lets you change the overlay text font style

ninaNoInternetImage UIImage nil
Lets you set an image instead of text to show no internet

The following properties let you override the default ASR/TTS behaviour.

Name type default description

isAsrTTSRequired Bool false
Lets you turn on ASR/TTS functionality in SDK messaging screen.

enableContinousListening Bool false
Lets you turn on continuous listening mode.

playTTSOnRestore Bool false
Lets you configure how agent messages are read upon restore. By default, no messages are read. The app can configure the SDK to read all messages upon restore.

playTTSOnlyOnASRInput Bool false
Lets you configure how agent messages are read. By default, agent messages are always read. The app can configure the SDK to read them only if speech recognition is used.

playOpener Bool true
Lets you configure whether to play the opener.
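
As a sketch of how these behaviour flags might be combined, the snippet below turns on ASR/TTS and limits spoken agent messages to conversations where the user has used speech input. The NinaCobraProperties class here is a minimal stand-in that declares only the flags from this table with their documented defaults, so that the snippet compiles on its own. In an app you would use the SDK's class and pass it to setNinaActionSoundProperties(properties:).

```swift
// Stand-in for the SDK class, declaring only the flags documented above with
// their documented defaults. In an app, use the SDK's NinaCobraProperties.
final class NinaCobraProperties {
    var isAsrTTSRequired = false
    var enableContinousListening = false
    var playTTSOnRestore = false
    var playTTSOnlyOnASRInput = false
    var playOpener = true
}

// Example choice: voice enabled, agent messages spoken only after the user
// has used speech input, and nothing replayed on restore.
let properties = NinaCobraProperties()
properties.isAsrTTSRequired = true
properties.playTTSOnlyOnASRInput = true
properties.playTTSOnRestore = false

// In an app (requires the SDK and a view controller instance `vc`):
// vc.setNinaActionSoundProperties(properties: properties)
print(properties.isAsrTTSRequired, properties.playTTSOnlyOnASRInput, properties.playOpener)
```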

Configuring WakeUp Word

The WakeUp Word functionality lets the customer wake up the translator engine and make it start listening immediately.

WakeUp Word functionality is not part of the regular Nuance NDEP SDK release. Upon request, Nuance will release a special build that includes it. In addition, you will have to include iOSWakeupWord.framework and the language pack files in your workspace.

This dependency is not supported on M1 machines and was removed from the CocoaPods package as of v10.0.7. To use the wakeup word dependency, download v10.0.6; refer to Getting the SDK as a zip from CocoaPods.

The following properties in NinaCobraProperties let you configure the wakeup word functionality.

Name type default description

ninaWakeupWordRequired Bool false
Lets you turn on wakeup word functionality.

ninaWakeupWordAcmodFilename String ""
Pass the name of the acmod .dat file that you added to your application source.

ninaWakeupWordClcFilename String ""
Pass the name of the clc .dat file that you added to your application source.

ninaWakeupWordArray [String] null
Lets you configure the wakeup words that engine will listen to.
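
The four wakeup word properties are only useful together. The sketch below is a hypothetical pre-flight check that lists which of them are still empty before the feature is switched on. It assumes that both .dat file names and at least one wakeup word are needed whenever ninaWakeupWordRequired is true, which this guide implies but does not state. WakeupWordSettings is a stand-in struct, not an SDK type, and the file name and phrase are placeholders.

```swift
// Stand-in mirroring the four documented wakeup word properties.
struct WakeupWordSettings {
    var ninaWakeupWordRequired = false
    var ninaWakeupWordAcmodFilename = ""
    var ninaWakeupWordClcFilename = ""
    var ninaWakeupWordArray: [String]? = nil
}

// Hypothetical check: returns the names of the properties that still need a value.
func missingWakeupWordSettings(_ s: WakeupWordSettings) -> [String] {
    guard s.ninaWakeupWordRequired else { return [] }   // feature off: nothing to check
    var missing: [String] = []
    if s.ninaWakeupWordAcmodFilename.isEmpty { missing.append("ninaWakeupWordAcmodFilename") }
    if s.ninaWakeupWordClcFilename.isEmpty { missing.append("ninaWakeupWordClcFilename") }
    if (s.ninaWakeupWordArray ?? []).isEmpty { missing.append("ninaWakeupWordArray") }
    return missing
}

var settings = WakeupWordSettings()
settings.ninaWakeupWordRequired = true
settings.ninaWakeupWordAcmodFilename = "acmod_example.dat"   // placeholder file name
settings.ninaWakeupWordArray = ["Hello Nina"]                // placeholder wakeup phrase
print(missingWakeupWordSettings(settings))
```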

Translator API

The SDK exposes an API that allows the application to programmatically control the state of the Translator.

The following API methods from NinaMobileController.getInstance() let the application register listeners for various Translator events.

- (void) addRecognitionResponseDelegage:(id _Nonnull)delegate;

NinaMobileRecognitionResponseDelegate events are fired to let the application know the various states of speech interpretation.

-(void) recognitionNoMatch

-(void) recognitionNoInput

-(void) recognitionCancelled

-(void) recognitionIntermediateResponseAvailable:(NSString* _Nullable) text

-(void) recognitionFinalTranscriptionResponseAvailable:(NSString* _Nullable) text

- (void) addRecorderProgressDelegage:(id _Nonnull)delegate

MMFAudioProgressNotificationDelegate events are fired to let the application know the various states of the recorder.

-(void) recognitionStarted

-(void) recognitionStartOfSpeechDetected

-(void) recognitionEndOfSpeechDetected

-(void) recognitionStopped

-(void) recognitionEnergyDetected:(float)energy

- (void) addTTSPlaybackDelegage:(id _Nonnull)delegate

MMFTTSPlayBackNotificationDelegate events are fired to let the application know the various TTS playback states.

-(void) playbackStarted

-(void) playbackStopped

-(void) playbacksCompleted

-(void) errorDuringPlayback:(NSError*_Nullable) error

- (void) removeRecognitionResponseDelegate:(id _Nonnull)delegate

- (void) removeRecorderProgressDelegate:(id _Nonnull)delegate

- (void) removeTTSPlaybackDelegate:(id _Nonnull)delegate

The following API methods from NinaMobileController.getInstance() let the application trigger Translator functionality.

-(void) startListeningAndDoRecognition

The SDK starts listening to the speech input. The listeners from the section above are fired to notify the app about the various states of the recognition.

-(void) stopListening;

The SDK stops listening to the speech. The recognitionFinalTranscriptionResponseAvailable event is fired with the final recognized text, if any.

-(void) cancelListening;

The SDK cancels listening to the speech.

-(void) startPlaybackOf:(NSString * _Nullable)inputText;

The SDK requests TTS playback of the given input string. The playbacksCompleted event is fired when all queued prompts have finished playing.

-(void)stopPlayback;

The SDK tries to cancel the prompt, provided there is a prompt being played or queued.

-(Boolean) isPlaying;

Returns whether the Translator SDK is currently playing TTS.

-(Boolean) isRecording;

Returns whether the Translator SDK is currently recording.
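
The recognition callbacks listed earlier could be handled by a small listener like the one sketched below. To keep the snippet self-contained, the protocol is declared here as a Swift stand-in using the method names from this guide; the real NinaMobileRecognitionResponseDelegate is an Objective-C protocol provided by the SDK, and its Swift signatures may differ. TranscriptCollector is a hypothetical class, and the callbacks are invoked by hand to simulate what the SDK would do after startListeningAndDoRecognition.

```swift
// Stand-in for the SDK protocol, using the method names listed in this guide.
protocol NinaMobileRecognitionResponseDelegate: AnyObject {
    func recognitionNoMatch()
    func recognitionNoInput()
    func recognitionCancelled()
    func recognitionIntermediateResponseAvailable(_ text: String?)
    func recognitionFinalTranscriptionResponseAvailable(_ text: String?)
}

// Hypothetical listener that keeps the latest partial text and the final transcript.
final class TranscriptCollector: NinaMobileRecognitionResponseDelegate {
    private(set) var partialText = ""
    private(set) var finalText: String?
    private(set) var status = "idle"

    func recognitionNoMatch() { status = "no match" }
    func recognitionNoInput() { status = "no input" }
    func recognitionCancelled() { status = "cancelled" }

    func recognitionIntermediateResponseAvailable(_ text: String?) {
        partialText = text ?? ""
        status = "listening"
    }

    func recognitionFinalTranscriptionResponseAvailable(_ text: String?) {
        finalText = text
        status = "done"
    }
}

// Simulated event sequence; in an app the SDK would invoke these callbacks
// once the collector has been registered as a recognition response delegate.
let collector = TranscriptCollector()
collector.recognitionIntermediateResponseAvailable("order a")
collector.recognitionIntermediateResponseAvailable("order a large pizza")
collector.recognitionFinalTranscriptionResponseAvailable("Order a large pizza.")
print(collector.status, collector.finalText ?? "")
```

Keeping the partial and final text separate lets the UI show live feedback while wordStream results arrive and only act on the final transcription.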

// `a` is not defined in this excerpt; it selects between two server environments.
// `vc` is the NuanceMessagingViewController instance.
let serverConfiguration = NinaServerConfiguration.init()
serverConfiguration.applicationKey = a == 0 ? "companyName_appName" : "NuancePS_PizzaAU"
serverConfiguration.verificationCode = "5796d9e101d5355f5dbf95a3681f6ca5317fd9e96e5ffacac5e1305361a4eb1c"
serverConfiguration.gateWayAddress = a == 0 ? "webapi-demo.nuance.mobi" : "n4a-webapi.nuance.mobi" // webapi-demo.nuance.mobi // cobra-webapi.nuance.mobi
serverConfiguration.gateWayPath = "webapi-platform/websocket"
serverConfiguration.gateWayPort = "443"
serverConfiguration.gateWayScheme = "wss"
vc.setNinaServerConfigurationProperties(properties: serverConfiguration)

let ninaSetting = NinaSetting.init()
ninaSetting.sessionValues.speechRecognitionCodec = "pcm_16_16k"
ninaSetting.sessionValues.speechSynthesisCodec = "pcm_16_16k"
ninaSetting.synthesisValues.voice = a == 0 ? "Tom" : "Nathan"
ninaSetting.synthesisValues.parameterType = "ssml"
ninaSetting.serverTimeOut = 1.0
vc.setNinaSettingProperties(properties: ninaSetting)

