Nuance Translator SDK
Nuance Translator SDK is a platform for adding speech recognition (ASR) and speech synthesis (text-to-speech, TTS) capabilities to your mobile applications. The Translator SDK is integrated into the NuanceMessaging UI framework to provide ASR/TTS functionality in the messaging window.
This guide covers the APIs that can be used to add ASR/TTS functionality to the NuanceMessaging UI framework.
You can initialize the Translator SDK by passing an instance of the NinaServerConfiguration class.
NuanceMessagingViewController
Pass the NinaServerConfiguration instance to the following method exposed by NuanceMessagingViewController:
public func setNinaServerConfigurationProperties(properties: NinaServerConfiguration)
NinaServerConfiguration Properties
The following properties let you configure the server connection.
-
NSString *applicationKey
The required name of the application that will map to the server configuration, in the format CompanyName_AppName, for example "JavaBeanz_Orders"
-
NSString *authenticationType
The type of authentication. Allowed values: "jws", "hashcode"
-
NSString *authenticationdata
Authentication data based on your authentication method.
-
NSString *verificationCode
The required verification code if using "verification code" as an authentication method.
-
NSString *gateWayScheme
Sets the scheme of the server address used for processing the speech and text.
Default value: "wss"
-
NSString *gateWayAddress
Sets the address of the server used for processing the speech and text.
-
NSString *gateWayPort
Sets the port of the server used for processing the speech and text.
Default value: "443"
-
NSString *gateWayPath
Sets the path of the server address used for processing the speech and text.
Default value: "webapi-platform/websocket"
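As a rough illustration of how the four gateWay* values relate, the sketch below assembles them into a single WebSocket URL with Foundation's URLComponents. How the SDK combines these values internally is an assumption here (the form scheme://address:port/path), and the helper `gatewayURL` and the host `example.invalid` are hypothetical names used only for this example.

```swift
import Foundation

/// Hypothetical helper (not part of the SDK): joins scheme, address, port and
/// path into one URL, assuming the form scheme://address:port/path.
func gatewayURL(scheme: String, address: String, port: String, path: String) -> URL? {
    var components = URLComponents()
    components.scheme = scheme
    components.host = address
    components.port = Int(port)   // stays nil if the port string is not numeric
    // URLComponents needs an absolute path when a host is present.
    components.path = path.hasPrefix("/") ? path : "/" + path
    return components.url
}

// Documented defaults for scheme, port and path; the host is a placeholder.
let url = gatewayURL(scheme: "wss",
                     address: "example.invalid",
                     port: "443",
                     path: "webapi-platform/websocket")
print(url?.absoluteString ?? "invalid gateway settings")
```

Building the URL this way can be a quick sanity check on the four values before they are handed to NinaServerConfiguration.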
The NinaSetting class allows the application to set up ASR/TTS configuration parameters.
NuanceMessagingViewController
Pass the NinaSetting instance to the following method exposed by NuanceMessagingViewController:
public func setNinaSettingProperties(properties: NinaSetting)
The following properties are exposed by the NinaSetting class.
Properties
-
SynthesisValues * synthesisValues
An instance of SynthesisValues object.
-
RecognitionValues *recognitionValues
An instance of RecognitionValues object.
-
SessionValues * sessionValues
An instance of SessionValues object.
The SynthesisValues class allows the application to update the default parameters used by the TTS engine.
The following properties are exposed by the SynthesisValues class.
Properties
-
NSString *parameterType
The type of data you provide. If none is provided, the default is taken from the customer config file.
Default value: text
Allowed values:"text","ssml"
-
BOOL statistics
Whether or not to return statistics on the command's performance.
Default value: false
Allowed values: "true", "false"
-
NSString *voice
The voice in which speech synthesis is to be done. If none is provided, the default is taken from your customer config file. Note that this parameter works only with the type "text".
Default value: "Carol"
The RecognitionValues class allows the application to update the default parameters used by the ASR engine.
The following properties are exposed by the RecognitionValues class.
Properties
-
NSString *speechDetector
The speech detector to be used for voice activity detection. If none provided, the default will be taken from your customer config file.
Default value: legacy
Allowed values: "legacy", "adaptive"
-
int beginNoiseSampleFrames
The number of frames taken at the start of recognition to assess the environment's noise level. If none provided, the default will be taken from your customer config file.
Default value: 10
Size range: 1..2^31-1
-
double voiceThreshold
The amount of energy required for a frame to be considered 'voiced', as a factor of the background noise level. If none provided, the default will be taken from your customer config file.
Default value: 4.0
Size range: -2^63..2^63-1
-
int startOfSpeechVoicedFrames
Of the most recent startOfSpeechHistoryFrames, the number of 'voiced' frames required to assess that the user has started speaking. If none provided, the default will be taken from your customer config file.
Default value: 7
Size range: -2^31..2^31-1
-
int startOfSpeechHistoryFrames
The number of historical frames to use when assessing that the user has started speaking. If none provided, the default will be taken from your customer config file.
Default value: 15
Size range: -2^31..2^31-1
-
int endOfSpeechVoicedFrames
Of the most recent endOfSpeechHistoryFrames, the number of 'voiced' frames which will indicate the user has not finished speaking. If none provided, the default will be taken from your customer config file.
Default value: 5
Size range: -2^31..2^31-1
-
int endOfSpeechHistoryFrames
The number of historical frames to use when assessing that the user has finished speaking. If none provided, the default will be taken from your customer config file.
Default value: 50
Size range: -2^31..2^31-1
-
BOOL considerNegativeRatios
Whether or not to consider negative ratios when assessing speech activity. If none provided, the default will be taken from your customer config file.
Default value: false
-
BOOL stopOnEndOfSpeech
Whether or not to perform end-point detection. If none provided, the default will be taken from your customer config file.
Default value: true
-
BOOL wordStream
Whether or not to send intermediate recognition results. If none provided, the default will be taken from your customer config file.
Default value: true
-
NSArray *activeDynamicVocabularySets
The list of dynamic vocabularies (identified by their ids) to activate for this speech recognition command. NOTE: Do not upload more than 500 entries, as this may noticeably degrade performance.
-
int startOfSpeechTimeoutSeconds
How long to wait for start-of-speech before the server cancels a recognition, in seconds. Not applicable if endPointDetection is false. Use 0 (zero) for no limit.
Default value: 5
Size range: 0..2^31-1
-
int utteranceMaxTimeSeconds
The maximum allowed time for an utterance, in seconds. When endPointDetection is true, this time begins when start-of-speech is detected. Use 0 (zero) for no limit.
Default value: 30
Size range: 0..2^31-1
-
BOOL statistics
Whether or not to return statistics on the command's performance.
Default value: true
-
BOOL isNoiseLevelRequired
The specific background noise level energy values to use for speech detection
Default value: false
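To make the interplay between voiceThreshold, startOfSpeechVoicedFrames and startOfSpeechHistoryFrames more concrete, here is a minimal sketch of a sliding-window start-of-speech check. It is only an illustration of the descriptions above, not the SDK's or the server's actual detector; the type `StartOfSpeechDetector`, its method, and the energy values are hypothetical.

```swift
/// Illustrative start-of-speech check; NOT the SDK's implementation.
struct StartOfSpeechDetector {
    let voiceThreshold: Double      // energy factor over the background noise level
    let voicedFramesRequired: Int   // cf. startOfSpeechVoicedFrames
    let historyFrames: Int          // cf. startOfSpeechHistoryFrames
    private var history: [Bool] = []

    // Defaults mirror the documented default values.
    init(voiceThreshold: Double = 4.0, voicedFramesRequired: Int = 7, historyFrames: Int = 15) {
        self.voiceThreshold = voiceThreshold
        self.voicedFramesRequired = voicedFramesRequired
        self.historyFrames = historyFrames
    }

    /// Feeds one frame; returns true once enough recent frames are voiced.
    mutating func process(frameEnergy: Double, noiseLevel: Double) -> Bool {
        let isVoiced = frameEnergy >= voiceThreshold * noiseLevel
        history.append(isVoiced)
        if history.count > historyFrames {
            history.removeFirst()   // keep only the most recent frames
        }
        return history.filter { $0 }.count >= voicedFramesRequired
    }
}

var detector = StartOfSpeechDetector()
let noiseLevel = 1.0
// Three quiet frames followed by seven loud ones (made-up energy values).
let energies = [1.0, 1.2, 0.9] + Array(repeating: 5.0, count: 7)
var results: [Bool] = []
for energy in energies {
    results.append(detector.process(frameEnergy: energy, noiseLevel: noiseLevel))
}
let firstDetectedIndex = results.firstIndex(of: true)
print(firstDetectedIndex ?? -1)
```

In this sketch, raising voicedFramesRequired or voiceThreshold makes detection more conservative, while a longer history window tolerates more gaps between voiced frames.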
The SessionValues class allows the application to update the default parameters used to initialize an ASR/TTS session.
The following properties are exposed by the SessionValues class.
Properties
-
NSString * speechSynthesisCodec
The codec to be used during speech synthesis. If none provided, the default will be taken from your customer config file.
Default value: pcm_16_8k
Allowed values: "pcm_16_8k", "pcm_16_16k", "opus_nb", "opus_wb", "ulaw"
-
NSString * speechRecognitionCodec
The codec to be used during speech recognition. If none provided, the default will be taken from your customer config file.
Default value: pcm_16_8k
Allowed values: "pcm_16_8k", "pcm_16_16k", "opus_nb", "opus_wb"
-
NSString * clientUserId
String defined by the client application to identify itself.
Default value: ""
-
NSString * clientDeviceId
String defined by the client application to identify the device.
Default value: ""
-
NSString * clientSessionId
String defined by the client application to identify its session.
Default value: ""
-
NSString * clientOsName
String defined by the client application to identify the OS it runs on.
Default value: "iOS"
-
NSString * clientOsVersion
String defined by the client application to identify the version of the OS it runs on.
Default value: "<device version>"
Styling Translator Persona
The properties for customizing the Translator persona are exposed by the NinaCobraProperties class. The application has to pass an instance of this class, with the modified properties, into the framework.
public func setNinaActionSoundProperties(properties: NinaCobraProperties)
The following properties in NinaCobraProperties let you change the look and feel of the Translator persona.
Name | Type | Default | Description |
---|---|---|---|
successChimeSoundData | NSData | nil | You need to pass the sound data that is played when the SDK has successfully recorded user text. |
connectionSuccessChimeSoundData | NSData | nil | You need to pass the sound data that is played when the SDK has established a session with the ASR/TTS server. |
connectionFailedChimeSoundData | NSData | nil | You need to pass the sound data that is played when the SDK fails to establish a connection with the ASR/TTS server or is unexpectedly disconnected. |
errorChimeSoundData | NSData | nil | You need to pass the sound data that is played when the SDK fails to record user text. |
ninaProcessingViewBackgroundColor | UIColor | UIColor.gray | Lets you configure the background color of the processing view. |
ninaProcessingViewLeftMargin | Int | 3 | Lets you configure the processing view left margin. |
ninaProcessingViewRightMargin | Int | 0 | Lets you configure the processing view right margin. |
ninaProcessingViewTopMargin | Int | 4 | Lets you configure the processing view top margin. |
ninaProcessingViewBottomMargin | Int | 2 | Lets you configure the processing view bottom margin. |
ninaProcessingText | String | Processing... | Lets you configure the processing view label text. |
ninaProcessingTextColor | UIColor | UIColor.white | Lets you configure the processing view label text color. |
ninaProcessingTextBackgroundColor | UIColor | UIColor.clear | Lets you configure the processing view label background color. |
ninaProcessingTextFontStyle | UIFont | UIFont(name: kFontName, size: 15.0) | Lets you configure the processing view label font style. |
ninaProcessingTextAllignment | NSTextAlignment | NSTextAlignment.center | Lets you configure the processing view label text alignment. |
ninaProcessingTextLeftMargin | Int | 12 | Lets you configure the processing view label text left margin. |
ninaProcessingTextRightMargin | Int | 63 | Lets you configure the processing view label text right margin. |
ninaProcessingTextTopMargin | Int | 10 | Lets you configure the processing view label text top margin. |
ninaProcessingTextBottomMargin | Int | 10 | Lets you configure the processing view label text bottom margin. |
ninaSleepingAnimationImage | UIImage | nil | You must pass the name of the image sequence that is played when the translator is sleeping. See the example for sample images. |
ninaProcessingAnimationImage | UIImage | nil | You must pass the name of the image sequence that is played when the translator is processing. See the example for sample images. |
ninaAlertAnimationImage | UIImage | nil | You must pass the name of the image sequence that is played when the translator encounters an error while recording. See the example for sample images. |
ninaSucessAnimationImage | UIImage | nil | You must pass the name of the image sequence that is played when the translator has recognized and parsed the user's voice. See the example for sample images. |
ninaListeningAnimationImage | UIImage | nil | You must pass the name of the image sequence that is played when the translator is recording the user's voice. See the example for sample images. |
ninaRecordingEnergyPromptImages | UIImage | nil | You must pass the name of the image sequence that is played to show the user's speaking energy. See the example for sample images. |
ninaAnimCenterBackgroundImage | UIImage | nil | You can set an image that is displayed at the center of the translator persona, provided the other animation images are transparent. See the example for sample images. |
ninaIdleStaticImage | UIImage | nil | You can set an image that is displayed when the translator is in between all of the above states. See the example for sample images. |
The following properties let you change the layout of the Translator persona so that it is centered above the text input.
Name | Type | Default | Description |
---|---|---|---|
showNinaAnimInCenter | Bool | false | Lets you move the persona image to the center, above the text input. |
ninaAnimCenterHolderHeight | Int | 40 | Lets you set the height of the view that holds the persona image. |
ninaAnimCenterHolderColor | UIColor | UIColor.gray | Lets you set the color of the view that holds the persona image. |
ninaCenterViewBackgroundColor | UIColor | UIColor.gray | Lets you set the background color of the container view that holds the persona image and its holder. |
The following properties let you change the style of the Translator persona overlay.
Name | Type | Default | Description |
---|---|---|---|
ninaNoInternetText | String | No Internet | Lets you change the string displayed in the overlay when there is no internet connection. |
ninaOverlayTextColor | UIColor | UIColor.white | Lets you change the overlay text color. |
ninaOverlayBackgroundColor | UIColor | UIColor(red: 85.0/255.0, green: 85.0/255.0, blue: 85.0/255.0, alpha: 0.6) | Lets you change the overlay background color. |
ninaOverlayOpacity | Float | 0.6 | Lets you change the overlay opacity. |
ninaOverlayFontStyle | UIFont | UIFont(name: kFontName, size: 6.5)! | Lets you change the overlay text font style. |
ninaNoInternetImage | UIImage | nil | Lets you set an image, instead of text, to indicate that there is no internet connection. |
The following properties let you override the default ASR/TTS behavior.
Name | Type | Default | Description |
---|---|---|---|
isAsrTTSRequired | Bool | false | Lets you turn on ASR/TTS functionality in the SDK messaging screen. |
enableContinousListening | Bool | false | Lets you turn on continuous listening mode. |
playTTSOnRestore | Bool | false | Lets you configure how agent messages are read upon restore. By default, no messages are read. The app can configure the SDK to read all messages upon restore. |
playTTSOnlyOnASRInput | Bool | false | Lets you configure how agent messages are read. By default, agent messages are always read. The app can configure the SDK to read them only if speech recognition is used. |
playOpener | Bool | true | Lets you configure whether the opener is played. |
Configuring WakeUp Word
The wake-up word functionality lets the customer wake up the translator engine and make it start listening immediately.
Wake-up word functionality is not part of the regular Nuance NDEP SDK release. Upon request, Nuance will release a special build that includes it. In addition, you will have to include iOSWakeupWord.framework and the language pack files in your workspace.
The following properties in NinaCobraProperties let you configure the wake-up word functionality.
Name | Type | Default | Description |
---|---|---|---|
ninaWakeupWordRequired | Bool | false | Lets you turn on the wake-up word functionality. |
ninaWakeupWordAcmodFilename | String | "" | Pass the name of the acmod .dat file that is added to your application source code. |
ninaWakeupWordClcFilename | String | "" | Pass the name of the clc .dat file that is added to your application source code. |
ninaWakeupWordArray | [String] | nil | Lets you configure the wake-up words that the engine will listen for. |
Translator API
The SDK exposes an API that allows the application to programmatically control the state of the Translator.
The following API methods from NinaMobileController.getInstance() let the application register listeners for various Translator events.
-
- (void) addRecognitionResponseDelegage:(id _Nonnull)delegate;
NinaMobileRecognitionResponseDelegate events are fired to let the application know the various states of speech interpretation:
-(void) recognitionNoMatch
-(void) recognitionNoInput
-(void) recognitionCancelled
-(void) recognitionIntermediateResponseAvailable:(NSString* _Nullable) text
-(void) recognitionFinalTranscriptionResponseAvailable:(NSString* _Nullable) text
-
- (void) addRecorderProgressDelegage:(id _Nonnull)delegate;
MMFAudioProgressNotificationDelegate events are fired to let the application know the various states of the recorder:
-(void) recognitionStarted
-(void) recognitionStartOfSpeechDetected
-(void) recognitionEndOfSpeechDetected
-(void) recognitionStopped
-(void) recognitionEnergyDetected:(float)energy
-
- (void) addTTSPlaybackDelegage:(id _Nonnull)delegate;
MMFTTSPlayBackNotificationDelegate events are fired to let the application know the various TTS playback states:
-(void) playbackStarted
-(void) playbackStopped
-(void) playbacksCompleted
-(void) errorDuringPlayback:(NSError*_Nullable) error
-
- (void) removeRecognitionResponseDelegate:(id _Nonnull)delegate;
-
- (void) removeRecorderProgressDelegate:(id _Nonnull)delegate;
-
- (void) removeTTSPlaybackDelegate:(id _Nonnull)delegate;
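The recognition callbacks listed above can be handled by a small object that tracks partial and final transcripts. The sketch below uses a stand-in protocol, `RecognitionResponseHandling`, that mirrors the documented NinaMobileRecognitionResponseDelegate methods so that the example compiles without the SDK; the protocol name, the `TranscriptCollector` class, and the Swift method spellings are assumptions, and a real app would conform to the SDK's own delegate protocol and register the object with the controller.

```swift
/// Stand-in protocol mirroring the documented recognition callbacks.
/// Hypothetical: in an app, use the SDK's NinaMobileRecognitionResponseDelegate.
protocol RecognitionResponseHandling: AnyObject {
    func recognitionNoMatch()
    func recognitionNoInput()
    func recognitionCancelled()
    func recognitionIntermediateResponseAvailable(_ text: String?)
    func recognitionFinalTranscriptionResponseAvailable(_ text: String?)
}

/// Hypothetical handler that keeps the latest partial and final transcripts.
final class TranscriptCollector: RecognitionResponseHandling {
    private(set) var partialText = ""
    private(set) var finalText: String?
    private(set) var lastOutcome = "idle"

    func recognitionNoMatch() { lastOutcome = "noMatch" }
    func recognitionNoInput() { lastOutcome = "noInput" }
    func recognitionCancelled() {
        partialText = ""
        lastOutcome = "cancelled"
    }

    func recognitionIntermediateResponseAvailable(_ text: String?) {
        partialText = text ?? ""   // the documented parameter is nullable
        lastOutcome = "partial"
    }

    func recognitionFinalTranscriptionResponseAvailable(_ text: String?) {
        finalText = text
        partialText = ""           // the partial result is superseded
        lastOutcome = "final"
    }
}

// Simulated event sequence; with the SDK these calls would come from the controller.
let collector = TranscriptCollector()
collector.recognitionIntermediateResponseAvailable("order a")
collector.recognitionFinalTranscriptionResponseAvailable("order a pizza")
print(collector.finalText ?? "")
```

Keeping the partial and final text separate makes it straightforward to show live feedback while the user speaks and to act only on the final transcription.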
The following API methods from NinaMobileController.getInstance() let the application trigger Translator functionality.
-
-(void) startListeningAndDoRecognition
The SDK starts listening to the speech input. The listeners from the above section are fired to notify the app about the various states of the recognition.
-
-(void) stopListening;
The SDK stops listening to the speech. The recognitionFinalTranscriptionResponseAvailable event is fired with the final recognized text, if any.
-
-(void) cancelListening;
The SDK cancels listening to the speech.
-
-(void) startPlaybackOf:(NSString * _Nullable)inputText;
The SDK requests TTS playback of the given input string. The playbacksCompleted event is fired when all queued prompts have finished playing.
-
-(void)stopPlayback;
The SDK tries to cancel the prompt, provided a prompt is being played or queued.
-
-(Boolean) isPlaying;
Checks the TTS playback state of the Translator SDK.
-
-(Boolean) isRecording;
Checks the recording state of the Translator SDK.
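As one possible way to combine these control methods, the sketch below implements a push-to-talk style toggle. It is written against a stand-in protocol, `TranslatorControlling`, that mirrors a subset of the documented methods so that it runs without the SDK; the protocol, `toggleListening`, and `FakeTranslator` are hypothetical names, and stopping playback before listening is a design choice of this example rather than documented SDK behavior.

```swift
/// Stand-in for a subset of the documented control methods (hypothetical protocol).
protocol TranslatorControlling: AnyObject {
    func startListeningAndDoRecognition()
    func stopListening()
    func stopPlayback()
    func isPlaying() -> Bool
    func isRecording() -> Bool
}

/// Hypothetical toggle: stops an active recording, otherwise starts one.
func toggleListening(on controller: TranslatorControlling) {
    if controller.isRecording() {
        controller.stopListening()
        return
    }
    // Avoid recording the TTS prompt itself (a choice made for this sketch).
    if controller.isPlaying() {
        controller.stopPlayback()
    }
    controller.startListeningAndDoRecognition()
}

/// Minimal fake used only to exercise the sketch.
final class FakeTranslator: TranslatorControlling {
    private var recording = false
    private var playing = true
    private(set) var calls: [String] = []

    func startListeningAndDoRecognition() { recording = true; calls.append("start") }
    func stopListening() { recording = false; calls.append("stop") }
    func stopPlayback() { playing = false; calls.append("stopPlayback") }
    func isPlaying() -> Bool { return playing }
    func isRecording() -> Bool { return recording }
}

let fake = FakeTranslator()
toggleListening(on: fake)   // stops the prompt, then starts listening
toggleListening(on: fake)   // stops listening
print(fake.calls)
```

In an app, a thin adapter around NinaMobileController.getInstance() could play the role of the fake, which keeps the toggle logic testable without audio hardware.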
The following Swift sample configures the server connection and the ASR/TTS settings.
// `a` is an app-defined flag that selects between two environments;
// `vc` is the app's NuanceMessagingViewController instance.
let serverConfiguration = NinaServerConfiguration()
serverConfiguration.applicationKey = a == 0 ? "companyName_appName" : "NuancePS_PizzaAU"
serverConfiguration.verificationCode = "5796d9e101d5355f5dbf95a3681f6ca5317fd9e96e5ffacac5e1305361a4eb1c"
// Other hosts mentioned in the original sample: cobra-webapi.nuance.mobi
serverConfiguration.gateWayAddress = a == 0 ? "webapi-demo.nuance.mobi" : "n4a-webapi.nuance.mobi"
serverConfiguration.gateWayPath = "webapi-platform/websocket"
serverConfiguration.gateWayPort = "443"
serverConfiguration.gateWayScheme = "wss"
vc.setNinaServerConfigurationProperties(properties: serverConfiguration)
// ASR/TTS settings: use 16 kHz PCM codec values for both recognition and synthesis.
let ninaSetting = NinaSetting()
ninaSetting.sessionValues.speechRecognitionCodec = "pcm_16_16k"
ninaSetting.sessionValues.speechSynthesisCodec = "pcm_16_16k"
// Per the SynthesisValues description, `voice` works only with the type "text".
ninaSetting.synthesisValues.voice = a == 0 ? "Tom" : "Nathan"
ninaSetting.synthesisValues.parameterType = "ssml"
ninaSetting.serverTimeOut = 1.0
vc.setNinaSettingProperties(properties: ninaSetting)