Voice instructions

VERSION 0.2.2746
PUBLIC PREVIEW

The Navigation SDK for iOS is only available upon request. Contact us to get started.

The Navigation module uses Text To Speech (TTS) functionality to generate voice instructions. TTS is provided as separate modules, so you have to add the TomTomSDKTextToSpeech and TomTomSDKDefaultTextToSpeech dependencies to the Podfile and install them by running the pod install command. Once the modules are added to your target, import the following frameworks:

import TomTomSDKRoute
import TomTomSDKRoutePlanner
import TomTomSDKTextToSpeech
import TomTomSDKTextToSpeechEngine

TTS engine

The TTS engine is responsible for providing voice synthesis for messages. The Navigation module can use TomTomSDKDefaultTextToSpeech based on the AVFoundation framework. Alternatively, you can also define a custom engine for text to speech conversion. Any custom engine must conform to the TextToSpeechEngine interface.

The TextToSpeech class is a facade for performing operations on TextToSpeechEngine. It takes care of queuing messages based on priority.

You can create TextToSpeech in two different ways.

  • A TextToSpeech that uses the default SystemTextToSpeechEngine engine underneath:
    let languageCode = "en-GB"
    let tts = SystemTextToSpeechEngine(language: languageCode)
    self.tts = makeTextToSpeech(ttsEngine: tts)
  • A TextToSpeech with a custom TextToSpeechEngine for voice synthesis:
    tts = makeTextToSpeech(ttsEngine: customEngine)

TextToSpeechEngineDelegate

You can be notified when the TextToSpeechEngine is ready for use. The readiness notification is delivered through the onReady() method of TextToSpeechEngineDelegate.

func onReady() {
    /* YOUR CODE GOES HERE */
}

Playing messages

The TextToSpeech.play(message: TTSMessage, priority: TTSMessagePriority) method synthesizes the provided message using the underlying TextToSpeechEngine. The TTSMessage parameter is an enum with two cases:

  • audio(message: String, type: MessageType) - a message with handling options, where MessageType can be plain or SSML.
  • tagged(message: String, phonetics: [PhoneticTranscription]) - a message with phonetic transcriptions that should be parsed for correct playback.

Message queuing depends on priority. If the message currently being synthesized has a priority equal to or higher than that of the new message, the new message is added to the queue (taking the priorities of queued messages into account). If the message currently being synthesized has a lower priority than the new one, it is interrupted and the new message is processed right away. Messages remain in the queue for the time defined as the priority timeout. There are default timeouts for each priority level, but custom timeouts can also be provided:

let priority = TTSMessagePriority(timeout: 120)
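
The queuing rules described above can be illustrated with a small, self-contained model. This is only a sketch of the described behavior, not the SDK's implementation: SketchPriority, QueuedMessage, and SketchMessageQueue are hypothetical names, the numeric level field is an assumption, and the sketch assumes that an interrupted message is dropped and that expired messages are discarded when the next message is picked.

```swift
import Foundation

// Hypothetical stand-ins for illustration; none of these types belong to the SDK.
struct SketchPriority {
    let level: Int             // assumption: a higher value means more urgent
    let timeout: TimeInterval  // seconds a message may wait in the queue
}

struct QueuedMessage {
    let text: String
    let priority: SketchPriority
    let enqueuedAt: TimeInterval
}

struct SketchMessageQueue {
    private(set) var current: QueuedMessage?
    private(set) var pending: [QueuedMessage] = []

    /// Returns true when the new message starts playing right away.
    mutating func play(_ text: String, priority: SketchPriority, now: TimeInterval) -> Bool {
        let incoming = QueuedMessage(text: text, priority: priority, enqueuedAt: now)
        if let playing = current, playing.priority.level >= incoming.priority.level {
            // Equal or higher priority is playing: queue the new message behind
            // every pending message whose level is equal or higher.
            let index = pending.firstIndex(where: { $0.priority.level < incoming.priority.level })
                ?? pending.endIndex
            pending.insert(incoming, at: index)
            return false
        }
        // Nothing is playing, or a lower-priority message is: take over immediately.
        // Assumption: the interrupted message is dropped rather than re-queued.
        current = incoming
        return true
    }

    /// Call when the current message finishes; picks the next non-expired message.
    mutating func finishCurrent(now: TimeInterval) {
        pending.removeAll { now - $0.enqueuedAt > $0.priority.timeout }
        current = pending.isEmpty ? nil : pending.removeFirst()
    }
}
```

Passing the current time as a parameter, instead of reading a clock inside the queue, keeps the sketch deterministic and easy to exercise in a playground.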

Playing an audio message

To play an audio message, use the TextToSpeech.play(message: TTSMessage, priority: TTSMessagePriority) method.

let message = "In 300 meters turn left"
let audioMessage = TTSMessage.audio(message: message, type: .plain)
let priority = TTSMessagePriority(timeout: 10)
tts.play(message: audioMessage, priority: priority)

The audio message can also be provided in Speech Synthesis Markup Language (SSML) format.

let ssmlMessage =
    "<speak>Turn left onto <phoneme alphabet='ipa' ph='e¬¬.¬f¬¬¬'>A4</phoneme> " +
    "towards <phoneme alphabet='ipa' ph=\"'sxep.fart.my.'2ze.^m\">Scheepvaartmuseum</phoneme></speak>"
let audioMessage = TTSMessage.audio(message: ssmlMessage, type: .ssml)
let priority = TTSMessagePriority(timeout: 10)
tts.play(message: audioMessage, priority: priority)

Playing a tagged message

You can also pass a tagged message, with phonetics to be substituted, to the same TextToSpeech.play(message: TTSMessage, priority: TTSMessagePriority) method. To create a tagged message, provide the message text along with the tags to be synthesized, as in the example below. The second parameter of a tagged message is an array of PhoneticTranscription values. To create a PhoneticTranscription, provide:

  • List of phonetic transcriptions of phrases that are tagged in the message.
  • List of language codes in IETF format, sorted in the same order as the transcriptions.
  • Tag surrounding the phrase within the message.
  • Phonetic alphabet of the transcriptions - "ipa" in our case.

let roadNumbersPhonetics = TomTomSDKTextToSpeechEngine.PhoneticTranscription(
    transcriptions: ["e¬¬.¬f¬¬¬"],
    languageCodes: ["nl-NL"],
    tag: "roadNumber",
    alphabet: "ipa"
)

let signpostPhonetics = TomTomSDKTextToSpeechEngine.PhoneticTranscription(
    transcriptions: ["'sxep.fart.my.'2ze.^m"],
    languageCodes: ["nl-NL"],
    tag: "signpostText",
    alphabet: "ipa"
)

let phonetics = [roadNumbersPhonetics, signpostPhonetics]
let message = "Turn left onto <roadNumber>A4</roadNumber> towards <signpostText>Scheepvaartmuseum</signpostText>"
let taggedMessage = TTSMessage.tagged(message: message, phonetics: phonetics)

let priority = TTSMessagePriority(timeout: 10)
tts.play(message: taggedMessage, priority: priority)
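
How an engine consumes a tagged message is not specified in this guide. One plausible reading, sketched below purely as an assumption, is that each tagged phrase is rewritten as an SSML phoneme element, with the n-th occurrence of a tag paired with the n-th transcription. SketchTranscription and substitutePhonetics are hypothetical names that mirror only the fields listed above; they are not SDK types, and language codes are left out for brevity.

```swift
import Foundation

// Hypothetical mirror of the fields listed above; not the SDK's PhoneticTranscription.
struct SketchTranscription {
    let transcriptions: [String]
    let tag: String
    let alphabet: String
}

/// Rewrites the n-th `<tag>phrase</tag>` pair as a phoneme element that carries
/// the n-th transcription. Transcriptions are inserted verbatim, so values that
/// contain `"`, `<`, or `&` would need XML escaping first.
func substitute(_ entry: SketchTranscription, in text: String) -> String {
    let open = "<\(entry.tag)>"
    let close = "</\(entry.tag)>"
    var output = ""
    var rest = Substring(text)
    for transcription in entry.transcriptions {
        guard let openRange = rest.range(of: open),
              let closeRange = rest.range(of: close, range: openRange.upperBound..<rest.endIndex)
        else { break } // fewer tagged phrases than transcriptions: stop early
        let phrase = rest[openRange.upperBound..<closeRange.lowerBound]
        output.append(contentsOf: rest[..<openRange.lowerBound])
        output.append("<phoneme alphabet=\"\(entry.alphabet)\" ph=\"\(transcription)\">\(phrase)</phoneme>")
        rest = rest[closeRange.upperBound...]
    }
    output.append(contentsOf: rest)
    return output
}

/// Applies every transcription entry in turn and wraps the result in a speak element.
func substitutePhonetics(in message: String, using phonetics: [SketchTranscription]) -> String {
    let body = phonetics.reduce(message) { partial, entry in substitute(entry, in: partial) }
    return "<speak>\(body)</speak>"
}
```

Tags without a matching transcription are left in the text untouched, which makes missing phonetics easy to spot when inspecting the generated markup.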

Tracking voice synthesis status

To track the status of message synthesis, use the TextToSpeechEngineDelegate set on the implementation of the TextToSpeechEngine. All the methods provided by the delegate are optional. The following methods are available:

func didStart(message: TTSMessage) {
    /* YOUR CODE GOES HERE */
}

func didStop(message: TTSMessage) {
    /* YOUR CODE GOES HERE */
}

func didPause(message: TTSMessage) {
    /* YOUR CODE GOES HERE */
}

func didContinue(message: TTSMessage) {
    /* YOUR CODE GOES HERE */
}

func didFinish(message: TTSMessage, withError: Error) {
    /* YOUR CODE GOES HERE */
}

Language

The language of the underlying engine can be changed. This can be done via the TextToSpeech constructor or with the TextToSpeech.changeLanguage(languageCode: String) method.

tts.changeLanguage(languageCode: "en-US")

Volume

The volume of the underlying engine can be changed using the TextToSpeech.setVolume(_ volume: SpeechVolume) method. SpeechVolume is an enum with three cases: low, medium, and high. The volume is set to medium by default.

tts.setVolume(.high)

Stop voice synthesis

The voice synthesis can be stopped with the TextToSpeech.stop() method.

tts.stop()

TTS integration

While navigating, Navigation generates a guidance update after each location change. The generated guidance contains an announcement if the distance is in a suitable range. The NavigationGuidanceObserver method didGenerateAnnouncement(announcement: GuidanceAnnouncement) is called when an announcement is generated. The information in GuidanceAnnouncement can be used for voice instructions. You can find more details in the turn-by-turn navigation guide.

func didGenerateAnnouncement(announcement: GuidanceAnnouncement) {
    guard !announcement.message.isEmpty else { return }
    let phonetics = announcement.messagePhonetics

    var voicePhonetics = [TomTomSDKTextToSpeechEngine.PhoneticTranscription]()
    if let street = phonetics?.street,
       let languageCode = phonetics?.streetLanguageCode {
        let streetTranscription = TomTomSDKTextToSpeechEngine.PhoneticTranscription(
            transcriptions: [street],
            languageCodes: [languageCode],
            tag: DefaultTags.streetName,
            alphabet: DefaultPhoneticAlphabets.ipa
        )
        voicePhonetics.append(streetTranscription)
    }
    if let phonetics = phonetics,
       !phonetics.roadNumberLanguageCodes.isEmpty {
        let roadNumbersLanguageCode = phonetics.roadNumberLanguageCodes
        let roadNumbers = phonetics.roadNumbers
        let roadNumbersTranscription = TomTomSDKTextToSpeechEngine.PhoneticTranscription(
            transcriptions: roadNumbers,
            languageCodes: roadNumbersLanguageCode,
            tag: DefaultTags.roadNumber,
            alphabet: DefaultPhoneticAlphabets.ipa
        )
        voicePhonetics.append(roadNumbersTranscription)
    }

    let message: TTSMessage = .tagged(message: announcement.message, phonetics: voicePhonetics)
    let priority = TTSMessagePriority(timeout: 120)

    tts.play(message: message, priority: priority)
}