Three Methods for Implementing a Dialogflow Telephony Gateway

We are starting up a new series on using Dialogflow as an Interactive Voice Response (IVR) replacement. We have covered this topic a few times here, including looking at Dialogflow’s own Phone Gateway and the Gateway interface implementations of VoxImplant and SignalWire. Beyond just building simple demo systems, Chad has been exploring improved ways of using Dialogflow to implement an IVR replacement for telephony environments. Joining me to help with this series is Emiliano Pelliccioni. Emiliano is a developer at webRTC.ventures. He has worked on similar projects with Chad in the past and has continued researching this area. We decided to get together to share our research and experiments in this domain.

In this first post, we want to share some of the methods we explored for connecting Dialogflow to a phone call. Let’s first review what’s involved in making this connection, along with some nice-to-have features, before reviewing the methods.

We want to be able to dial a phone number and have Dialogflow handle the interaction as a voicebot IVR. To do this, there needs to be some kind of gateway that handles both signaling and media conversion. On the signaling side, the gateway needs to take the telephony signaling, which is almost always based on SIP, and use it to invoke the proper Dialogflow commands to launch and interact with the bot. This also includes handling hang-ups and the termination of the call.

Slightly more complicated is the media conversion that needs to take place. Dialogflow’s interface for real-time speech input is gRPC. The gateway needs to convert the SRTP or RTP media used by the telephony end to a gRPC bitstream using Dialogflow-friendly codecs. The gateway also needs to play back the response speech generated by Dialogflow (or use its own Text-to-Speech mechanism to vocalize Dialogflow’s response text).

What else should the Dialogflow Gateway handle?

Beyond basic connectivity, there are a few other features that will help to improve development and user interaction. A short list of the top features we evaluated is:

- Recording: not a hard requirement, but having a full recording of both parties is invaluable for debugging and improving the system.
- Call transfer: in most cases you need to give the user an option to talk to a human, or a transfer will be the natural outcome of the bot interaction.
- Playback interruption: ideally your voicebot could handle an asynchronous conversation, so the user could interrupt whatever the bot is saying and that speech would be processed.
- No activity detection: if you were in the middle of a conversation and it suddenly went silent, you would say “are you there?” The bot needs a mechanism to do something similar. If you were using Dialogflow to make a Google Assistant bot, they give you an actions_intent_NO_INPUT event and mechanisms to set up reprompt intents. The gateway needs to provide something similar.
- DTMF detection: even if the goal is to eliminate DTMF menus, sometimes it is nice to have DTMF as a backup option or alternative input method, especially if you are trying to do something like capture a phone number and Dialogflow cannot understand the caller.

The scope of our project was fairly limited, but other options could include customized Speech-to-Text (STT) and Text-to-Speech (TTS) engines instead of using the ones built into Dialogflow. This could allow for better coverage of custom vocabularies or unique voice synthesis.

There are also many cases when it is easier to send the user a link. A restaurant wouldn’t want to read off an entire menu; it is easier to just send a link to it. In practice, most callers use their mobile phones to call, which means they should be able to receive text messages. The gateway should help determine if the caller is on an SMS-capable device and send them text messages if that would help in the interaction. If you are going to send SMS, you should also be prepared to receive SMS and interact via text without requiring voice. In fact, this could allow ongoing dialog after the phone call has finished. To implement this, the gateway needs to keep some state on the user and manage interactions between the voice telephony and SMS environments.
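
To make the no activity detection feature described above more concrete, here is a minimal Python sketch of how a gateway might decide when to reprompt a silent caller. It is only an illustration under stated assumptions: the class name `NoInputTracker`, the 5-second timeout, the limit of two reprompts, and the returned action names are all hypothetical and do not come from Dialogflow or from any of the gateways discussed in this series.

```python
from dataclasses import dataclass


@dataclass
class NoInputTracker:
    """Hypothetical helper: decides what to do when the caller goes silent."""

    timeout_s: float = 5.0    # assumed silence threshold, in seconds
    max_reprompts: int = 2    # assumed number of reprompts before giving up
    last_activity: float = 0.0
    reprompts: int = 0

    def on_caller_activity(self, now: float) -> None:
        """Call whenever speech or DTMF is detected; resets the silence window."""
        self.last_activity = now
        self.reprompts = 0

    def poll(self, now: float) -> str:
        """Return 'wait', 'reprompt', or 'give_up' for the given timestamp."""
        if now - self.last_activity < self.timeout_s:
            return "wait"
        if self.reprompts >= self.max_reprompts:
            return "give_up"
        self.reprompts += 1
        self.last_activity = now  # restart the silence window after a reprompt
        return "reprompt"


tracker = NoInputTracker()
tracker.on_caller_activity(now=0.0)
actions = [tracker.poll(t) for t in (3.0, 5.0, 9.0, 10.0, 15.0)]
print(actions)
```

In a real gateway, a "reprompt" result could be mapped to whatever the bot uses for "are you there?", such as a no-input event sent to the Dialogflow agent or a locally played prompt, and "give_up" could trigger a hang-up or a transfer to a human.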