How to build a personalized IVR with DTMF and Speech 

Interactive Voice Response (IVR) is a technology that enables businesses to provide customer service through automated assistance, routing, and information. IVR systems process speech and touch-tone inputs from callers, which helps streamline processes, reduce costs, and improve customer satisfaction.

This technology has become an essential tool across various industries and applications, including customer service, healthcare, banking and financial services, surveys and polling, marketing, and order processing. Using Voice APIs, modern speech-to-text technology, and stacks such as WebRTC, developers can create remarkably efficient IVR systems.

In this blog, we will learn how to build a modern, interactive IVR system.  

Personalized IVR Systems: A key enabler for customer experience and delight

By offering personalized support through IVR systems, businesses can give customers a level of service that meets their specific needs. Customers interact with the business in the way that is most convenient for them, and the system can be tailored to suit different businesses and organizations. Because IVR systems save time and money while maintaining service quality, they are a valuable investment for any business looking to improve its customer experience.

IVR technology enables people to communicate with computer systems using either speech or touch-tone key inputs. Pre-recorded voice messages or text-to-speech technology, coupled with DTMF (dual-tone multi-frequency) input, let callers retrieve and provide information without speaking to a live agent.

With the advancement of automatic speech recognition technology, callers are now able to verbally express their needs over the telephone, and IVR systems respond accordingly in real time. In addition, if the IVR system cannot provide the information that the caller requires, it can direct the caller to the appropriate representative for assistance. 

Key paths to build an interactive voice response system: 


DTMF:

DTMF is a way to send digits over a voice channel, with each key represented by a pair of simultaneous tones. First, the DTMF input from the caller's telephone is captured by the IVR system. A DTMF decoder then converts the tones into the corresponding digits. Next, the digits are interpreted to determine the action to take. Finally, that action is carried out, such as routing the user to a specific department or providing them with information.
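
The decode-and-route steps can be sketched in a few lines of JavaScript. The frequency table is the standard DTMF keypad layout; the menu mapping and the function names `decodeTonePair` and `actionForDigit` are hypothetical and exist only for illustration. In practice a voice platform usually performs the tone decoding for you and delivers the digits directly.

```javascript
// Standard DTMF keypad: each key is the pair (row tone, column tone) in Hz.
const ROW_FREQS = [697, 770, 852, 941];
const COL_FREQS = [1209, 1336, 1477, 1633];
const KEYPAD = [
  ["1", "2", "3", "A"],
  ["4", "5", "6", "B"],
  ["7", "8", "9", "C"],
  ["*", "0", "#", "D"],
];

// Map a detected (low, high) frequency pair to its key, or null if invalid.
function decodeTonePair(lowHz, highHz) {
  const row = ROW_FREQS.indexOf(lowHz);
  const col = COL_FREQS.indexOf(highHz);
  return row === -1 || col === -1 ? null : KEYPAD[row][col];
}

// Hypothetical menu: which action each digit triggers.
const MENU_ACTIONS = { "1": "route:sales", "2": "route:support", "0": "route:agent" };

// Interpret a digit; anything outside the menu replays the menu prompt.
function actionForDigit(digit) {
  return MENU_ACTIONS[digit] || "replay:menu";
}

console.log(decodeTonePair(770, 1336)); // "5"
console.log(actionForDigit(decodeTonePair(697, 1336))); // "route:support"
```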

Speech Recognition: 

Speech recognition is the process of converting spoken words into text. The speech input is captured using a device such as a microphone or voice-enabled remote control. Next, a speech recognition engine is used to convert the spoken words into text. The text is then interpreted, and the action to be taken is determined based on the received input. Finally, the appropriate action is taken, such as routing the user to a specific department or providing them with information. For DTMF input, the IVR system might play a menu prompt that lists available options or actions, while for speech recognition input, the system might summarize the user’s request and confirm that the system is processing their request. 
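
Once the speech recognition engine has returned text, the interpretation step can be as simple as keyword matching. The sketch below assumes a hypothetical intent table and function name (`actionForTranscript`); a production system would more likely use a natural language understanding service, so treat this as an illustration of the flow rather than a recommended design.

```javascript
// Hypothetical intents and the keywords that trigger them.
const INTENTS = [
  { action: "route:billing", keywords: ["bill", "invoice", "payment"] },
  { action: "route:support", keywords: ["problem", "issue", "not working"] },
  { action: "route:agent", keywords: ["agent", "representative", "human"] },
];

// Return the action of the first intent whose keyword appears in the transcript.
// If nothing matches, ask the caller again.
function actionForTranscript(transcript) {
  const text = transcript.toLowerCase();
  const match = INTENTS.find((intent) =>
    intent.keywords.some((keyword) => text.includes(keyword))
  );
  return match ? match.action : "replay:prompt";
}

console.log(actionForTranscript("I have a question about my invoice"));
```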

Benefits of Combining DTMF and Speech Recognition in an IVR System 

The benefits of combining DTMF and speech recognition in an IVR system include: 

  • Improved user experience: Combining DTMF and speech recognition allows users to choose the interaction method that they prefer, which can improve the overall user experience. Some users may prefer to use DTMF, while others may prefer to speak their commands. 
  • Increased accessibility: By offering both DTMF and speech recognition options, IVR systems can be made more accessible to people with disabilities: callers with speech impairments can use the keypad, while callers who have difficulty using a keypad can speak instead.
  • More natural interactions: Speech recognition can provide a more natural and intuitive interaction for users, particularly when dealing with complex menu options or lengthy prompts. 
  • Enhanced functionality: Combining DTMF and speech recognition can enable IVR systems to offer more advanced functionality, such as voice-based authentication or natural language processing. 
  • Improved efficiency: Speech recognition can speed up the interaction process, reducing the time users spend navigating through menus and prompts, and freeing up resources for more complex tasks. 

Build an IVR system with EnableX 

Building, deploying, scaling, and updating modern IVR systems should be straightforward with the right tools. That’s where EnableX voice APIs come in – they empower you to build customized workflows and solutions in a matter of days and make it easy to adapt and iterate as needed for ongoing improvement.  

The process is as straightforward as following these steps: 

  • Log in to the EnableX portal. If you do not have an account, sign up for one first.
  • Create a voice project.
  • Get the app credentials for the project.
  • Create voice prompts for your IVR system, either by recording audio or by using the text-to-speech technology provided by the EnableX API platform.
  • Make sure the prompts are saved in a format that the API platform supports.
  • Buy a phone number from the EnableX inventory.
  • Add the phone number to the voice project.
  • Configure your IVR system to play the appropriate prompts based on user input, either through touch-tone keypad inputs (DTMF) or speech recognition.
  • Configure a webhook URL to receive event callbacks on call state and play state changes.

EnableX provides a suite of features that can be used to create Interactive Voice Response (IVR) systems for personalized support using DTMF and speech recognition. These features include handling incoming voice calls, making outbound voice calls, and processing call events. 

  • Handle incoming calls by processing the events received at the webhook URL, which carry a unique identifier called the voice-id for further API calls and events affecting the same inbound call. 
  • Make outbound calls using the appropriate API endpoint and request parameters.
  • Add prompts and play them via IVR: Once the call is connected, add the prompts you created to the call and play them through the IVR system. This typically involves additional API requests to the platform to add the prompts and to specify how the caller's input should be collected (e.g., via DTMF or speech recognition).

Incoming Call 

When an incoming call is received by the EnableX Voice Service, the system sends a notification to a web address you have configured, containing details such as the caller’s phone number and a unique identifier called the voice-id. You can use this identifier to track and manage the call throughout its lifecycle. You can use DTMF tones or speech recognition to gather information from the caller and customize the response based on their input. 

Webhook event sent by the Voice Server when it receives an inbound call on an EnableX number:

    {
        "voice_id" : "f1aa71c0-8f2a-4fe8-b5ef-9a330454ef58", // voice id of the incoming call
        "state" : "incomingcall",
        "from" : "from number",
        "to" : "EnableX number",
        "channel_id" : "", // channel id of the inbound call
        "timestamp" : "2020-08-20T09:03:09.893Z"
    }

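A receiver for this notification can be sketched with Node's built-in `http` module. The field names (`voice_id`, `state`, `from`, `to`) come from the sample event; everything else is an assumption made for illustration, including the function names, the port in the usage comment, and the `{ received: true }` response body, so check the EnableX documentation for what the platform actually expects back.

```javascript
const http = require("http");

// Parse a raw webhook body and pick out the fields of an incoming-call event.
// Returns null for malformed JSON or for any other event state.
function parseIncomingCall(rawBody) {
  let event;
  try {
    event = JSON.parse(rawBody);
  } catch (err) {
    return null;
  }
  if (!event || event.state !== "incomingcall") return null;
  return { voiceId: event.voice_id, from: event.from, to: event.to };
}

// Build (but do not start) an HTTP server that receives webhook requests
// and hands incoming-call events to the supplied callback.
function createWebhookServer(onIncomingCall) {
  return http.createServer((req, res) => {
    let body = "";
    req.on("data", (chunk) => { body += chunk; });
    req.on("end", () => {
      const call = parseIncomingCall(body);
      if (call) onIncomingCall(call);
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ received: true }));
    });
  });
}

// Usage (hypothetical port):
// createWebhookServer((call) => console.log(call.voiceId)).listen(3000);
```

Keeping the parsing in its own function makes it easy to test without opening a network port.
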
Invoke the Accept Call voice API to answer an incoming call when the Voice Server sends the incoming call event for an inbound call on an EnableX number.

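const axios = require('axios');
const btoa = require('btoa');

// The API base URL is elided here; prefix this path with the base URL from the EnableX documentation.
let URL = "{voice_id}/accept";
URL = URL.replace("{voice_id}", "f1aa71c0-8f2a-4fe8-b5ef-9a330454ef58");
const HttpVerb = "PUT";
const APP_ID = process.env.ENABLEX_APP_ID;
const APP_KEY = process.env.ENABLEX_APP_KEY;
// HTTP Basic auth: base64-encode "APP_ID:APP_KEY".
const authKey = btoa(`${APP_ID}:${APP_KEY}`);
const options = {
    headers: {
      "Authorization": `Basic ${authKey}`, // backticks so ${authKey} is interpolated
      "Content-Type": "application/json"
    }
};

// One way to send the request with axios and log the response body or the error.
axios({ method: HttpVerb, url: URL, ...options })
    .then((response) => console.log(response.data))
    .catch((error) => console.error(error.message));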

Successful and Error response for accept Call API

    {
        "voice_id": "f1aa71c0-8f2a-4fe8-b5ef-9a330454ef58",
        "status": "success",
        "timestamp": "2020-08-20T09:03:09.893Z"
    }

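If you accept calls from several places in your code, it may help to wrap the request details in a small helper. The sketch below is an assumption-laden illustration: `buildAcceptRequest` and `isAccepted` are hypothetical names, `baseUrl` is a placeholder for the voice API base URL from the EnableX documentation, and the path layout (`<baseUrl>/<voice_id>/accept`) is inferred from the sample code rather than confirmed. It uses Node's built-in `Buffer` for the base64 step instead of the `btoa` package.

```javascript
// Build an axios-style request config for accepting a call.
// baseUrl is a placeholder: take the real value from the EnableX documentation.
function buildAcceptRequest(baseUrl, voiceId, appId, appKey) {
  const authKey = Buffer.from(`${appId}:${appKey}`).toString("base64");
  return {
    method: "PUT",
    url: `${baseUrl}/${encodeURIComponent(voiceId)}/accept`,
    headers: {
      Authorization: `Basic ${authKey}`,
      "Content-Type": "application/json",
    },
  };
}

// True when a response body reports success, as in the sample response.
function isAccepted(responseBody) {
  return Boolean(responseBody) && responseBody.status === "success";
}

// Usage with axios (not executed here):
// axios(buildAcceptRequest(BASE_URL, voiceId, APP_ID, APP_KEY))
//   .then((res) => console.log(isAccepted(res.data)));
```
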
Add Prompts 

 Use the EnableX API to add prompts to the IVR system. You can create audio files or text-to-speech messages to prompt customers. 

const data = {
    "name" : "Service Name",
    "from" : "from number",
    "to" : "EnableX number",
    "action_on_connect" : {
      // Text to speech with advanced options
      "play" : {
        "text_to_speech_option" : {
            "text" : "good morning welcome to text to speech demo",
            // text will be played after 2 seconds
            "silence" : {
                "type" : "leading",
                "value" : "2000ms"
            },
            // text to be played after a 2-second break
            "break" : {
                "time" : "2000ms"
            },
            // text will be played in a gentle voice
            "express-as" : {
                "style" : "gentle",
                "text" : "text will be played in gentle voice"
            },
            // "text" : "text played at 30% higher rate and 50% volume"
            "prosody" : {
                "rate" : "+30.00%",
                "pitch" : "high",
                "volume" : "+50.00%"
            },
            // <say-as interpret-as="date" format="mdy">10-19-2016</say-as>
            "say-as" : {
                "interpret-as" : "date",
                "format" : "mdy"
            }
        }
      }
    }
};

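Because the text-to-speech options are deeply nested, a small builder can keep call sites readable. The sketch below reuses only key names that appear in the sample payload (`play`, `text_to_speech_option`, `text`, `silence`, `prosody`); the function name `buildTtsPlay` and its parameters are hypothetical, and it makes no claim about which option combinations the platform accepts.

```javascript
// Build the "play" section of the payload for a text-to-speech prompt.
// leadingSilenceMs and prosody are optional; omitted options are left out entirely.
function buildTtsPlay(text, { leadingSilenceMs, prosody } = {}) {
  const option = { text };
  if (leadingSilenceMs !== undefined) {
    option.silence = { type: "leading", value: `${leadingSilenceMs}ms` };
  }
  if (prosody) {
    option.prosody = { ...prosody };
  }
  return { play: { text_to_speech_option: option } };
}

// Usage: plug the result into the request body under action_on_connect.
const requestBody = {
  name: "Service Name",
  action_on_connect: buildTtsPlay("good morning", { leadingSilenceMs: 2000 }),
};
console.log(JSON.stringify(requestBody, null, 2));
```
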
Receive DTMF and Speech recognition 

Use the EnableX API to recognize the customer’s input, either through DTMF (Dual-tone multi-frequency) tones or through speech recognition. 

    // Event sent when DTMF digits are collected
    {
        "voice_Id": "f1aa71c0-8f2a-4fe8-b5ef-9a330454ef58", // Call_Id
        "state": "dtmfcollected",
        "digit": 12345,
        "timestamp": "2017-02-16T10:52:00Z"
    }

    // Event sent with the speech recognition result
    {
        "voice_id": "6b211e69-c2cc-4a65-99e9-f7a3c6922796",
        "state": "recognized | timeout | unrecognized",
        "from": "CLI number",
        "to": "Destination number",
        "text": "No, I'm not available.",
        "reason": "Recognized",
        "playstate": "speech_recognized",
        "timestamp": "2017-02-16T10:52:00Z",
        "prompt_ref": "welcome-prompt"
    }

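A handler for these two events might decide the next step as sketched below. It relies only on the fields shown in the samples (`state`, `digit`, `playstate`, `text`); the digit menu, the returned step objects, and the function name `nextStep` are hypothetical, and the exact event semantics should be confirmed against the EnableX documentation.

```javascript
// Hypothetical digit menu.
const DIGIT_ROUTES = { "1": "appointments", "2": "prescriptions", "0": "agent" };

// Decide what to do next for a DTMF or speech recognition event.
function nextStep(event) {
  if (event.state === "dtmfcollected") {
    const route = DIGIT_ROUTES[String(event.digit)];
    return route ? { type: "route", target: route } : { type: "reprompt" };
  }
  if (event.playstate === "speech_recognized") {
    if (event.state === "recognized") {
      return { type: "interpret", text: event.text };
    }
    return { type: "reprompt" }; // timeout or unrecognized
  }
  return { type: "ignore" }; // some other event
}

console.log(nextStep({ state: "dtmfcollected", digit: 1 }));
```
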
Play prompt 

After receiving DTMF or speech recognition input, you can play further prompts; which ones depends on the specific use case and application. In general, the system should first play a confirmation prompt to let the user know that their input was received correctly.

const data = {
    "prompt_name": "prompt file name to be played", // Identifier of the prompt as configured on the portal.
    "text": "text to be played", // Text to be played by the TTS. NOTE: Either text or prompt will be played.
    "language": "es-US", // Language to be spoken.
    "voice": "female", // For TTS only. Possible values are male or female.
    "dtmf": true, // If dtmf is true, it will wait for digits.
    "prompt_ref": "welcome_ref", // Optional parameter for a user-specified prompt_ref.
    "asr": true, // IVR via speech recognition enabled.
    "start_timeout": 10, // Initial silence timeout before the user starts speaking.
    "recognizer_timeout": 1 // Wait before the voice server returns the result after silence.
};

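Since either a stored prompt or TTS text is played, it can be useful to enforce that choice before sending the request. The sketch below uses the parameter names from the sample (`prompt_name`, `text`, `language`, `voice`, `dtmf`, `asr`); the function name `buildPlayRequest`, the `en-US` default, and the decision to send `language` and `voice` only with TTS text are assumptions for illustration.

```javascript
// Build a play-prompt request body with exactly one of promptName or text.
function buildPlayRequest({
  promptName,
  text,
  language = "en-US", // assumed default
  voice = "female",
  dtmf = false,
  asr = false,
}) {
  if (Boolean(promptName) === Boolean(text)) {
    throw new Error("Provide exactly one of promptName or text");
  }
  const source = promptName
    ? { prompt_name: promptName }
    : { text, language, voice };
  return { ...source, dtmf, asr };
}

// Confirmation prompt that then waits for digits or speech.
console.log(buildPlayRequest({ text: "You chose billing. Is that right?", dtmf: true, asr: true }));
```
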
To learn more about outbound calls, please see our blog post, Build and Implement Personalised Voice Notification service.

An IVR system with DTMF and speech recognition can provide several benefits to businesses, including faster call resolution, reduced call volumes, personalized support, and improved customer experience. EnableX’s interactive voice response system offers personalized support by combining DTMF and speech recognition, allowing customers to navigate through the system using touch tones or by speaking naturally. 

Applications of interactive voice response 

Interactive Voice Response (IVR) systems are commonly used in a variety of industries and applications. Here are some examples: 

  • Healthcare: IVR systems are used in healthcare to provide patients with appointment reminders, medication reminders, and other health-related information. Patients can interact with the system to confirm or reschedule appointments, or to refill prescriptions. 
  • Customer Service: IVR systems are widely used in customer service to provide callers with automated assistance and routing. Callers can be directed to the appropriate department or agent based on their input or request. 
  • Banking and Financial Services: IVR systems are used in banking and financial services to provide customers with account information, transaction processing, and other banking services. Customers can interact with the system to check account balances, transfer funds, or pay bills. 
  • Surveys and Polling: IVR systems are used for surveys and polling to collect feedback from customers or to gauge public opinion. Callers can interact with the system to answer questions or provide feedback on products, services, or policies. 
  • Marketing: IVR systems are used in marketing to allow customers to request information, enter contests, or make purchases. Callers can interact with the system to request brochures, coupons, or product information, or to enter orders for products. 
  • Order Processing: IVR systems are used for order processing to allow customers to place orders for products or services. Callers can interact with the system to enter order information and payment details to complete a purchase. 

Personalized Support in the healthcare industry with DTMF and Speech Recognition

DTMF and speech recognition can be used to provide personalized support in IVR systems: 

DTMF for account identification: When a patient calls an IVR system for healthcare support, they may be prompted to enter their registration number or other identifying information using their phone’s keypad (DTMF). This allows the system to quickly and accurately identify the patient and access their medical history. 

Speech recognition for natural language commands: Once the patient is identified through DTMF, they may be prompted with a message such as, “How can we assist you today?” The patient can then speak their request or query. For example, the patient might say “I need to refill my prescription,” or “Can you tell me the results of my recent lab test?” The system can use natural language processing to interpret the spoken words and respond accordingly, providing personalized and conversational support.

Customized menu options: IVR systems can also use DTMF and speech recognition to provide customized menu options based on the patient’s medical history. For example, if the patient has diabetes, they might be offered menu options to schedule an appointment with a diabetes specialist or refill their insulin prescription.

Personalized responses: By combining DTMF and speech recognition, IVR systems can give patients responses tailored to them, such as guidance based on the symptoms they describe.
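
The identification and customized-menu ideas above can be sketched as a lookup followed by a menu merge. The patient records, menu texts, and the function name `menuForCaller` are all hypothetical; a real system would query the patient record system instead of an in-memory object.

```javascript
// Hypothetical patient records keyed by the registration number entered via DTMF.
const PATIENTS = {
  "1001": { name: "Alex", conditions: ["diabetes"] },
  "1002": { name: "Sam", conditions: [] },
};

// Options offered to every caller.
const BASE_MENU = ["Book an appointment", "Refill a prescription", "Speak to a nurse"];

// Extra options offered only to callers with a matching condition.
const CONDITION_MENU = {
  diabetes: ["Schedule a visit with a diabetes specialist", "Refill insulin prescription"],
};

// Return the menu for a caller, with condition-specific options first,
// or null when the registration number is unknown.
function menuForCaller(registrationNumber) {
  const patient = PATIENTS[registrationNumber];
  if (!patient) return null;
  const extras = patient.conditions.flatMap((c) => CONDITION_MENU[c] || []);
  return [...extras, ...BASE_MENU];
}

console.log(menuForCaller("1001"));
```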

Best practices for building a healthcare IVR system  

Follow these IVR best practices to give patients the best possible experience when they call you: 

  • Fully integrated IVR system: Healthcare organizations should ensure that their IVR system is fully integrated with their electronic health records (EHR) and other healthcare technologies. This can help ensure that patient information is up-to-date and accurate, and that healthcare providers have access to the information they need to provide high-quality care. 
  • Use a friendly and reassuring voice: Patients may already be feeling anxious or worried about their health, so using a friendly and reassuring voice can help create a positive experience from the start. Consider professional voice-over recording or high-quality text-to-speech technology. 
  • Keep the IVR menu simple and clear: Patients may be calling in with a specific health concern or question, so keeping the menu short and easy to navigate can help them quickly get the information or support they need. Aim to keep the main menu under 30 seconds and provide clear options to direct patients to the appropriate department or resource. 
  • Offer a direct option to connect with a live healthcare professional: Some patients may require direct support from a live healthcare professional, so providing this option upfront in the main menu can help them get the support they need quickly and avoid frustration. 
  • Provide language options: Patients from different backgrounds and cultures may speak different languages, so offering IVR options in multiple languages can help meet their needs and provide a more inclusive experience. Consider adding language options based on the demographics of your patient population. 
  • Provide an estimated wait time: Waiting on hold can be frustrating, but providing an estimated wait time can help patients manage their expectations and be more patient. Consider providing updates on wait times at regular intervals or offering a call back option if the wait time is long. 
  • Use the waiting time to educate patients: While patients are waiting, you can use the opportunity to provide health education or promote additional communication channels, such as email or online chat. This can help patients feel more informed and engaged while they wait. 
  • Choose appropriate hold music: Hold music can set the tone for the patient’s experience, so choosing high-quality and calming music can help create a positive experience. Consider using music that is relevant to healthcare or soothing instrumental music. 
  • Integrate IVR with other channels: An integrated IVR system can help provide a seamless patient experience across multiple channels, such as online chat or email. This can help ensure that patient information and context is carried across channels and reduce the need for patients to repeat themselves. 
  • Regularly review and update the IVR menu: As healthcare information and practices evolve, it’s important to regularly review and update the IVR menu to ensure that patients are receiving accurate and up-to-date information. Removing outdated messaging or updating menu options can help keep the IVR system relevant and useful for patients. 
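
As one illustration of the estimated wait time practice, the sketch below computes a rough estimate and decides whether to announce it or offer a callback. It assumes a simple average-based model (callers ahead multiplied by average handling time, divided by active agents) and a hypothetical five-minute callback threshold; real contact centers often use more refined estimates, and the function names are made up for this example.

```javascript
// Rough wait estimate in seconds, assuming every call takes the average handling time.
function estimateWaitSeconds(callersAhead, avgHandleSeconds, activeAgents) {
  if (activeAgents <= 0) return Infinity;
  return Math.ceil((callersAhead * avgHandleSeconds) / activeAgents);
}

// Return the announcement text, or "offer_callback" when the wait is too long.
function waitAnnouncement(seconds, callbackThresholdSeconds = 300) {
  if (seconds > callbackThresholdSeconds) return "offer_callback";
  const minutes = Math.max(1, Math.round(seconds / 60));
  return `Your estimated wait time is ${minutes} minute${minutes === 1 ? "" : "s"}.`;
}

const wait = estimateWaitSeconds(4, 180, 3); // 240 seconds
console.log(waitAnnouncement(wait)); // "Your estimated wait time is 4 minutes."
```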

To improve their customer experience and increase efficiency, businesses should consider implementing DTMF and speech recognition in their IVR systems. This can help address common challenges such as long wait times and inaccurate call routing, and provide personalized support tailored to the customer’s needs and preferences. By doing so, businesses can improve customer satisfaction and build stronger relationships with their customers. 

For more detailed information, read our documentation
