Since its introduction in 2011, WebRTC has become an indispensable part of real-time communication in many enterprise applications. It is a free, open-source technology that enables peer-to-peer communication between browsers and mobile applications. Through its APIs, developers can embed messaging, voice and even video calls directly into their applications. Users can then communicate from within their primary web interface without specialized hardware or complicated plug-ins.
The need for multiparty calls
By itself, WebRTC provides only peer-to-peer communication between browsers. To enable multiparty calls, an intermediate server is required to receive and forward the media. Of the many topologies that can be used for this purpose, MCU (Multipoint Control Unit) and SFU (Selective Forwarding Unit) are the two most widely used by vendors.
MCU is a mixing topology built around a central mixing server (the MCU). Each participant exchanges a single one-to-one stream with this central element, which receives and mixes all incoming audio and video streams and sends one composite stream out to every participant. This requires a relatively high infrastructure cost. Additionally, because mixing requires decoding and re-encoding, it introduces extra delay and some loss of quality.
SFU, on the other hand, is based on a routing topology and can send multiple streams to each participant. Each participant sends its media to a central server (the SFU) and simultaneously receives the media of all the other participants through it. An SFU avoids the mixing-related challenges of an MCU while giving end users the flexibility to control the individual streams they receive from everyone else in the session.
The bigger question is this: if all RTC platforms use WebRTC as their backbone, where does the difference lie? Several decision points need to be considered when choosing a service provider or developing your own solution:
1. Bandwidth Adaptability: the use of Simulcast
Multiparty conferencing raises bandwidth challenges that need addressing. For example, in a 10-party HD video call, each participant needs a downlink of roughly 10.8 Mbps (9 incoming video streams x 1.2 Mbps per 720p stream).
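The downlink estimate is simple multiplication, and a small helper makes it easy to try other call sizes. This is a minimal sketch: the function name `requiredDownlinkKbps` is made up for illustration, and it assumes every remote participant sends exactly one video stream at the same bitrate, with audio and protocol overhead ignored.

```typescript
// Rough downlink estimate for one participant in an SFU-routed call.
// Assumption: each remote participant contributes one video stream
// at the same bitrate; audio and protocol overhead are ignored.
function requiredDownlinkKbps(participants: number, perStreamKbps: number): number {
  if (participants < 1) {
    throw new RangeError("participants must be at least 1");
  }
  // A participant does not receive its own video back.
  const remoteStreams = participants - 1;
  return remoteStreams * perStreamKbps;
}

// 10-party call, 1.2 Mbps (1200 kbps) per 720p stream.
const downlinkKbps = requiredDownlinkKbps(10, 1200);
console.log(`${downlinkKbps / 1000} Mbps`); // prints "10.8 Mbps"
```

Working in whole kbps keeps the arithmetic exact; the result is only converted to Mbps for display.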
Simulcast makes video encoding and forwarding more flexible: the sender generates several versions of the same stream at different resolutions, and the SFU decides which version to forward to each user depending on that user's available bandwidth. Hence, it is critical to choose a service provider with simulcast capabilities to avoid an unnecessary surge in bandwidth requirements.
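The forwarding decision described here can be sketched as a small selection function. Everything in this example is illustrative rather than any vendor's actual logic: the `SimulcastLayer` shape, the three-layer ladder with its bitrates, and the even split of the receiver's downlink across remote streams are all assumptions.

```typescript
// Hypothetical description of one simulcast layer (names are illustrative).
interface SimulcastLayer {
  rid: string;         // layer identifier, e.g. "q", "h", "f"
  height: number;      // vertical resolution in pixels
  bitrateKbps: number; // approximate bitrate the layer needs
}

// Example ladder; the bitrates are assumed values, not measurements.
const layers: SimulcastLayer[] = [
  { rid: "q", height: 180, bitrateKbps: 150 },
  { rid: "h", height: 360, bitrateKbps: 500 },
  { rid: "f", height: 720, bitrateKbps: 1200 },
];

// Pick the highest-bitrate layer that fits the budget.
// Falls back to the lowest layer when nothing fits.
function pickLayer(available: SimulcastLayer[], budgetKbps: number): SimulcastLayer {
  const sorted = [...available].sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  let chosen: SimulcastLayer | undefined;
  for (const layer of sorted) {
    if (chosen === undefined || layer.bitrateKbps <= budgetKbps) {
      chosen = layer;
    }
  }
  if (chosen === undefined) {
    throw new Error("at least one simulcast layer is required");
  }
  return chosen;
}

// Assumption: the receiver's downlink is split evenly across remote streams.
function layerForReceiver(downlinkKbps: number, remoteStreams: number): SimulcastLayer {
  const perStreamBudget = Math.floor(downlinkKbps / Math.max(remoteStreams, 1));
  return pickLayer(layers, perStreamBudget);
}

console.log(layerForReceiver(12000, 9).rid); // budget 1333 kbps per stream: "f"
console.log(layerForReceiver(5000, 9).rid);  // budget 555 kbps per stream: "h"
console.log(layerForReceiver(1000, 9).rid);  // budget 111 kbps per stream: falls back to "q"
```

In a browser, the sending side would typically request such layers through the `sendEncodings` option of `RTCPeerConnection.addTransceiver`; that part is left out here because it needs a live browser and camera. A real SFU would also react to changing bandwidth estimates and to which participant is on screen, rather than using a fixed even split.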
2. Single or Multiple RTCPeerConnection
Let us understand this with an example. Suppose there are 4 participants in a video call, and each participant is sending one video stream and receiving 3 from the other participants. These streams go to the SFU server, which decides how to forward them. Is it better to create 3 RTCPeerConnection objects for each participant or to carry all the streams in a single object? Both approaches have advantages and disadvantages. A single object means less network overhead, but each time a user enters or leaves the conference, the whole session needs to be renegotiated for all participants. A multi-object solution offers more flexibility, but the network incurs additional overhead.
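To make the trade-off concrete, the connection counts for both strategies can be tallied with a short helper. This is only a sketch under stated assumptions: `planConnections` and its types are hypothetical names, and the per-stream strategy follows the count used in the example above (one RTCPeerConnection per remote participant, with the outgoing stream assumed to share one of them).

```typescript
type Strategy = "single" | "per-stream";

interface ConnectionPlan {
  connectionsPerParticipant: number;
  connectionsAtSfu: number;
  // What each existing participant has to do when someone joins.
  onJoin: "renegotiate-existing" | "open-new-connection";
}

// Assumption: in "per-stream" mode each participant holds one connection
// per remote participant, matching the count of 3 in the 4-party example.
function planConnections(participants: number, strategy: Strategy): ConnectionPlan {
  if (participants < 2) {
    throw new RangeError("a call needs at least 2 participants");
  }
  const perParticipant = strategy === "single" ? 1 : participants - 1;
  return {
    connectionsPerParticipant: perParticipant,
    connectionsAtSfu: perParticipant * participants,
    onJoin: strategy === "single" ? "renegotiate-existing" : "open-new-connection",
  };
}

console.log(planConnections(4, "single"));
// connectionsPerParticipant: 1, connectionsAtSfu: 4, onJoin: "renegotiate-existing"
console.log(planConnections(4, "per-stream"));
// connectionsPerParticipant: 3, connectionsAtSfu: 12, onJoin: "open-new-connection"
```

Each extra connection carries its own connection setup and transport, which is where the overhead of the multi-object approach comes from; the single-object approach trades that for a renegotiation of the shared session whenever membership changes.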
3. Communication with legacy users who are on PSTN/VoIP
Some users will still be on legacy communication endpoints such as SIP or PSTN, so a browser-based technology like WebRTC must be able to interoperate with external communication devices such as telephones. WebRTC does not define a signalling standard, so there is no built-in way to establish a connection in such a scenario. What is required is a WebRTC-SIP gateway, built either in-house or provided by an external service provider.
This, again, is a checkpoint when evaluating a solution or vendor against your on-the-ground requirements.
4. How flexible is the solution?
Whether it is the layout in which participants receive streams or the ability to integrate with external or existing solutions, a plain WebRTC solution proves to be somewhat rigid on both counts. As the number of participants or the demand for contextual communication increases, users might find WebRTC limiting when it is used as a standalone solution.
5. Compatibility with Future Technology
The need for a more efficient and optimized solution never gets old. This is why many vendors today are investing in technologies like VR and AI to enhance their offerings and make them as dynamic as possible. They are working towards smarter real-time communication tools and adding analytics to these capabilities via ML and NLP.
WebRTC on its own might not support these enhancements and might need an additional platform to keep pace with evolving technologies. The same goes for adapting to the latest browser features and to unforeseen conditions such as fluctuating bit rates or packet loss.
Clearly, a stand-alone WebRTC application helps only at a very basic level and needs enhancements to overcome the challenges described above. Hence, depending on your use case, industry and extent of use, it is important to choose the right partner for your firm. The solutions and enhancements provided by your vendor should match your requirements. Here’s how the EnableX platform helps you build a better communication platform.