Sunday, November 05, 2006

Media relay: the reTURN

I am coming back to the subject of using media relay proxies for hard-to-solve cases of NAT (Network Address Translation) traversal. From the ensuing discussions, both on the JSF mailing lists and through private mail, I have gathered two points which in my opinion illustrate perfectly why protocol definition requires keeping an open mind at all times. The first opinion was expressed as

I think we can both agree that relay is bad; it adds latency and requires server resources.

and the second as

Since you're already talking to a STUN server for ICE, it makes sense to get your media candidates from the STUN server (i.e., via TURN).

In the first case, the author is using what the list of "Fallacious Arguments" describes as a mix of Argument by Question and Changing the Subject to displace the problem. In his introduction to TURN, Jonathan Rosenberg expresses the fact that a media relay function comes at a cost:

Though a relayed address is highly likely to work when corresponding with a peer, it comes at high cost to the provider of the relay service. As a consequence, relayed transport addresses should only be used as a last resort.

But he does it in context, as someone writing a protocol enabling a media relay function. He does not question the validity of doing media relay; he merely points out that it is expensive, and he goes on to explain how, in his view, the protocol must be architected to allow the feature. We may have different opinions on how to tackle certain issues, but I respect his professional approach to writing protocols: he does not take a subjective view and question the legitimacy of using media relay; instead he proposes a solution for how to do it. He does not presuppose a single usage of the technology; instead he admits that a use case may exist. He does not displace the problem to avoid it; he simply tackles it.

In the original discussion that led to the first point, the ICE negotiation used by Google in its version of Jingle was said to be successful in 92% of NAT traversal cases without the assistance of media relaying. Although this is a tremendous achievement, it still leaves out 8% of edge cases where this technique is ineffective. From Google's perspective, this may look sufficient to serve its own user population. But from a protocol perspective, it makes it necessary to describe how the protocol must be extended to cater for these other cases. And, following the concept of Argument from Authority, I will quote Jonathan Rosenberg again:

… if a client is behind a NAT whose mapping behavior is address or address and port dependent (sometimes called "bad" NATs), the reflexive transport address will not be usable for communicating with a peer.
The only way to obtain a transport address that can be used for corresponding with a peer through such a NAT is to make use of a relay.
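
To make the rule in this quotation concrete, here is a minimal sketch in Python. The names `NatMapping`, `reflexive_usable` and `choose_address` are hypothetical and come from neither the TURN nor the ICE draft. ICE itself finds a working path through connectivity checks rather than by classifying the NAT up front; the classification is used here only for illustration.

```python
from enum import Enum
from typing import Optional, Tuple

Address = Tuple[str, int]  # (IP, port)


class NatMapping(Enum):
    """NAT mapping behaviors named in the quotation (hypothetical enum)."""
    ENDPOINT_INDEPENDENT = "endpoint-independent"
    ADDRESS_DEPENDENT = "address-dependent"
    ADDRESS_AND_PORT_DEPENDENT = "address-and-port-dependent"


def reflexive_usable(mapping: NatMapping) -> bool:
    # Per the quotation, a reflexive address is not usable when the
    # mapping depends on the destination address (or address and port).
    return mapping is NatMapping.ENDPOINT_INDEPENDENT


def choose_address(mapping: NatMapping,
                   reflexive: Optional[Address],
                   relayed: Optional[Address]) -> Optional[Address]:
    """Prefer the reflexive address; use the relay only as a last resort."""
    if reflexive is not None and reflexive_usable(mapping):
        return reflexive
    return relayed  # may be None if no relay was allocated
```

With a "bad" NAT and no relayed address, `choose_address` returns `None`: those are the cases that only a relay can serve.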

I believe that at this point we have established media relaying as an integral part of the arsenal available to developers for NAT traversal. Moving on to the second opinion, it is somewhat reminiscent of Causal Reductionism. Behind the very respectable attitude of not reinventing the wheel lies a bias toward a particular solution, because the author does not consider the particular context in which this solution was developed.

If we consider the context in which TURN was designed, I can safely claim that it was strongly influenced by the SIP standardization effort. In essence, TURN was created to address a very real shortcoming in the signaling protocol: it has the same NAT traversal issues as the media streams, because SIP allows the use of UDP as its own transport. This consequently implies that media relaying can only be discovered and negotiated on the same transport as the media itself. In comparison, Jingle does not experience the same shortcomings as SIP, as XMPP is vastly superior at dealing with NAT and firewall traversal. As a result, unlike SIP, Jingle can be leveraged to provide the same answer as TURN to the question: what relaying IP:port address should my client use as a last resort?

A careful study of the TURN draft exposes how the shortcoming of the signaling layer (SIP) forces a large part of the signaling to be reincorporated inside the media transport. The solution proposed in the TURN draft also forces the TURN server to authenticate the streams, and as a consequence to have access to users' authentication data. Knowing that a TURN server has to be exposed on the public Internet, I can only imagine how corporations will react to this requirement.

On the other hand, an XMPP extension to query and allocate an IP:port address on a media relay presents at least two advantages:

  • It would benefit from the inherent trust established when the user logs in to their home XMPP server.
  • It would greatly simplify the media relay server, as the implementation of TURN is complex.
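
As an illustration of what such a query could look like, here is a minimal sketch in Python using only the standard library. No such extension is defined here: the namespace `urn:example:media-relay:0`, the `allocate` element and its `transport`, `host` and `port` attributes are all assumptions made for the example, as are the two function names.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Hypothetical namespace: an assumption for this sketch, not a real extension.
RELAY_NS = "urn:example:media-relay:0"


def build_allocate_request(iq_id: str, relay_jid: str,
                           transport: str = "udp") -> str:
    """Build an <iq type='set'/> asking the relay to allocate an IP:port."""
    iq = ET.Element("iq", {"type": "set", "id": iq_id, "to": relay_jid})
    ET.SubElement(iq, "{%s}allocate" % RELAY_NS, {"transport": transport})
    return ET.tostring(iq, encoding="unicode")


def parse_allocate_result(xml_text: str) -> Tuple[str, int]:
    """Extract the relayed (host, port) from the relay's <iq type='result'/>."""
    iq = ET.fromstring(xml_text)
    if iq.get("type") != "result":
        raise ValueError("relay allocation was refused")
    alloc = iq.find("{%s}allocate" % RELAY_NS)
    if alloc is None or alloc.get("host") is None or alloc.get("port") is None:
        raise ValueError("result carries no relayed address")
    return alloc.get("host"), int(alloc.get("port"))
```

Because the stanza travels over the session the client has already authenticated with its home server, the sketch carries no credentials of its own, which is the first advantage listed above.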

To conclude, I believe that the simplest solutions are always better at solving technical issues. In this particular case, the complexity of a TURN solution is not required to provide the expected service. A simple XMPP query can be devised to provide the same result.

Furthermore, I am certain that Google offering a VoIP solution in its GTalk service was the result of a thoughtful business process. But at the same time, Google's hands-off approach to motivating its developers has left the implementation driven only by a technical viewpoint. When specifying the Jingle protocol, a problem arises if the same developers are also co-authors of the Jingle specification: they remain framed within the Google context of a consumer-oriented service and do not seem to grasp the wider complexity of the enterprise. But it takes time to be able to recognise the difference between enabling solutions and building frameworks…
