Thursday, June 29, 2006

Left to the implementation

Having looked at various experiments in the field of digital identity, I recently decided to use an interesting contact filtering feature offered by certain i-brokers. This service lets through only duly identified communications addressed to you from a web page link. As I was not interested in joining any particular community, I decided to create an i-name on a generic i-broker. The registration process went almost well until, after I gave my payment details, the system spat out an “application error (Rail)” page after briefly showing a receipt page with a transaction number… So much for my financial records.

Not stopping at this little detail, I logged into the i-broker using my newly acquired i-name in order to set up the contact service. At this point the system warned me that the i-broker “is not quite ready” and that as a consequence they “plan to have the new Contact Service up and running very shortly”. I would have appreciated being given this information as a note on the registration page. Such a notice might have led me to postpone the registration.

That made two disappointments in a row. The i-broker service had been through a beta period not long ago, so I thought it worth sharing my experience with the i-broker team. I happily clicked on the “contact us” link, which brought me to a “this page does not exist” warning… Interesting navigation. At that point I went back to the previous i-broker beta site, where the contact page was available. Obviously, the form to contact the technical team was using the “contact filtering” service which was available in beta. The form provided an option to use my newly registered i-name. I duly reported my ordeal and submitted the form. At this moment the system replied that the i-name I had just registered and logged in with was unknown… I was under the impression that the major advantage of i-names was that, by using XRI, they are resolvable everywhere.

Beyond the early disappointment created by this experience, I believe there are a few lessons to be learned. When deploying a public-facing application, I believe it is paramount to provide the customer with:

  • An easy and enjoyable user experience,
  • A clear indication of the features available at this moment,
  • A robust application framework rather than the latest hyped tools,
  • A working demonstration of the services provided.

Failing to demonstrate these simple ingredients of any web application is, in my opinion, one of the major causes of service disappearance and bad reputation. I often read in standards or specifications that “it is all left to the implementation”. This is exactly the point, and the responsibility of the implementers. I personally would not like to see i-brokers disappear so soon.


Wednesday, June 28, 2006

Roster remoting

But how does Romeo know “a2ff356” is in fact Juliet? That was the offline comment to my post about XMPP transport addresses. Obviously, the built-in “name” attribute that is part of the roster management syntax can be used to initialize this information. It could hold Juliet’s address handle in the external network, or a nickname if the network supports both addresses and nicknames. But none of the current transport implementations allows these parameters to always be forced to their proper values, or to be reset on demand.

Nonetheless, I am toying on this blog with practical ideas and suggestions that can be picked up and built into solutions to enhance the XMPP protocol's functionality. From the discussion thread that followed this question emerged a simple concept: why not let the transport manage its own part of a user’s roster?

Current XMPP server implementations have a very centralized way of managing a user’s roster. This architectural design has led to many protocol extensions to accommodate a transport’s role as an external user proxy. In effect, a transport “impersonates” an XMPP user in a different communication network. It provides the appropriate communication and presence translations, as well as an address handle mapping between the two communication spaces. The main requirement for using a transport is possessing an address handle in the target communication space. An XMPP transport to Gadu-Gadu would require the XMPP user to use a Gadu-Gadu ID in order to be recognised as a proper Gadu-Gadu user. If the XMPP user has used the other communication network before moving to XMPP, that user will almost certainly want to carry over the previous contact list. If the XMPP user just wants to use the transport to communicate with a different network, that user will certainly want to add new external contacts to the roster.

…what if the transport itself becomes responsible for managing its part of the user's roster?

Every transport to date has been designed around the centralized roster concept. Earlier on, a transport silently updated the XMPP user’s roster at registration time. More recently, the XMPP protocol has been augmented to cater for the kind of bulk roster addition a transport is likely to produce. Now, imagine for a moment that the transport itself becomes responsible for managing its part of the user’s roster. In effect, the management of MSN users would be done by the MSN transport, the roster of Yahoo! users would be handled by the Yahoo! transport, and so on. In this architecture, the transport is responsible for providing the answer to the initial question. The answer would be easily carried in the name attribute of the roster result. The transport could offer administrative options to allow or prevent users from changing the external network nickname, and so maintain tight coherence between the XMPP and the external contact lists.

In addition, delegating the roster management to the transport would also decrease the overall traffic between the home XMPP server and the transport, providing a specialized XMPP compliant distribution list for presence. An implementation allowing “roster delegation” would have to enforce an adequate level of trust between the transport and the user home server. This trust could result from two different contexts:

  • If the transport is a component of an XMPP server, then the trust would result from the configuration of the server and the component.
  • If the transport is a remote service provided by a different server, then the trust would be established through Mutual TLS authentication between the transport and the home server.

In the created trust domain, the home server would be able to decide to delegate the roster management to the transport. More generally, it is possible to extend this concept to any service providing contact list management of some sort. This would extend XMPP using the traditional “simple client, complex server” approach. From a client perspective, nothing has changed. The client issues a roster request to its home server and receives a result. The way the roster is managed is entirely dependent on the server implementation. It is the server’s responsibility to discover “roster delegation” support and to aggregate the various roster parts into a complete roster result. The decision to use delegated roster management can also be coupled with the level of trust provided by the remote service. This would allow a server to opt out of using “roster delegation” if the target system does not comply with a defined trust level, or reputation, even though it provides the delegation feature.
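
To make the aggregation step a little more concrete, here is a minimal Python sketch of what a home server could do when a client asks for its roster. Everything in it is an assumption made for illustration: the names RosterItem and aggregate_roster, the idea of representing each delegate as a callable, and the trust check are hypothetical and do not come from any XMPP specification or existing server.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RosterItem:
    jid: str   # routable address, e.g. "a2ff356@transport.montague.net"
    name: str  # human-readable handle supplied by whoever manages this part


# Hypothetical: a roster part is any callable returning the items it manages
# for a given user (the home server's own store, or a remote transport).
RosterPart = Callable[[str], List[RosterItem]]


def aggregate_roster(user: str,
                     local_part: RosterPart,
                     delegates: Dict[str, RosterPart],
                     trusted: Callable[[str], bool]) -> List[RosterItem]:
    """Merge the home server's own roster part with those of trusted delegates."""
    roster = list(local_part(user))
    for domain, fetch in delegates.items():
        if not trusted(domain):
            # Opt out: the service offers delegation but does not meet
            # the trust level required by the home server.
            continue
        # Accept only items whose address belongs to the delegate's own domain.
        roster.extend(item for item in fetch(user)
                      if item.jid.endswith("@" + domain))
    return roster
```

The client still sees a single roster result; which parts came from a transport is purely a server-side matter, and the name attribute of a delegated item is where the transport answers Romeo's question.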


Monday, June 26, 2006

Addresses are for routing

I was reminded by a reader's comment that address handles are still given properties beyond their only designed role: addresses are for routing. That is, routing only. Any attempt at loading an address handle with additional meaning only creates confusion. Address handles are not identity tokens. Address handles must not provide application logic. This bad practice is the best way to create lock-ins and decrease applications' scalability and extensibility. An address handle must be used in a single context: to indicate the destination of a communication.

This foreword done, the comment originated from the need for proper conversion rules when translating addresses from one communication space into another. The example was taken from the current XMPP transport practices, where the non-XMPP address handles are encoded as the node part of a JID. I believe this practice is wrong and does not possess any good technical grounding.

In an XMPP addressing space, a component such as a transport will have a JID comprised of a domain part only. Let's say "transport.montague.net". If a user on the XMPP side of the transport has contacts on the legacy side, the common practice is to apply an encoding logic to the legacy address to build the node part of a JID representing that contact in the XMPP world. The result might look like "juliet%capulet_it@transport.montague.net". The issue with this conversion lies in the resulting urge of many developers to use this "logical" encoding of the node to derive a meaning, in a client for example. If the contact JID had been "a2ff356@transport.montague.net", the programmer would never have used the node for anything but its role. The opaque node keeps all the routing properties of a good address handle, which in this case requires that a stanza using this JID be routed to the "transport.montague.net" service. It is also easy to see that this approach is completely independent of the target legacy system.

Introducing a logic in the node encoding of early transports has

  • induced developers to reverse this logic inside their code, creating a de facto legacy inside XMPP client and transport implementations,
  • imposed different encodings over time, because the target legacy systems have different address syntaxes.

Lastly, I believe there is no good reason to use a logical encoding of the JID's node for legacy contacts. Finding the contact's address can be achieved by looking up the opaque node value in a cached table to obtain the legacy address. In terms of performance, I believe the time difference between a lookup and a decoding does not count for much when compared to the actual transport wire transmission overhead.
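
As an illustration of the lookup approach, here is a small Python sketch of the kind of table a transport could keep. The class name OpaqueAddressMap and its methods are hypothetical, and generating the node from random bytes is only one possible choice, assumed here for the example; a real transport would also persist the table.

```python
import secrets
from typing import Dict


class OpaqueAddressMap:
    """Two-way table between legacy addresses and opaque JID nodes."""

    def __init__(self, transport_domain: str) -> None:
        self.transport_domain = transport_domain
        self._node_by_legacy: Dict[str, str] = {}
        self._legacy_by_node: Dict[str, str] = {}

    def jid_for(self, legacy_address: str) -> str:
        """Return the JID for a legacy contact, allocating a node on first use."""
        node = self._node_by_legacy.get(legacy_address)
        if node is None:
            node = secrets.token_hex(4)  # 8 hex characters, no embedded meaning
            while node in self._legacy_by_node:  # avoid the unlikely collision
                node = secrets.token_hex(4)
            self._node_by_legacy[legacy_address] = node
            self._legacy_by_node[node] = legacy_address
        return f"{node}@{self.transport_domain}"

    def legacy_for(self, jid: str) -> str:
        """Look up the legacy address behind a JID issued by this transport."""
        node, _, domain = jid.partition("@")
        if domain != self.transport_domain:
            raise KeyError(jid)
        return self._legacy_by_node[node]
```

Nothing in the node can be decoded by a client, so the only thing a programmer can do with such a JID is route to it. The same table works unchanged whatever the address syntax of the legacy network.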


Sunday, June 25, 2006

Usable interoperability

The IEEE directory describes inter-operability as

The ability of two or more systems or components to exchange information and to use the information that has been exchanged.

Well, if we take that definition into the human communication space, we could say that speech allows a good level of “protocol”-level inter-operability. When we add the knowledge of a language, then we can exchange and hopefully make use of well-formatted (syntax) and meaningful (semantics) messages. What is, in my opinion, a little flawed in the IEEE definition is the part about usage of the received information. I believe inter-operability does not have a meaning outside the scope of a particular application domain.

In my previous post, I stated why I do not find great interest in the SIMPLE/XMPP “inter-operability” draft proposal. To reinforce my previous position, I would say the draft describes proper message syntax, even a beginning of shared semantics between the two protocols, but completely fails to put the inter-operability in context. Without the intimate knowledge of a shared human language, one can receive a perfectly valid speech flow without being able to use it. It is a bit like listening to an opera where a Spanish tenor and an Italian diva sing in Russian. They exchange perfect syntax and semantics but with a limited use, mainly providing marks for the other to respond. Don’t you think we may be missing, like the spectator of the opera, a great part of the meaning?


Saturday, June 24, 2006

Unnecessary "interoperability" drafting

After some time out of the communication realm, I will comment on the presentation of an Internet-Draft that defines how to enable basic interoperability between SIP/SIMPLE and XMPP.

It is funny how interoperability and federation between different communication media are always reduced to "mapping". This is once again exemplified by the draft proposal trying to explain how to achieve this hypothetical interoperability. From the architectural assumption, we already get the feeling this is too good to be true. Being amongst the very few people to have implemented a federation between these two protocols, and probably the only one to have designed native SIMPLE connectivity on an XMPP server, I can tell you this is slightly more complex than the draft describes…

The draft goes to great lengths to describe how to convert between a SIP URI and an XMPP JID. This is excellent and accurate work, and its indisputable merit lies in providing in one single document a thorough description of each address's syntax. But in real life, what are the chances that the addresses in these two communication spaces require direct translation? Not very high. Why would the syntactical similarity between a SIP URI and an XMPP JID provide a guarantee that the same address must be used? After all, we do not translate JIDs directly to email addresses, nor SIP URIs either. We do not translate AIM screen names to SIP URIs, nor to JIDs. In the real world, we perform a lookup of one communication space's address in the other... We use directories to achieve the "mapping", rarely direct translation. The only case where this would be used is when SIP/SIMPLE user agents connect directly to an XMPP server, which is somewhat uncommon.

Where the draft is not up to the task is when trying to map the long-term presence subscriptions of XMPP to the short-term subscriptions of SIP. It is all fine when there are no existing subscriptions from either side. In this case, translating an XMPP subscription into a SIP SUBSCRIBE would provide the desired effect. But only the first time. The next time, the permanent XMPP subscription will trigger a presence stanza instead of a subscription stanza. Nothing in the draft addresses how to handle this difference. In SIP we have a well-delimited publish-subscribe behaviour. In XMPP we have a variable context behaviour: before and after subscription.

The draft indicates that the XMPP subscription could be terminated by the SIP subscription. But this is not realistic. From an XMPP user's standpoint this would mean re-subscribing to the contact on every login. Furthermore, it would force the XMPP user to accept a SIP contact's subscription each time that contact logs in.

Working the other way round is easier, as it is always possible to translate a SIP SUBSCRIBE into an XMPP subscription. If no subscription exists for the SIP contact, the subscription stanza will be sent to the XMPP user's client if and only if the user is online. If the user is offline, the subscription will be stored by the XMPP server, awaiting the user's acceptance. So the gateway will have to deal with this case, which is not described in the draft. Similarly, the draft does not handle the case where the XMPP user refuses the SIP contact's subscription. In SIP, a refused subscription translates into an error response. The gateway would have to be aware of the presence state of the XMPP user to take the right decision. This asynchronous handling does not easily "map" to the request/response nature of SIP. Following the proposed scenario, the gateway would have to maintain a copy of every user's presence. This is both inefficient and non-scalable. And again, not very realistic.

As said above, SIP presence is based on short lived publish-subscribe, initiated by the requesting user agent. This is the opposite of XMPP, where the server acts as a proxy for the client user agent. SIP relies on expiration to cancel subscriptions. XMPP sessions do not time out. SIP is connectionless, whereas XMPP relies on the client connection state to deduce the end of a client session. All these fundamental differences are not highlighted in the draft, and as a consequence the cases described are both superficial and incomplete.
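
The mismatch can be shown with a minimal Python sketch of the state a gateway would end up holding. This is not taken from the draft: the GatewayState class, its method names and the timing values are assumptions made for illustration, and the pending and refused subscription cases discussed above are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (subscriber address, contact address)


@dataclass
class GatewayState:
    # XMPP side: a subscription, once made, has no expiry.
    xmpp_subscriptions: Set[Pair] = field(default_factory=set)
    # SIP side: every subscription carries an absolute expiry time.
    sip_expiry: Dict[Pair, float] = field(default_factory=dict)

    def on_sip_subscribe(self, pair: Pair, now: float, expires: float) -> bool:
        """Record or refresh a SIP subscription.

        Returns True when no XMPP subscription existed yet, so a subscription
        request is needed; False when only current presence is needed.
        """
        self.sip_expiry[pair] = now + expires
        is_new = pair not in self.xmpp_subscriptions
        self.xmpp_subscriptions.add(pair)
        return is_new

    def expire_sip(self, now: float) -> List[Pair]:
        """Drop SIP subscriptions past their expiry; the XMPP side is untouched."""
        expired = [p for p, t in self.sip_expiry.items() if t <= now]
        for p in expired:
            del self.sip_expiry[p]
        return expired
```

Even in this toy form the gateway has to keep two different lifetimes for the same relationship, and decide on every SIP request which of two behaviours applies. That is state and logic on top of message translation, which is the part a pure "mapping" leaves out.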

In the end, I find no real value in issuing a document whose only complete description is of the difference between XMPP and SIP address handle representations. It is also detrimental to reduce interoperability between the two protocols to a simple translation between message types. None of the real issues of application interoperability are addressed in this document, which thus loses all its substance and, as a consequence, its necessity. Beyond having “the longest title in IETF history”, I believe this draft would be better off retracted. This energy might have a better result applied to the Jingle specification…


Water does not have an history

As human beings we place absolute on the scale of our desires… We aspire to absolute references, because we desire comfort and satisfaction, "we want things in our life that don't change". In our relationships we try to relate to others through the best absolute reference we can. Through naming.

We humans name things because speech is our principal means of communication. Other species rely on smell or sound to provide a reference; we use names. And naturally we will use these names to identify other persons.

Now that we have extended this concept into the digital world, we tend to continue using names to provide identity references. Absolute references. And we are making a mistake.

Octavian ended the late Roman Republic and created the early Roman Empire under the name of Augustus. Popes change names when they are elected. One changes name to "begin a new life". In every case, different names refer to the same individual. Octavian and Augustus are the same person. Apart from the color of his clothes, the pope is the same person before and after his election. Only the reference has changed, as the context in which the name is used has changed. And these references point to the same identity. The name is used to fix a reference in a particular context at a particular point in time. It is bound to evolve over time.

Translating this into the digital world, mapping identity to globally unique identifiers becomes a utopia. It simply denies the possibility for the reference to change. In the real world, even the 'baptismal' naming we receive at birth is a contextual, relative reference. The probability of it changing is lower than for other identity attributes in our life, but it may vary. Furthermore, this reference is only relevant in the context of using speech as a means of communication. If the communication changes, the way we reference an identity also changes. When we see an acquaintance on the other side of the street, we are able to identify the person even without remembering his or her name. Hearing someone's voice over the phone gives us a different way to identify the person. Other context, other reference.

In the end we have to be careful, when drafting digital systems dealing with identity, not to let our aspiration for comfort reduce the multiple facets of identity to a minimum incompatible with the complexity of its expression. Naming an identity in today's digital world comes from the way the technology works. We are just trying to fit the foot to the shoe here. This is a huge scope reduction!

As explained by John Burgess, names can convey meaning as well as fix references:

… given the truth of Avogadro's view that water is the compound H2O, it could not have been anything else. A world where a substance of a different chemical formula filled the lakes and rivers would be a world where something other than water filled the lakes and rivers.

But in comparison with persons' identity, water does not have an history.


Friday, June 23, 2006

A vulgus pecum SLA

…Identity control by the end user includes expressing the risks incurred by its misuse

The recent public release of Google Spreadsheets has brought to the surface worries and concerns about corporate data protection. In short, with this application, Google will be opening yet another hole in the already permeable membrane protecting corporate data.

Some may wonder if Google offers a service contract to protect data with Google Spreadsheets… So far, the agreement is limited to a software license agreement that disavows liability. No doubt there will soon be an option for corporate users who want more reassuring terms, based on policy and regulatory requirements. This is usually referred to as a service level agreement. In the business world, this document is used by an organisation to express its control objectives; without it, the organisation risks losing control to a service provider. When the providers understand what is expected of them and agree to it, then and only then does it become appropriate to allow them to hold and manage sensitive data.

The important aspect of this regulation of data handling comes from the origin of the SLA: the organisation is the one expressing the required level of control and the liabilities caused by a loss of control. And everyone agrees to consider this to be “best practice”. Then why on earth couldn’t we have the same for our individual identity data? Empowering the end user and giving him the ability to “regain control and express incurred damages” over his identity would also mean giving the user the possibility to define the terms of his expected service level agreement. Identity, its associated data and its perception are entirely contextual. Only the identity’s owner is able to assess and express the real risks incurred by a misuse of this information. In effect, it is the service provider that should have to abide by the “vulgus pecum” SLAs! Not the end user agreeing to the provider’s often flawed and limited “privacy” policy. But this would be in an ideal world…


Me Too, Me Also, Me Copy

Andy Abramson has a story about Microsoft having a story on “presence enabled” communications. Andy Abramson is a little late on this one… I wonder why reactive experts only blather about the visible. And why, above all, do they have hair-thin memories?

The actual brilliance is not in Parus for using the phone call result as an indicator. Not using a dial tone when you are in the phone business would be a crime. The brilliance is not in Iotum for adding a web interface to a server-based preference store and filtering the call session requests. I have explained in an earlier post why I have always believed presence to be the next dial tone, and why Iotum is still short of providing the expected value. Citing these players is only a justification for a post about Microsoft being a little heavy on the dinosaur scale. But everyone knows this, so much for the Me Too.

What Andy is missing about Microsoft is that they presented their road map and they are just executing on it. Everything about “presence enabling” communications and office applications was written down when the “Real Time Communication” server was unveiled a few years ago. At the time, I agree, Microsoft was not making a great innovation. It was just building on the ideas laid down by the PAM (Presence and Availability Management) forum on how to leverage presence states and user-based rules to derive an “availability”. That may not have been brilliant, but it shows that someone had picked up the idea and seen the potential. Andy did not, as far as I recall. That Microsoft's execution time may be longer than in other companies, and that in the end the innovation has become a “dead ringer”, is not a scoop. So much for the Me Also.

Andy is also missing the final point: Microsoft's flawed execution will make it to every desktop in corporate America. And that is brilliance. Not in technological innovation, but in understanding human behaviour. A lesson to be learned for a marketing professional?

Indeed, if imitation is the highest form of flattery, many out there will be thanking Andy for bashing Microsoft without constructive arguments. Maybe, after all, it is easier and more comfortable to “join the crowd”… So much for the Me Copy.


Wednesday, June 21, 2006

"Battelling" for privacy

Even if he denies it, John Battelle is discussing growing privacy concerns in this seemingly innocuous post. But under the appearance lies an indisputable fact: privacy protection, be it written in laws or not, is at risk because of search.

Search provides a framework for thinking not only about mail (what is it about Gmail that makes it really unique? It's searchable…), but for our entire clickstream, which is fast becoming an asset.

What makes something private, beyond property, is the possibility to decide how that thing is exposed. As soon as there is exposure, privacy is breached. The first time you take your brand new car out of the garage, you make it public, and you lose part of your privacy. A letter is private when protected by the ephemeral barrier of the envelope, as it was yesterday by the seal closing the parchment roll. But what would we say about privacy when the mailbox becomes searchable? Surely that it does not exist anymore.

What differentiates today's practices about personal data is the daily recording of a growing quantity of "identifiable" information on centralized servers. The reasons are multiple. Simply put, greed to access and monetize this information is a strong driver behind this trend. No doubt, recording of private information has existed since time immemorial. True, until recently it has been rather difficult to link together the different islands where that information resided. Looking at the difficulties that lie in reconstructing one's genealogy gives a good example of how "identified" information could be difficult to trace. But in the context of the web, these obstacles are vanishing. As people willingly release more "identified" private information every day, this will become easier and easier.

One in ten internet users have registered at a social network, for example, and one in five have visited one. The reason for this shift is simple: innovative companies have figured out how to deliver great services (and make money) by divining clickstream patterns, be it a underlying divination, like PageRank, or a more direct one, such as AdWords or Amazon's recommendation system.

Needless to say, I believe this is a result of abdicating any critical analysis of the associated risks. Adhering to a "social network" boosts one's image by widening one's exposure, and therefore automatically decreases one's privacy. Obviously, there is no "social networking" without releasing a slice of personal "identified" information. Where would be the salt of life without gossip… And what is today's first means of gaining exposure? Simply making sure this information appears on the top lines of search results… This is the first breach of privacy.

Behavioural experts will explain why the need for "exhibitionism" is common amongst humans, which makes me believe we are only at the beginning of the "social networking" adoption curve. The same experts will describe that the next step is to use distinctive signs and symbols. This is where the "innovative companies" come in to deliver "great services". Looking at it, how different is it to wear a tee-shirt with some well-known brand, or to have it appear on your own "private" page? You don't get a dime out of it; the "social network" host does. This is the second breach of privacy. Obviously the "personalized" targeted ad on your "private" page has been using your "identified" information…

More importantly, this seemingly disparate information, residing on the various service providers' servers, can now be reconciled. Search can make these many data islands look like a wide open territory. The next frontier…

The "fathers" of the privacy laws have probably been blinded by technology experts in their laudable attempts at protecting this part of our identity. They were certainly told that without a common database "key", personal information could not be correlated. Nowadays, with the growing power and refinement of the search engines, a common database "key" is not needed anymore; you only need a reference to an identity handle somewhere in a web page. The associated data does not need to be structured anymore. Search is slowly but surely circumventing all established privacy laws, with their outdated reference to unique keys.

Regaining control of our identity information and privacy starts by requiring that each of those "innovative companies" provides a "delete my information" button on the account management page of their services… It also requires that data held in their store be completely anonymized by using opaque indirect references. At least, if the data itself is not wiped out when I require it, it would only become part of the greater statistical whole. This is better than being "exposed" in public...
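
To illustrate what "opaque indirect references" combined with a "delete my information" button could look like, here is a minimal Python sketch. It is an assumption-laden toy: the class PseudonymousStore and its methods are hypothetical, data is kept in memory only, and it does not address the harder problem that the recorded events themselves may be identifying.

```python
import secrets
from typing import Dict, List


class PseudonymousStore:
    """Keeps user data under a random handle; the identity link is held apart."""

    def __init__(self) -> None:
        self._handle_by_identity: Dict[str, str] = {}
        self._events_by_handle: Dict[str, List[str]] = {}

    def record(self, identity: str, event: str) -> None:
        """Store an event under the opaque handle assigned to this identity."""
        handle = self._handle_by_identity.get(identity)
        if handle is None:
            handle = secrets.token_hex(16)  # random, carries no meaning
            self._handle_by_identity[identity] = handle
        self._events_by_handle.setdefault(handle, []).append(event)

    def forget(self, identity: str, wipe: bool = True) -> None:
        """The 'delete my information' button.

        The link from identity to handle is always dropped. With wipe=False
        the events stay, but only as part of the anonymous statistical whole.
        """
        handle = self._handle_by_identity.pop(identity, None)
        if handle is not None and wipe:
            self._events_by_handle.pop(handle, None)

    def knows(self, identity: str) -> bool:
        return identity in self._handle_by_identity

    def event_count(self) -> int:
        return sum(len(events) for events in self._events_by_handle.values())
```

The design choice is that the only place where an identity handle appears is one small table, which is the first thing removed on request.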


Saturday, June 17, 2006

Weaving an identity society

…Society must evolve to become an integrated yet privacy-enabled medium for people's diverse expressions of identity…

When I commented on Luke Razzell’s blog I wasn’t really expecting more than voicing an opinion on the growing Microsoft influence in the identity and privacy space. Luke hinted at the paper he had been co-authoring with John Madelin of BT to address some of the biased “laws” set by Kim Cameron and “set out a more comprehensive vision of an Identity Web”.

Since then Luke’s announcement has become a reality, and taken the form of an open discussion wiki. The Web is becoming a more human-friendly and semantically-rich space, where people will be able to negotiate control of online information sharing. An “Identity Age” is dawning. The Identity Society wiki is aimed at discussing the nature of the associated challenges in depth:

  • Lead with thesis,
  • Avoid polemics,
  • Be accessible to a broad range of stakeholders,
  • Drill down from broad philosophical context to specific user requirements and business cases

For those wondering, this is what kept me away from blogging the past week…


Friday, June 09, 2006

Lesson to be learned?

Phil Zimmermann quietly released a reference application of his new cryptographic protocol ZRTP aimed at bringing privacy to VoIP conversations.

Zfone lets you whisper in someone’s ear, even if their ear is a thousand miles away. I think it’s better than the other approaches to secure VoIP, because it achieves security without reliance on a PKI, key certification, trust models, certificate authorities, or key management complexity that bedevils the email encryption world. It performs its key agreements and key management in a purely peer-to-peer manner over the RTP packet stream.

Beyond the point of certification authorities’ less than adequate certification procedures, it is rare for a user to care to manage a secure list of accountable authorities in their clients. Today, the use of asymmetric cryptography in conjunction with PKI or trust models implies many risks and inconveniences, part of which are rooted in the need to securely manage private keys.

Zimmermann assumes that trust needs to be continuously renewed and updated between two parties in a conversation. Hence the ephemeral and point-to-point nature of his secure application. By decreasing the need for long-term storage of sensitive key material, it increases the overall security of the conversation. At the same time, I believe this is only solving part of the issue, as Bruce Schneier points out:

No amount of IP telephony encryption can prevent a Trojan or worm on your computer — or just a hacker who managed to get access to your machine — from eavesdropping on your phone calls, just as no amount of SSL or e-mail encryption can prevent a Trojan on your computer from eavesdropping — or even modifying — your data.

So, as always, it boils down to this: We need secure computers and secure operating systems even more than we need secure transmission.

This puts in better words than mine my objection to extending JEP-0116 Encrypted Sessions beyond online conversations. Will the XMPP community learn the lesson from these experts’ experience?

Tuesday, June 06, 2006

Trust Pot Pourri

…shortcomings may be corrected without great difficulties using ingenuity and pragmatism…

Beyond the announcement, the proposal put together by Peter to strengthen “trust” in XMPP leaves me perplexed. Under the surface, the document looks like a hastily assembled “pot pourri” of ideas that have been lying around XMPP for some time. The document certainly provides a good inventory of the XMPP technologies available today in the area of security. But I believe many of the described shortcomings may be corrected without great difficulty, just by using a good dose of ingenuity and pragmatism.

  • On the server side, I believe the major task would be convincing and educating many more Certificate Authorities to include the XMPP specifics in their server certificates. It is somewhat frustrating to talk with CAs and find out that they cannot conceive of a server certificate other than a web server certificate, or a client certificate other than a browser or mail client certificate.
  • Support for SASL External and TLS already exists in many open source servers, besides the commercial ones. Again, I do not think implementing secure links between federated XMPP servers is a technical issue, but rather a matter of organization and agreement to use SASL/TLS instead of dialback. The weight of legacy, maybe?
  • It may be interesting to use a phased approach to reach the ultimate secure federation. In a first phase, SASL/TLS could be applied with web server certificates. Later, when real XMPP certificates become widely available, they would be substituted. But I believe supporting web certificates and the associated SASL checking is a must, as certain commercial CAs may purposely delay providing real XMPP certificates. We must also account for corporations or agencies that would use open source XMPP servers with certificates from these CAs.
  • As discussed earlier, the number of end-to-end encryption implementations in XMPP is negligible. I have described how the current "encrypted sessions" method, which is an XMPP implementation of Off-the-Record messaging, goes beyond what is necessary to ensure very good end-to-end encryption. In particular, I find the section related to “offline secure session” superfluous. It also introduces a higher risk of keys being compromised than a version dedicated to “online secure sessions” only. The JEP-0116 specification is an excellent starting point, but would greatly benefit from being simplified. After all, hasn’t XMPP always wanted to be associated with simplicity?
  • I am convinced every XMPP server can become a repository for public keys. XMPP already allows a number of “personal” data items to be published through Personal Eventing Protocol and other means. I do not really see what major difficulty there would be in extending these existing mechanisms to publish public keys on the users’ home XMPP server. In my opinion, this would be an easy first step in creating the infrastructure required once stanza signatures are introduced.

I am more sceptical when the document describes “greater reliability of client-to-server and server-to-server connections” as a component of increasing an XMPP network’s trust. Reliability is important, but unless we play with words, this is a matter of risk assessment, not a matter of trust. I am not convinced XMPP suffers in that area when compared to other public IM protocols. XMPP certainly does not suffer in the comparison with SIP/SIMPLE. In that protocol, undelivered messages trigger an error, but the protocol does not provide a way to recover after an error. This is left to the implementation… Applications of XMPP beyond IM may certainly require and benefit from high reliability, but the examples given in the document fail to convince me this could greatly increase my trust in the XMPP network.

Similarly, I understand that administrators would benefit from better monitoring of distributed XMPP servers. But presenting this as an important factor in increasing the trust of an XMPP network is probably exaggerated. In real life, when an application provides reliable services, end users will be attracted to it. As soon as the service starts to degrade, they will move away. They do not need statistics about uptime; they know who is providing the best service by word of mouth. Statistics are for administrators and marketers… Statistics on uptime do not reinforce trust, they provide justifications. From a technology standpoint, creating a binding of SNMP inside XMPP could present an interesting challenge. But I do not believe XMPP will quickly displace other network monitoring technologies. That said, if the “monitoring” only consists of running authenticated login scripts against a list of servers, a simple Tsung scenario will provide a quick and easy answer… As the results can also be easily archived, we would only need a graphical interface to present the statistics.

…a trust reputation system will require more than a $5000 budget…

I have left the “reputation system” for the end. Obviously providing a “reputation system” is very fashionable these days… Believe me, that will require a little more than a $5000 budget. But it is late, and I’ll come back to this very interesting subject later.

Sunday, June 04, 2006

Chatting the last mile

I have described in a previous post how MUC chat rooms can be distributed to achieve Internet wide scalability. The initial overview focused on an easily implemented architecture aimed at limiting message broadcast for users hosted on different home servers than those providing the MUC chat service.

This model can be extended to an architecture where the distributed MUC brokers are standalone implementations, separate from the actual home servers. The MUC scalability remains, and is still achieved by using the inter-broker room subscription. But in this scenario, the MUC overlay network is not co-located with any particular XMPP server. Certain MUC brokers can be directly managed by domain owners, as in the case of the powerful capulet.net and montague.com. Other MUC brokers may be managed by third parties that have negotiated peering agreements with domain owners to get access to specific room content.

This architecture retains the advantage of the original MUC overlay network at the core. But inefficiency reappears in the local loop, where traffic leaves the overlay network toward the various home servers. Fortunately, the last mile traffic can be minimized by leveraging existing XMPP standards. In this case, the message traffic between an edge MUC broker and a particular home server can be optimized by using JEP-0033 Extended Stanza Addressing to multiplex each broadcast message.
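To make the last mile multiplexing more concrete, here is a minimal sketch in Python of what an edge MUC broker might do: group the room occupants by home server and emit a single stanza per server, carrying the recipient list in a JEP-0033 `<addresses/>` element. The function names, the choice of `bcc` as address type, and the `verona.example` domain are assumptions made for illustration only; a real broker would also need to check that the home server advertises support for the extended addressing service.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Namespace defined by JEP-0033 Extended Stanza Addressing.
ADDRESS_NS = "http://jabber.org/protocol/address"


def group_by_home_server(occupants):
    """Group occupant JIDs (assumed to be user@domain[/resource]) by domain."""
    groups = defaultdict(list)
    for jid in occupants:
        domain = jid.split("@", 1)[1].split("/", 1)[0]
        groups[domain].append(jid)
    return dict(groups)


def multiplexed_stanzas(sender, body, occupants):
    """Return one serialized message per home server instead of one per occupant."""
    stanzas = {}
    for domain, jids in group_by_home_server(occupants).items():
        # The stanza is addressed to the home server, which forks it locally.
        msg = ET.Element("message", {"from": sender, "to": domain, "type": "groupchat"})
        addresses = ET.SubElement(msg, "addresses", {"xmlns": ADDRESS_NS})
        for jid in jids:
            # 'bcc' is an illustrative choice: it avoids leaking the recipient list.
            ET.SubElement(addresses, "address", {"type": "bcc", "jid": jid})
        ET.SubElement(msg, "body").text = body
        stanzas[domain] = ET.tostring(msg, encoding="unicode")
    return stanzas


if __name__ == "__main__":
    fans = ["fan1@verona.example/home", "fan2@verona.example", "nurse@capulet.net/hall"]
    for domain, xml in multiplexed_stanzas("balcony@muc.capulet.net/juliet", "O Romeo", fans).items():
        print(domain, xml)
```

However many fans a home server hosts, the edge broker sends it a single stanza per posted message; the fan-out happens on the home server itself.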

This architecture further emphasizes how XMPP can be used to implement a robust Internet-wide multi-user chat network. Again, this is achieved without modifying the protocol, by using existing standards and extending their usage context.

Protecting identities with XOR

When the average user speaks of security, encryption often comes first in the conversation. While encryption is an important component of security, it only deals with a subset of the challenges inherent to communications security. This is missing the bigger picture. Cryptology primarily covers the confidentiality of transmitted data, while the identities of the communicating parties remain unprotected. On the Internet, this shortcoming has been addressed by a technology called “onion routing” (OR). OR has been around for some time as a technology providing anonymous communication. TOR, its second generation, adds perfect forward secrecy, congestion control, directory servers, integrity checking, and configurable exit policies to the original design, while avoiding infringing patented techniques used in the original design.

Onion Routing is a distributed overlay network designed to anonymize TCP-based applications like web browsing, secure shell, and instant messaging. Clients choose a path through the network and build a circuit, in which each node (or “onion router” or “OR”) in the path knows its predecessor and successor, but no other nodes in the circuit. Traffic flows down the circuit in fixed-size cells, which are unwrapped by a symmetric key at each node (like the layers of an onion) and relayed downstream.

I have been toying with the idea of applying the principles of OR to XMPP at the protocol level to provide end-to-end anonymity of conversations. I am presenting an overview of how XOR (XMPP Onion Routing) could be implemented on top of XMPP. TOR enhances the original OR design of layered cryptographic payloads by introducing “telescoping circuits”. This type of circuit allows an initiator to negotiate a short-lived session key with each successive node along a path, and use this key to encrypt the onion layers. Diffie-Hellman is used for key exchange. The "telescoping circuit" model brings two important benefits on top of forward secrecy. First, it allows multiplexing multiple conversations over a single circuit, letting the initiator set up a single long-lived path across the Onion Routing network, which can be used to access multiple destinations. Second, it allows “leaky pipe” routing. Any node along the path is a candidate exit point, making traffic analysis particularly difficult. Individual Onion Routers can be configured as exit nodes, and the type of traffic they will allow to exit can be specified.
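The per-hop key negotiation can be illustrated with a toy Diffie-Hellman exchange in Python. This is only a sketch under loud assumptions: the modulus and generator below are toy parameters chosen for brevity, not a vetted group, the exchange is unauthenticated, and both sides are simulated in one process. All function names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters for illustration only: a real deployment would use a
# standardized, much larger group and authenticate the exchange.
P = 2**127 - 1  # a Mersenne prime, far too small for real use
G = 3


def dh_keypair():
    """Return an ephemeral (private, public) Diffie-Hellman pair."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    return private, pow(G, private, P)


def session_key(private, peer_public):
    """Derive a 32-byte session key from the shared Diffie-Hellman secret."""
    shared = pow(peer_public, private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def negotiate_circuit(path):
    """Simulate the initiator agreeing on a fresh key with each node in turn.

    Returns (keys as seen by the initiator, keys as seen by each node).
    """
    initiator_keys, node_keys = {}, {}
    for node in path:
        init_priv, init_pub = dh_keypair()   # fresh pair per hop
        node_priv, node_pub = dh_keypair()   # the node's ephemeral pair
        initiator_keys[node] = session_key(init_priv, node_pub)
        node_keys[node] = session_key(node_priv, init_pub)
    return initiator_keys, node_keys


if __name__ == "__main__":
    mine, theirs = negotiate_circuit(["capulet.net", "relay.example", "montague.com"])
    print(mine == theirs)
```

Because every key pair is ephemeral and discarded once the circuit is torn down, a later compromise of a node does not reveal the keys of past circuits, which is the forward secrecy property mentioned above.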

The Tor network is an overlay network; each onion router (OR) runs as a normal user-level process without any special privileges. Each onion router maintains a TLS connection to every other onion router. Each user runs local software called an onion proxy (OP) to fetch directories, establish circuits across the network, and handle connections from user applications. These onion proxies accept TCP streams and multiplex them across the circuits. The onion router on the other side of the circuit connects to the requested destinations and relays data.

Applying this model to XMPP is easy. The XOR network is an overlay network, implemented using XMPP S2S connectivity. Each XOR router will use a TLS connection to reach every other XOR router participating in the overlay network. XOR will require an extension to the protocol to allow an XOR client to discover XOR support and browse the directory of overlay network nodes. XOR clients and nodes should use JEP-0116 encrypted sessions as the base mechanism to establish the “telescoping circuit” layers along the chosen path. XOR nodes will accept XMPP stanza streams and multiplex them across the circuits. The XOR node at the other end of the circuit connects to the requested destination and relays the stanzas.

Instead of taking a direct route from source to destination, data packets on the Tor network take a random pathway through several servers that cover your tracks so no observer at any single point can tell where the data came from or where it's going. To create a private network pathway with Tor, the user's software or client incrementally builds a circuit of encrypted connections through servers on the network. The circuit is extended one hop at a time, and each server along the way knows only which server gave it data and which server it is giving data to. No individual server ever knows the complete path that a data packet has taken. The client negotiates a separate set of encryption keys for each hop along the circuit to ensure that each hop can't trace these connections as they pass through.

In the now traditional Shakespearian way, when Juliet wants to have a secure and anonymous conversation with Romeo, she will choose a random path between the capulet.net and the montague.com servers from the retrieved XOR directory information. She will then negotiate an ephemeral session key with each XOR node along the path, including Romeo's client. She will then build all the onion layers to encapsulate her message and deliver it to her own XOR server. Each XOR node on the path will peel off its own layer, decrypting it with the ephemeral key negotiated with Juliet, and route the remaining layers to the next node. Upon receiving Juliet's stanza, Romeo will be able to build his own “telescoping circuit” to respond to Juliet. This circuit will use a randomly chosen path to Juliet's client, which will be different from the path taken by Juliet's stanza to reach him.
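The layering and peeling described above can be sketched in a few lines of Python. This is an illustration of the principle only: the "cipher" is a toy XOR keystream derived from SHA-256, standing in for whatever JEP-0116 would negotiate, with no integrity protection, and the session keys are simply random bytes rather than negotiated ones. The node names, the layer format (one length byte, next hop, inner payload) and all function names are hypothetical.

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure): XOR data with a SHA-256 counter keystream.

    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def build_onion(path, keys, message: bytes) -> bytes:
    """Wrap message in one layer per node; path ends with the recipient.

    Node names are assumed to be shorter than 256 bytes.
    """
    # Innermost layer: an empty next hop marks the final recipient.
    onion = keystream_xor(keys[path[-1]], b"\x00" + message)
    for i in range(len(path) - 2, -1, -1):
        next_hop = path[i + 1].encode()
        layer = bytes([len(next_hop)]) + next_hop + onion
        onion = keystream_xor(keys[path[i]], layer)
    return onion


def peel(key: bytes, onion: bytes):
    """Remove one layer; return (next hop, remaining onion or final message)."""
    layer = keystream_xor(key, onion)
    hop_len = layer[0]
    return layer[1:1 + hop_len].decode(), layer[1 + hop_len:]


if __name__ == "__main__":
    path = ["capulet.net", "relay.example", "montague.com", "romeo@montague.com/orchard"]
    keys = {name: os.urandom(32) for name in path}  # stand-in for negotiated keys
    onion = build_onion(path, keys, b"Wherefore art thou")
    hop = path[0]
    while hop:
        next_hop, onion = peel(keys[hop], onion)
        print(hop, "->", next_hop or "(delivered)")
        hop = next_hop
    print(onion)
```

Each node learns only the name of the next hop from its own layer; the remaining payload stays opaque to it until the last layer is removed by Romeo's client.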

As stated at the beginning, this is an overview of how the OR model can be applied to provide anonymous conversations on top of XMPP. I believe this technique added to the encrypted sessions negotiation will provide additional protection of the identities involved in a particular conversation.

Friday, June 02, 2006

World wide chat

Earlier this year there were debates on the JSF mailing lists about ways to reduce the traffic flowing between XMPP servers hosting multi-user chat rooms. In today’s common way of providing chat rooms using XMPP, a particular community's room could attract interest from users hosted on many other servers. The existing MUC protocol design assumes a chat room is a hub, and considers every participant to be directly connected to the room. In practice, participants may reside on home servers other than the server hosting the chat room. The discussion started from the finding that message distribution between servers was sub-optimal. Because the room itself is in charge of sending a copy to every participant in the room, a posted message in turn generates heavy traffic on the server-to-server links when it is relayed to all participants. To illustrate this phenomenon, let's use the now traditional Shakespearian scenery. In our days of reality shows, Juliet's balcony has morphed into the latest chat room in town. This way, all their fans can follow the intrigue in real time. When their story was still private, messages were flowing unnoticed between the capulet.net and the montague.com servers. Ever since Juliet opened the balcony chat room, the capulet.net server has had difficulties forwarding the lovers' messages to all their eagerly awaiting fans.

Romeo and Juliet just hit an ever-repeating truth: hub and spoke architectures simply do not scale to Internet size. Only distributed architectures can take full advantage of the Internet. After all, the Internet itself is built from distributed router and server nodes.

If we take a step back, XMPP can be broadly described as a publish-subscribe protocol. It has built in mechanisms to notify subscribers of events occurring in three particular contexts, namely:

  • at the core, users are able to subscribe to receive presence or personal event state changes from other contacts residing on any XMPP server,
  • in MUC, users are able to enter a chat room, and in doing so subscribe to receive all messages posted to the room,
  • in PubSub, users are able to subscribe to different objects of interest and receive notifications whenever a publication matches the subscription filter.

In essence, XMPP as a protocol provides two specialized and one generic publish-subscribe mechanisms. Scaling publish-subscribe systems has long been a subject of academic study, and scholars have agreed that distribution is the only valid architecture for scaling such systems to Internet size. In short, the best practices for distributing a publish-subscribe system consist of:

  • building a meshed overlay network comprised of core publish-subscribe routers and brokers,
  • connecting subscribers and publishers to the edge of the overlay network.

I believe it is possible to build a scalable distributed MUC implementation without modifying XMPP. We can achieve Internet-size scalability by organizing chat rooms by subject and creating MUC peering agreements between the rooms. Going back to the best practices in publish-subscribe architecture, we want to keep broadcast traffic to a minimum. Translated into the context of a MUC overlay network, this is simply achieved by:

  • having users connect to the nearest MUC room of interest. This way, the traffic forking needed to reach every individual user only occurs on the users' home server.
  • having all the rooms sharing the same interest subscribe to each other. This ensures a message posted in any room across the overlay network is only forwarded once between rooms, thus achieving the expected traffic reduction.
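A back-of-the-envelope comparison shows the size of the expected reduction. The Python sketch below counts the server-to-server stanzas generated by one posted message under both models. It assumes one peered room per server, a full mesh of room subscriptions, and that a room never re-forwards a message received from a peer room; the function names and the `verona.example` domain are hypothetical.

```python
def hub_s2s_stanzas(participants_by_server, hub):
    """Hub model: the hosting room sends one copy per remote participant."""
    return sum(count for server, count in participants_by_server.items() if server != hub)


def peered_s2s_stanzas(participants_by_server, origin):
    """Peered model: the originating room sends one copy per peer room.

    Each peer room then fans the message out locally, off the S2S links.
    """
    return sum(
        1
        for server, count in participants_by_server.items()
        if server != origin and count > 0
    )


if __name__ == "__main__":
    fans = {"capulet.net": 50, "montague.com": 200, "verona.example": 1000}
    print("hub:   ", hub_s2s_stanzas(fans, "capulet.net"))
    print("peered:", peered_s2s_stanzas(fans, "capulet.net"))
```

With these illustrative numbers, a message posted on capulet.net crosses the server-to-server links 1200 times in the hub model and only twice in the peered model, since the cost now grows with the number of peered rooms rather than with the number of remote participants.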

As a first step, this would be best applied to widespread, common-interest public communities. In the absence of existing mechanisms, the room cross-subscriptions will have to be manually configured and set up across any group of servers wishing to share a common interest. A natural extension would be to build a cross-subscription mechanism into the MUC implementations themselves. As a room is identified by a JID, no protocol limitation prevents a room from being a participant in another room. It is a matter of building into MUC implementations the ability for a room to join, as a participant, another room on another MUC server.
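As a rough illustration, a room joining a peer room would send the same presence stanza any MUC participant sends, only with the room's own JID as the sender. The Python sketch below builds such a stanza; the room JIDs, the nickname and the function name are made up for the example, and how a given MUC implementation would actually originate this stanza is left open.

```python
import xml.etree.ElementTree as ET

# Namespace used by MUC clients when entering a room.
MUC_NS = "http://jabber.org/protocol/muc"


def room_join_presence(local_room, remote_room, nick):
    """Build the presence stanza a room would send to enter a peer room."""
    presence = ET.Element(
        "presence", {"from": local_room, "to": f"{remote_room}/{nick}"}
    )
    ET.SubElement(presence, "x", {"xmlns": MUC_NS})
    return ET.tostring(presence, encoding="unicode")


if __name__ == "__main__":
    print(
        room_join_presence(
            "balcony@muc.capulet.net",    # hypothetical local room
            "balcony@chat.montague.com",  # hypothetical peer room
            "capulet-balcony",            # nickname the local room uses in the peer room
        )
    )
```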

As a second step, we would extend the existing MUC service discovery definitions to include the vocabulary allowing these distributed rooms of interest to be exposed as such to clients. This is a matter of registering new items for the XMPP disco protocol.

This solution will greatly decrease unnecessary server-to-server traffic. This can be done without relaxing any of the built-in MUC moderation features. This approach makes XMPP a good candidate for implementing a robust Internet-wide multi-user chat network able to supersede other technologies such as IRC. And this can be achieved without modifying the protocol. Isn't this what re-use and leveraging is all about?
