Wednesday, May 24, 2006

Decreasing the prejudice

I have been venturing into the wide identity space for a very simple reason. Although I do not pretend to be an expert in this field, I believe using a communication address as an identity GUID is detrimental both to the identity system relying on this identifier and to the communication system that uses this address for routing. Communication systems are more my cup of tea. While researching the subject, I came across one of the latest fashionable subspaces of identity, namely “digital reputation”. In a previous post, I explained what makes me suspicious about this concept in the very wide sense. This definition of reputation in the context of blog comments may reconcile me with some of the possible applications of digital reputation. Actually, I find the idea of a “reputation system” described there interesting.

reputation is a 1 to 1 relation between the reputed party (the blog commenter) and a reputation-relying service (the blog)

This is a much narrower definition than the more generic, all-contexts-inclusive definitions I have been exposed to so far. I have no difficulty extending this functional definition to any “reputation system”.

reputation is a computed value, the computation is performed by a reputation-asserting authority (the reputation manager service) from a collection of transactional data (acceptances and rejections of previous comments to other blog posts) and identity data originated by past interactions of the reputed party with reputation-relying services (other blogs).

I find this second functional definition a little trickier. Obviously, as a critical human being, I understand the underlying expectations buried in such a definition. I also understand its limitations:

  • the reputation system relies on “identity data”. In effect, the reputation system will be entirely dependent on the uniqueness of this “identity data” in order to compute the “reputation value” and keep a record of the “past interactions”.
  • to comply with privacy protection, the reputation system would certainly have to work on opaque tokens, instead of more explicit tokens such as mail addresses. This is the only way to ensure a minimum level of “identity independence” in the calculation process.
  • the entire system will be bound to the prerequisite that the spammer plays by the rules of good behavior. More specifically, it implies the spammer will behave in such a way that the “reputation system” will always recognize it as the owner of a previous “identity data”. The game may become slightly more difficult if the spammer keeps changing identity data.
  • the “identity data” correlation problem becomes rather complex once the “reputation system” requires access to “interactions” outside the boundaries of a single blog. This aggregation can be eased by setting up the “reputation system” as an external party. In this case, the token must only be thought of as a reputation token, and not an identity token.

Approaching the problem from that angle would make me feel more comfortable. I believe the blog “reputation system” could be built upon presentation of this reputation token, as long as this token does not provide a direct mapping to the owner’s identity. After all, the blog is only interested in a repeatable and scalable quality index on the posted comments. Using an identifier, and not an identity, to reconcile and group together the collection of transactional data used to compute that index is a perfectly valid approach, and it has the invaluable advantage of protecting the privacy of the comments’ author.
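
To make the idea of a reputation token a little more concrete, here is a minimal Python sketch. Everything in it is an assumption of mine rather than something the reputation proposal specifies: the external reputation manager holds a secret key, derives an opaque token from whatever identifier the commenter presents (HMAC-SHA256 here), and computes the quality index as a smoothed acceptance ratio. The names `reputation_token` and `ReputationManager` are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def reputation_token(secret_key: bytes, identifier: str) -> str:
    """Derive an opaque token; the blog only ever sees this value (assumption)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


class ReputationManager:
    """Hypothetical external reputation-asserting authority."""

    def __init__(self):
        # token -> [accepted_count, rejected_count]
        self._history = defaultdict(lambda: [0, 0])

    def record(self, token: str, accepted: bool) -> None:
        """Store one transactional data point: a comment accepted or rejected."""
        self._history[token][0 if accepted else 1] += 1

    def score(self, token: str) -> float:
        """Smoothed acceptance ratio; an unknown token scores a neutral 0.5."""
        accepted, rejected = self._history.get(token, (0, 0))
        return (accepted + 1) / (accepted + rejected + 2)


# Usage: three accepted comments and one rejected one, grouped by token only.
key = b"manager-secret"  # placeholder value, for illustration only
token = reputation_token(key, "alice@example.org")
manager = ReputationManager()
for outcome in (True, True, True, False):
    manager.record(token, outcome)
print(manager.score(token))
```

The sketch also shows the limitation listed above: a commenter who presents a fresh identifier gets a fresh token and starts again from the neutral score, so nothing here stops a spammer who keeps changing identity data.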

I disagree on several counts with Dick Hardt when he tries to mush together identity and reputation. But what could I expect from a blog trying to describe a “next generation identity”? Identity is our essence. Outsiders will use multiple attributes to incompletely describe how they perceive our identity in their own ontology. Our identity may possess certain innate attributes that are more permanent than others, but we do not usually disclose them easily. As a matter of fact, in most cases, identity attributes are imposed on us. Defining identity broadly (identity == reputation, identity == transactions, etc.) implies that we lose the ability to do anything useful with it. In particular, privacy concerns get much broader. So, when someone tries to oversimplify, and comes up with the blunt statement that reputation is identity, he earns a bad value in my “reputation system”.

With respect to comments on a blog. We envision the commenter needing to build up a reputation over time, and it would be associated with a particular persona. Since it takes a sequence of good behavior to build a positive reputation, there is a cost to that reputation, that good netizens will want to preserve if having a good reputation provides additional value.

Once again, another “good intent”. I understand the requirement, but I would only agree on a resulting system based on contextual reputation, and not identity. His position presupposes that a contextual reputation can be extended to any ontology and/or context. Because every context and ontology pair defines differently what “good behavior” contains, they will definitely lead to several different contextual reputations. Each of these reputations may be useful in its own context. Quickly reducing all these contextual reputations into a single Manichean net-wide reputation is a little far-fetched…
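
A small sketch may help show what keeping reputations contextual could look like in practice. It is only an illustration under my own assumptions: reputations are stored per (token, context) pair, the context names are made up, and the class `ContextualReputation` is hypothetical. The point is simply that there is no method returning one net-wide value.

```python
from collections import defaultdict


class ContextualReputation:
    """Hypothetical store keeping one reputation per (token, context) pair."""

    def __init__(self):
        # (token, context) -> [good_count, bad_count]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, token: str, context: str, good: bool) -> None:
        """Record one interaction judged by the rules of a single context."""
        self._counts[(token, context)][0 if good else 1] += 1

    def score(self, token: str, context: str) -> float:
        """Smoothed ratio for one context; other contexts are never consulted."""
        good, bad = self._counts.get((token, context), (0, 0))
        return (good + 1) / (good + bad + 2)


# Usage: good behavior on a blog says nothing about an unrelated context.
store = ContextualReputation()
store.record("token-123", "blog-comments", good=True)
store.record("token-123", "blog-comments", good=True)
print(store.score("token-123", "blog-comments"))  # built up in this context
print(store.score("token-123", "auction-site"))   # still the neutral default
```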

