Wednesday, June 21, 2006

"Battelling" for privacy

Even if he denies it, John Battelle is discussing growing privacy concerns in this seemingly innocuous post. But beneath the surface lies an indisputable fact: privacy protection, be it written into law or not, is at risk because of search.

Search provides a framework for thinking not only about mail (what is it about Gmail that makes it really unique? It's searchable…), but for our entire clickstream, which is fast becoming an asset.

What makes something private, beyond ownership, is the ability to decide how that thing is exposed. As soon as there is exposure, privacy is breached. The first time you take your brand new car out of the garage, you make it public, and you lose part of your privacy. A letter is private when protected by the ephemeral barrier of the envelope, as it once was by the seal holding the parchment roll closed. But what would we say about privacy when the mailbox becomes searchable? Surely that it does not exist anymore.

What differentiates today's handling of personal data is the daily recording of a growing quantity of "identifiable" information on centralized servers. The reasons are multiple. Simply put, greed to access and monetize this information is a strong driver behind this trend. No doubt, the recording of private information has existed since time immemorial. True, until recently it was rather difficult to link together the different islands where that information resided. The difficulty of reconstructing one's genealogy is a good example of how hard "identified" information could be to trace. But in the context of the web, these obstacles are vanishing. As people willingly release more "identified" private information every day, linking it together will become easier and easier.

One in ten internet users have registered at a social network, for example, and one in five have visited one. The reason for this shift is simple: innovative companies have figured out how to deliver great services (and make money) by divining clickstream patterns, be it an underlying divination, like PageRank, or a more direct one, such as AdWords or Amazon's recommendation system.

Needless to say, I believe this is the result of abdicating any critical analysis of the associated risks. Joining a "social network" boosts one's image by widening one's exposure, and therefore automatically decreases one's privacy. Obviously, there is no "social networking" without releasing a slice of personal "identified" information. Where would the salt of life be without gossip… And what is today's primary means of gaining exposure? Simply making sure this information appears at the top of the search results… This is the first breach of privacy.

Behavioural experts will explain why the need for "exhibitionism" is common amongst humans, which makes me believe we are only at the beginning of the "social networking" adoption curve. The same experts will explain that the next step is to use distinctive signs and symbols. This is where the "innovative companies" come in to deliver "great services". Looking at it, how different is it to wear a tee-shirt with some well-known brand, or to have that brand appear on your own "private" page? You don't get a dime out of it; the "social network" host does. This is the second breach of privacy. Obviously the "personalized" targeted ad on your "private" page has been using your "identified" information…

More importantly, this seemingly disparate information, residing on the servers of the various service providers, can now be reconciled. Search can make these many data islands look like a wide open territory. The next frontier…

The "fathers" of the privacy laws have probably been blinded by technology experts in their laudable attempts at protecting this part of our identity. They were certainly told that without a common database "key", personal information could not be correlated. Nowadays, with the growing power and refinement of the search engines, a common database "key" is not needed anymore; you only need a reference to an identity handle somewhere in a web page. The associated data does not need to be structured anymore. Search is slowly but surely circumventing all established privacy laws, with their outdated reference to unique keys.
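
To make the point about handles concrete, here is a minimal sketch in Python of how a plain full-text index can stand in for a shared database "key". Everything in it is made up for illustration: the three page URLs, their text, and the handle `jdoe42` are hypothetical, and a real search engine is of course far more elaborate than this toy index.

```python
import re
from collections import defaultdict

# Hypothetical "data islands": unrelated sites holding unstructured text.
# No common database key exists; the only link is a handle in the text.
pages = {
    "forum.example/post/17": "Posted by jdoe42: selling my old road bike, pickup in Lyon.",
    "photos.example/album/9": "Album by jdoe42, birthday party, 12 March.",
    "shop.example/review/3": "Review by another_user: great blender.",
}


def build_index(pages):
    """Build a toy inverted index: lowercase word -> set of page URLs."""
    index = defaultdict(set)
    for url, text in pages.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(url)
    return index


def search(index, handle):
    """Return, in sorted order, every page that mentions the handle."""
    return sorted(index.get(handle.lower(), set()))


index = build_index(pages)

# One query reconciles the islands into a single profile of the handle.
for url in search(index, "jdoe42"):
    print(url, "->", pages[url])
```

Nothing here relies on a schema or a join column: the forum post and the photo album end up side by side only because the same handle appears in both texts.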

Regaining control of our identity information and privacy starts by requiring that each of those "innovative companies" provide a "delete my information" button on the account management page of their services… It also requires that the data held in their stores be completely anonymized by using opaque indirect references. At the very least, if the data itself is not wiped out when I request it, it would only become part of the greater statistical whole. This is better than being "exposed" in public...
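
As a rough illustration of what I mean by opaque indirect references, here is a small Python sketch. The class `PseudonymizedStore` and its method names are my own invention, not any provider's API. It assumes records are stored only under a random token, with the identity-to-token link kept in a separate table that the "delete my information" button erases.

```python
import secrets


class PseudonymizedStore:
    """Toy store (hypothetical): events are filed under opaque tokens only."""

    def __init__(self):
        self._handles = {}  # identity -> opaque random token
        self._records = []  # list of (token, event); no identity stored here

    def record(self, identity, event):
        """File an event under the user's token, creating one if needed."""
        token = self._handles.get(identity)
        if token is None:
            token = secrets.token_hex(16)  # random, not derived from identity
            self._handles[identity] = token
        self._records.append((token, event))

    def events_for(self, identity):
        """Events still linkable to this identity (empty after deletion)."""
        token = self._handles.get(identity)
        if token is None:
            return []
        return [event for t, event in self._records if t == token]

    def delete_my_information(self, identity):
        """The 'button': drop the only link between identity and token."""
        self._handles.pop(identity, None)

    def event_counts(self):
        """Aggregate statistics, which survive deletion."""
        counts = {}
        for _, event in self._records:
            counts[event] = counts.get(event, 0) + 1
        return counts


store = PseudonymizedStore()
store.record("alice@example.org", "search")
store.record("alice@example.org", "purchase")
store.delete_my_information("alice@example.org")
print(store.events_for("alice@example.org"))  # no longer linked to alice
print(store.event_counts())                   # still counted in the whole
```

This only severs the link between the account and its records. If the records themselves contain identifying detail, they could still point back to a person, so the sketch shows the minimum I am asking for, not a complete answer.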
