Apr 12, 2010

Meta-Mining

Some so-called 'internet-prophets' bemoan the increasing volume of web-babble, the deluge of chatter, the hollowness of the information-tsunami. Big words of cultural pessimism that are gratefully picked up by the media.
Those web-critics have a serious problem: they try to *read* all that.
Would they go into a library and start reading the very first book on the shelf? I hope not. When they open the Encyclopaedia Britannica (yes, there are still some printed versions around), do they start reading on page 1?
Some try to survive in the web by suggesting a new order of information - an ordering according to the date of appearance - the life-streams (see David Gelernter on Edge.org). This would be an order in time instead of 'space' (where data are conventionally mapped out in different 'locations' on your screen or hard-drive). This approach to cleaning up the data-mess is reminiscent of the cleansing of the Augean stables by diverting the River Alpheus. It's an honorable and classic approach - but does it solve the problem?
Let's look at Twitter. The deafening babble of tweets is already organized in life-streams. Read them live and you will drown.
The solution - besides filtering (friends, topics, lists, labels...) - cannot lie in organizing the individual byte-series along one axis or another (time, space, size, language...). Rather, the solution will be a mining of the meta-information. If a twitterer posts the unavoidable 'I am off to the loo, be back in a minute', this might only interest the one waiting for a response. If she posts that 20 times a day, we get some additional information: it may indicate a physiological problem.
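To make the idea a little more concrete, here is a minimal sketch of that kind of meta-mining: it ignores what any single tweet says and only counts how often one person repeats the same message on the same day. The sample data, the function name `repeated_posts` and the threshold of 3 are all assumptions made up for illustration; nothing here talks to Twitter.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: (user, ISO timestamp, text). Not real tweets.
LOO = "I am off to the loo, be back in a minute"
sample = [("alice", f"2010-04-12T{hour:02d}:00:00", LOO) for hour in (9, 11, 14, 17)]
sample += [
    ("bob", "2010-04-12T10:00:00", LOO),
    ("alice", "2010-04-12T12:00:00", "Lunch"),
]

def repeated_posts(tweets, threshold=3):
    """Return {(user, day, text): count} for every message that one user
    posted at least `threshold` times on a single day."""
    counts = Counter(
        (user, datetime.fromisoformat(ts).date(), text.strip().lower())
        for user, ts, text in tweets
    )
    return {key: n for key, n in counts.items() if n >= threshold}

# The single tweets carry little information; the repetition is the signal.
flagged = repeated_posts(sample, threshold=3)
for (user, day, text), n in flagged.items():
    print(f"{user} posted {text!r} {n} times on {day}")
```

In this toy data only alice's repeated message is flagged; bob's single post and alice's lunch stay below the threshold. What the repetition actually means is, of course, still a matter of interpretation.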
Some meta-mining of tweets is approaching commercial relevance, as reported by Jessica Guynn and John Horn in the Los Angeles Times of April 2, 2010. Computer models based on Twitter chatter, they write, are stunningly accurate in predicting the box-office success of Hollywood movies.
If, in the web to come, the noise of individual utterances is systematically analyzed for overarching macro-structures and for phase transitions from the purely random to the organized, more information will be gained than was individually and knowingly put in. The sheer boundless chatter of Twitter and the like corresponds to the cells; the web is the organism.
If we continue looking at the lion through a microscope, we might get a pretty good understanding of his cells and the breathtaking number of them - but we might miss that we are just about to get eaten.
