May 13, 2014

Don't call Big Data a Revolution

Everybody in science seems to love Big Data. Put "Big Data" in your grant proposal and your file goes to the top of the pile. Sure, some suspected that funding for big data went up because those nerds in the basement of the NSA need some help sifting through cassettes of indiscriminate tapping into every utterance of every two-legged creature on earth. Those losers obviously lack the brains to ask the right questions and to target a reasonable subset of mankind - so they just grab everything they can get. And stay as blind as they were before. Of course it is difficult to find a needle in a haystack - but why dump all that hay on the needle in the first place?
This aside, there are believers like Kenneth Cukier and Viktor Mayer-Schoenberger, authors of "Big Data: A Revolution that will transform how we live, work and think" (Houghton Mifflin Harcourt, 2013), who marvel at the transition from trying to approach a mechanism in nature with smart experiments to predicting the future behaviour of the system by merely observing and describing patterns. They call the interest in correlation (and not causation) a paradigm shift, a revolution and nothing less than the future of science at large.
Yes.
They are probably right.
And this is scary.
If you just want to get an idea of potential traffic jams depending on location and time of day, big data might help. If you need to know if your medication cures or kills, excellent statistics will do. Big data ultimately brings you from statistics of small numbers to 'N=ALL'. 
The main drive of science always was - and always will be - curiosity for the mechanism, the 'why?'.
Recording huge amounts of data - all data available - does not solve any problems. In the worst case it substitutes understanding with describing.
But in the best case, the mapping of a system onto as many related data as possible can be seen as lifting it from nature to the lab. The really Big Data that contain *all* correlations would be a transposition of the real thing that can then be experimented on. See Big Data as the score-sheet to a symphony, plus information on the instruments, plus the acoustics, plus the musicians, plus the atmosphere, plus...
(I am off to the lab)

1 comment:

Sandor Ragaly said...

fantastic post - for *small* is smart (and beautiful also, partially :-) ).