
Don't call Big Data a Revolution

Everybody in science seems to love Big Data. Put "Big Data" in your grant proposal and your file lands on top of the pile. Sure, some suspected that funding for work on big data went up because those nerds in the basement of the NSA need some help sifting through cassettes of indiscriminate tapping into every utterance of every two-legged creature on earth. Those losers obviously lack the brains to ask the right questions and to target a reasonable subset of mankind - so they just grab everything they can get. And stay as blind as they were before. Of course it is difficult to find a needle in a haystack - but why dump all that hay on the needle in the first place?
This aside, there are believers like Kenneth Cukier and Viktor Mayer-Schoenberger, authors of "Big Data: A Revolution that will transform how we live, work and think" (Houghton Mifflin Harcourt, 2013), who marvel at the transition from trying to approach a mechanism in nature with smart experiments to predicting the future behaviour of the system by merely observing and describing patterns. They call the interest in correlation (and not causation) a paradigm shift, a revolution and nothing less than the future of science at large.
Yes.
They are probably right.
And this is scary.
If you just want to get an idea of potential traffic jams depending on location and time of day, big data might help. If you need to know if your medication cures or kills, excellent statistics will do. Big data ultimately brings you from statistics of small numbers to 'N=ALL'. 
The main drive of science always was - and always will be - curiosity for the mechanism, the 'why?'.
Recording huge amounts of data - all data available - does not solve any problems. In the worst case it substitutes understanding with describing.
But in the best case, mapping a system onto as many related data as possible can be seen as lifting it from nature into the lab. The really Big Data that contain *all* correlations would be a transposition of the real thing that can then be experimented on. See Big Data as the score-sheet to a symphony, plus information on the instruments, plus the acoustics, plus the musicians, plus the atmosphere, plus...
(I am off to the lab)

Comments

Sandor Ragaly said…
fantastic post - for *small* is smart (and beautiful also, partially :-) ).
