
Justin Bieber falsely correlates with Influenza

We just became aware of a scientific paper by Aron Culotta (2010) that compares data from the U.S. Centers for Disease Control and Prevention (CDC) on influenza-like illnesses (ILI) with the frequency of specific influenza-related keyphrases on Twitter (flu, cough, headache, sore throat...). After a training phase to optimize the algorithm, the Twitter-based predictions of ILI development correlate remarkably well with the real data, which supports the concept of data mining from social-media streams. While the results were comparably good for a variety of analyzed phrases, the authors add a word of caution: "These results show extremely strong correlations for all queries except for fever, which appears frequently in figurative phrases such as “I’ve got Bieber fever”."
Besides the beauty of the demonstrated algorithms, the paper gives a helpful overview of the fundamental literature in this young field.
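To make the "Bieber fever" problem concrete, here is a minimal Python sketch. It is not the model from the paper, only an illustration of the general idea: count the fraction of messages per week that mention a keyword, then compute a plain Pearson correlation against the weekly ILI rate. The names `keyword_fraction`, `pearson` and the `FIGURATIVE` filter are our own assumptions for illustration.

```python
import math
import re

# Hypothetical filter for figurative uses such as "Bieber fever".
# A real system would need a far more careful approach than one regex.
FIGURATIVE = re.compile(r"\bbieber fever\b", re.IGNORECASE)


def keyword_fraction(messages, keyword, drop_figurative=False):
    """Return the fraction of messages that mention the keyword.

    messages: list of message texts from one time window (e.g. one week).
    drop_figurative: if True, blank out figurative phrases before matching.
    """
    if not messages:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = 0
    for text in messages:
        if drop_figurative:
            text = FIGURATIVE.sub(" ", text)
        if pattern.search(text):
            hits += 1
    return hits / len(messages)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

With one list of messages per week and the matching weekly ILI rates, you would call `keyword_fraction` once per week and pass the resulting series to `pearson` together with the ILI series. Comparing the result for "fever" with and without `drop_figurative=True` shows how much a single pop-culture phrase can distort the signal.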


Popular posts from this blog

Academics should be blogging? No.

"blogging is quite simply, one of the most important things that an academic should be doing right now", the London School of Economics and Political Science states in one of their, yes, blogs. It is wrong. The arguments just seem so right: "faster communication of scientific results", "rapid interaction with colleagues", "responsibility to give back results to the public". All nice, all cuddly and warm, all good. But wrong. It might be true for scientoid babble. But this is not how science works. Scientists usually follow scientific methods to obtain results. They devise, for example, experiments to measure a quantity while keeping the boundary conditions in a defined range. They do discuss their aims, problems, techniques and preliminary results with colleagues - they talk about deviations and errors, successes and failures. But they don't do that Wikipedia-style by asking anybody for an opinion. Scientific discussion needs a set

Information obesity? Don't swallow it!

Great - now they call it 'information obesity'! If you can name it, you know it. My favourite source of intellectual shallowness, bighthink.com, again wraps a whiff of nothing into a lengthy video message. As if seeing a person read a text that barely covers up its own emptiness makes it more valuable. More expensive to produce, sure. But valuable? It is OK that Clay Johnson does everything to sell his book. But (why) is it necessary to waste so many words, spoken or written, to debate a perceived information overflow? Is it fighting fire with fire? It is cute to pack the problem of distractions into the metaphor of 'obesity', 'diet' and so on. But the solution is the same. At the core of every diet you have 'burn more than you eat'. If you cross a street, you don't read every licence plate, you don't talk to everybody you encounter, you don't count the number of windows of the houses across, you don't interpret the sounds an

Driven by rotten Dinosaurs

My son is 15 years old. He asked me what a FAX machine was. He gets the strange concept of CDs because there is a rack full of them next to the bookshelf, which contains tons of paper bound together in colorful bundles, called 'books'. He still accepts that some screens don't react to you punching your fingers on them. He repeatedly asks why my 'car' (he speaks the quotation marks) is powered by 'rotten dinosaurs'. At the same time he writes an email to Elon Musk's Neuralink asking for an apprenticeship and sets up Discord servers for don't-ask-me-what. And slowly I am learning that it is a very good thing to be detached from historic technology, as you don't try to preserve an outdated concept while aiming to innovate. The optimized light bulb would be a wee bit more efficient, tiny light bulb. But not an LED. An optimized FAX would probably handle paper differently - it would not be a file-transfer system. Hyper-modern CDs might have tenf