Are data journalists the new bloggers?

How do open source data analysis and visualizations by individuals compare to big data projects by data scientists using cutting-edge software and infrastructure? A clue may lie in looking at the impact of blogging on professional journalism.

A few years ago I worked for Lycos and helped launch a blogging tool. Blogging was pretty new then, and Blogger, Typepad, and other tools were just getting off the ground. These new tools were primarily aimed at consumers who posted pictures, wrote in a diary, or even shared home improvement projects. There were sharp distinctions between “journalists,” who were credentialed and worked with fact checking and an editorial process, and “bloggers,” who could write about any topic and self-edit. In encyclopedias, you had similar distinctions between “real” encyclopedias like the Encyclopaedia Britannica, with extensive editorial processes, and open source reference sources like Wikipedia, with open review processes.

Over time, those distinctions between “blogger” and “journalist” have become less clear, and even an undertaking as big as an encyclopedia can be democratized.

Can this happen to data analysis? It has a lot in common with a large editorial enterprise: work once done by specialized writers under editorial control can now be self-edited and self-published by one person using free or low-cost tools. While there will always be a high end (data scientists working on big data), the freely available sources of data and low-cost tools like Excel, PowerPivot, Many Eyes, and Tableau Public make it possible for anyone to analyze millions of records of data and visualize them. A recent article in the Guardian, “Data Journalism is the New Punk,” argues that data journalism is something everyone can do.

The results will probably be similar to blogging: some analysis spectacularly good, deep, and insightful, some very wrong. The debates over whether such analysis should be so readily available, and over its impact on “professional” work, will continue, as they have with blogging and open source encyclopedias.

