Become a data journalist: The overview

Let’s start with baby steps. Think about how you can incorporate data in your story, irrespective of the subject. Data can always tell a convincing story.

To be a data journalist, you have to actually know what a data journalist does day in and day out.

There are three areas that a data journalist may work on:

  1. Data mining or data gathering: Collecting the data to work on can be exhausting if you don’t know where or how to look for it. It is also a bit of a chicken-and-egg game: sometimes a journalistic question drives the journalist’s data search, and sometimes a data dump lands in his or her lap like ‘lightning out of a clear blue sky’. The latter is rare, but that is what woke up editors and journalists in newsrooms across the world when WikiLeaks released the War Diaries in 2010.

    ‘Finding data’ can involve anything from having expert knowledge and contacts to being able to use computer assisted reporting skills or, for some, specific technical skills such as MySQL or Python to gather the data for you.

    – Paul Bradshaw (How to be a data journalist, The Guardian, October 1, 2010)

    Execution: Start simple. Search online for available data on the subject you are working on. Write to authorities requesting information that you know is meant to be public. You can also file a FOIA (Freedom of Information Act) or RTI (Right to Information) request with the relevant public body. Don’t forget to talk to experts in the field, or to other journalists who you know have written stories about the topic.


  2. Data analysis or making sense of the data set: Once you get the data, the clock starts ticking, and making sense of it under crazy deadlines can be a challenge. The most important ingredient for effective analysis is sound background knowledge. Read and research the subject as much as possible even before the data reaches you. There is no point in stumbling upon a gold mine if you don’t know what gold looks like, right?

    The real question then is how a journalist can train their brains to look for patterns and associations under a deadline. After all, once you find data most journalists still have to clean it up so that it’s useful and that takes time. It’s also prone to accidental errors, the more time humans spend massaging it.

    – Pete Forde, Founder, BuzzData

    Execution: Learn MS Excel for starters. To make sense of a huge data set, you first need to organize the data (a.k.a. data cleaning) and make it workable. Remove duplicate rows and columns (de-duplication) and merge tables to create a worksheet you can work with. You can use MS Excel, R, a relational database management system (RDBMS) such as MySQL or Postgres, a programming language such as Python, Ruby or Node.js, or QGIS for geographic data. There are also tools that help with extracting and cleaning data, e.g. Tabula (for tables locked in PDFs) or OpenRefine.


  3. Present the data: Now that you’ve spent a good chunk of time understanding the data and seeing how the variables relate to each other, it is time to show your readers what you discovered. The findings of a data analysis can sometimes be presented using simple lists, charts, maps and graphs, or in more interactive ways such as timelines and customized web applications built from scratch.

    Play around. If you’re good with a graphics package, try making the visualisation clearer through color and labelling. And always include a piece of text giving a link to the data and its source – because infographics tend to become separated from their original context as they make their way around the web.

    – Paul Bradshaw (How to be a data journalist, The Guardian, October 1, 2010)

    Execution: There are numerous tools available online that make data visualizations and infographics easier for reporters. Some of the more popular ones that newsrooms use are Google Charts, Fusion Tables, Tableau, Timeline JS, StoryMaps JS, and Plotly. To build web applications, you need to learn a little coding; start with JavaScript and HTML5, which will come in handy.
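
For readers who would like to try the cleaning step from point 2 in code rather than in a spreadsheet, here is a minimal sketch in Python using only the standard library. It removes duplicate rows, sets aside a row with a missing value, and merges two tables on a shared column. The tables, column names and figures are all made up for illustration, and the helper functions are hypothetical names, not part of any tool mentioned above.

```python
import csv
import io

# Hypothetical sample data: invented figures, for illustration only.
SPENDING_CSV = """district,year,spending
North,2015,120
North,2015,120
South,2015,95
East,2015,
"""

POPULATION_CSV = """district,population
North,40
South,25
East,30
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, trimming stray whitespace."""
    reader = csv.DictReader(io.StringIO(text))
    return [{k: (v or "").strip() for k, v in row.items()} for row in reader]


def dedupe(rows):
    """Drop exact duplicate rows while keeping the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def merge_on(left, right, key):
    """Join two tables on a shared column, keeping only matching rows."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


def clean_and_merge(spending_text, population_text):
    """De-duplicate, drop incomplete rows, merge, and add a derived column."""
    spending = dedupe(read_rows(spending_text))
    # Set aside rows with a missing value instead of guessing at it.
    complete = [r for r in spending if r["spending"] != ""]
    merged = merge_on(complete, read_rows(population_text), "district")
    for row in merged:
        per_person = int(row["spending"]) / int(row["population"])
        row["spending_per_person"] = round(per_person, 2)
    return merged


if __name__ == "__main__":
    for row in clean_and_merge(SPENDING_CSV, POPULATION_CSV):
        print(row["district"], row["spending_per_person"])
```

In a real project you would read the files from disk and, just as importantly, keep a note of every row you dropped and why, so that the cleaning can be checked by an editor or a reader.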


Author: Aparna Ghosh

There are some people who know exactly where they belong (ethnically, religiously, linguistically, professionally and culturally), and then there are some who are born as the bridge species. I put myself in that bucket, the bridge species. I was born from an inter-religious marriage, I grew up in a state in India ethnically very different from where both my parents were from, and I studied and gained experience in two quite disparate professions: computer science engineering and journalism. I studied both fields passionately and started writing about science and technology after graduating from the Graduate School of Journalism at Columbia University. I’ve written for various technology newspapers and blogs for over 6 years now. This gives me the ability to talk about today’s political zeitgeist as easily as I can debug code. If coding languages don’t do the trick, the fact that I can construct sentences in five different languages (Hindi, Urdu, Bengali, Tamil, and French) besides English also aids conversations.
