Practical Tips For Working On A Data-Driven Report
How can you tell a story using data?
As a data analyst, one is constantly talking to an audience that is short on time, dislikes complexity and prefers opening up for Q&A halfway through the cover slide! My experience working with large, fragmented data sets has taught me a few things:
Give some thought to how the data should be extracted: You cannot build an insightful visualization unless the data behind it is in the right shape. Whenever possible, extract the data in the format closest to the one you need for analysis. A simple arrangement of rows and columns during extraction can often save much time and heartburn in reshaping and cleaning the data into a form suitable for graphing. In certain cases, the front end of the data warehouse may not have enough flexibility for extracting data. In such cases, try to get access to the location where the data actually lives behind the front end; chances are it will be an SQL or Db2 database. You will discover a treasure trove of information hiding behind the mask. (Beware: getting access might be a long-term battle, not one to be fought with a deadline lurking around the corner.)
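To make the point about shaping data during extraction concrete, here is a minimal sketch. It assumes a hypothetical events table with day, metric and value columns, held in an in-memory SQLite database that stands in for whatever actually sits behind the warehouse front end. The query pivots the data as it is pulled, so the result arrives with one row per day and one column per metric.

```python
import sqlite3

# Hypothetical stand-in for the database behind the warehouse front end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("2024-01-01", "visits", 120),
        ("2024-01-01", "orders", 8),
        ("2024-01-02", "visits", 150),
        ("2024-01-02", "orders", 11),
    ],
)

# Pivot during extraction: one row per day, one column per metric,
# so no reshaping is needed before graphing.
WIDE_QUERY = """
SELECT day,
       SUM(CASE WHEN metric = 'visits' THEN value ELSE 0 END) AS visits,
       SUM(CASE WHEN metric = 'orders' THEN value ELSE 0 END) AS orders
FROM events
GROUP BY day
ORDER BY day
"""

rows = conn.execute(WIDE_QUERY).fetchall()
for row in rows:
    print(row)
```

The table, column and metric names are made up for illustration; the same conditional-aggregation pattern should carry over to most SQL databases, though the exact syntax may vary.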
Marry the agenda with the data preparation: Analysts are often torn between two choices: (a) drilling down deeper and deeper, or (b) sticking to a script and timeline. Both approaches have obvious merits and traps, so finding a balance is important. To achieve this balance, I design a flexible agenda for the report, with branches and sub-points, before I even start cleaning the data. I then map each section of the agenda to the exact format in which I need the data for that analysis. This makes the data cleaning process more focused and efficient. For example, the overview section of the agenda may require an hourly time series across 6 metrics.
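One way to picture the agenda-to-data mapping is as a small lookup table that drives the preparation step. The sketch below is only an illustration: the section names, the metrics (two here rather than six, for brevity) and the prepare helper are all hypothetical. Each agenda section declares the time grain and metrics it needs, and the raw records are aggregated to exactly that shape.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical agenda: each section states the data shape it needs.
AGENDA = {
    "overview": {"grain": "hour", "metrics": ["visits", "orders"]},
    "deep_dive": {"grain": "day", "metrics": ["orders"]},
}

# How each grain truncates a timestamp into a bucket label.
GRAIN_FORMATS = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d"}


def prepare(records, spec):
    """Sum the requested metrics per time bucket at the requested grain."""
    fmt = GRAIN_FORMATS[spec["grain"]]
    buckets = defaultdict(lambda: dict.fromkeys(spec["metrics"], 0))
    for ts, metric, value in records:
        if metric in spec["metrics"]:
            buckets[ts.strftime(fmt)][metric] += value
    return dict(sorted(buckets.items()))


# Made-up raw records: (timestamp, metric, value).
records = [
    (datetime(2024, 1, 1, 9, 5), "visits", 40),
    (datetime(2024, 1, 1, 9, 40), "visits", 35),
    (datetime(2024, 1, 1, 9, 50), "orders", 3),
    (datetime(2024, 1, 1, 10, 15), "visits", 50),
]

overview = prepare(records, AGENDA["overview"])
deep_dive = prepare(records, AGENDA["deep_dive"])
print(overview)
print(deep_dive)
```

Writing the agenda down this way keeps the cleaning work tied to what the report will actually show: a metric or grain that no section asks for never needs to be prepared.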