Counting Every Breath
A data visualization about lung cancer incidence, treatment and survivability

Type

Data Visualization

Tools

Tableau, Rawgraphs, Adobe Illustrator

Client

Prof. Jodie Jenkinson (MSC2023H)

Target Audience

Healthcare workers and policy-makers involved in lung cancer operations and advocacy, as well as families with aging members

Date completed

April 14, 2022

Goals

To create a data-driven visualization

Process

Visualizing lung cancer data was one of my first ideas in the early stages of this project. I had just finished reading Paul Kalanithi’s ‘When Breath Becomes Air’ (a real tear-jerker if you’re interested), so the topic of lung cancer was fresh on my mind. Anecdotally, I’ve heard a lot about individuals suddenly being diagnosed with late-stage lung cancer after developing symptoms, and Paul Kalanithi was an example of this tragic situation. I was curious to see whether this pattern held true at a larger scale and, if so, how long these people have left to live after diagnosis.

Fortunately, the NIH’s PLCO (Prostate, Lung, Colorectal and Ovarian) Cancer Screening Trial tracked its participants meticulously, and its data became the foundation of this project.

Ideation and concepting

First, looking at the different variables, I created a mindmap of the relationships between them so that I could work out how many charts I needed to tell a coherent story. Then, I drew multiple sketches of what I wanted the charts to look like, based on which variables I wanted to show (age, time until death, cancer stage at diagnosis, treatment type, cause of death, family history of smoking and of cancer) and how I wanted to tell that story. Check out my visual inspirations here.

In terms of information flow, it made the most sense to take the story from diagnosis to treatment to death, as this mirrors how the process would occur in a real clinical setting. Because the cancer stages are separated into spatially distinct bins, the data can be parsed and interpreted by stage at diagnosis without adding complexity to each data point. The alluvial diagram that follows the dot plot shows how these variables relate to one another, so the reader can follow different trends through the data.

Cleaning the data

Using Tableau Prep and some manual labor, I filtered and cleaned thousands of data points to produce a more manageable dataset. After this step, the cleaned data’s constraints were:

1. The data was drawn only from NIH’s PLCO study trials

2. The clinical patients were between the ages of 54 and 89

3. All of the included records were of individuals who had passed away
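
The three constraints above can also be written as a simple filter. This is only a minimal Python sketch, not the Tableau Prep flow I actually used: the field names (study, age, is_dead) and the sample rows are made up for illustration and are not taken from the PLCO data dictionary. It also assumes the age range is inclusive at both ends.

```python
# Hypothetical field names; the real PLCO columns are named differently.
MIN_AGE, MAX_AGE = 54, 89  # assumed inclusive at both ends


def meets_constraints(record):
    """Return True if a record satisfies all three cleaning constraints."""
    return (
        record["study"] == "PLCO"                 # 1. PLCO data only
        and MIN_AGE <= record["age"] <= MAX_AGE   # 2. ages 54 to 89
        and record["is_dead"]                     # 3. deceased individuals only
    )


def clean(records):
    """Keep only the records that pass every constraint."""
    return [r for r in records if meets_constraints(r)]


# Made-up rows, purely to exercise the filter.
sample = [
    {"id": 1, "study": "PLCO", "age": 67, "is_dead": True},
    {"id": 2, "study": "PLCO", "age": 52, "is_dead": True},    # outside age range
    {"id": 3, "study": "PLCO", "age": 71, "is_dead": False},   # still alive
    {"id": 4, "study": "OTHER", "age": 70, "is_dead": True},   # different source
]
cleaned = clean(sample)
```

Only the first made-up row passes all three checks; each of the others fails exactly one constraint.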

Creating charts

Using the prepped data, I used Tableau to generate a dot plot for each stage. Exporting the charts as a PDF allowed me to edit the points easily in Adobe Illustrator. Similarly, I generated the alluvial charts in Rawgraphs and the bar chart in Excel before taking them into Illustrator to edit the SVG directly.
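
An alluvial chart is essentially a picture of how many records share each combination of categories. As a rough illustration of the kind of table that sits behind one, here is a short Python sketch that counts those combinations and writes them out as CSV. The field names (stage, treatment, cause_of_death) and the sample rows are hypothetical, and this is not how Rawgraphs works internally; it only shows the aggregation idea.

```python
import csv
import io
from collections import Counter

# Hypothetical field names, ordered the way the story flows:
# diagnosis -> treatment -> death.
DIMS = ("stage", "treatment", "cause_of_death")


def flow_counts(records, dims):
    """Count how many records share each combination of category values."""
    return Counter(tuple(r[d] for d in dims) for r in records)


def to_csv(counts, dims):
    """Serialize the counts as CSV text, one row per combination."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(list(dims) + ["count"])
    for combo, n in sorted(counts.items()):
        writer.writerow(list(combo) + [n])
    return buf.getvalue()


# Made-up rows, purely for illustration.
sample = [
    {"stage": "IV", "treatment": "chemotherapy", "cause_of_death": "lung cancer"},
    {"stage": "IV", "treatment": "chemotherapy", "cause_of_death": "lung cancer"},
    {"stage": "I", "treatment": "surgery", "cause_of_death": "other"},
]
counts = flow_counts(sample, DIMS)
table = to_csv(counts, DIMS)
```

Each row of the resulting table corresponds to one ribbon in the chart, with the count giving its thickness.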

Colour and layout

Taking inspiration from the fleshy pink hue of lungs, I wanted to feature a pinkish shade as the main colour. Giving each chart a monochromatic look makes each section feel cohesive, while the analogous colour scheme accentuates the charts’ relatedness as well as their differences. In the end, the pink/purple/brown combo that I chose proved to be a harmonious balance that frames the data nicely.

Even at the sketch phase, I knew the dot plot and the alluvial charts had to be in a specific order. After playing with different formats, I decided on a horizontal layout, which gave the charts more space and follows the direction in which the information is read.

Data source: NIH PLCO Dataset