Visualizing Smoking Trends in the US

Lead Author Type

MBI Masters Student


Dr. Guenter Tusch, tuschg@gvsu.edu

Introduction: The main aim of this project is to visualize tobacco use behavior in the United States over the past 20 years and how it is trending in the current population. To do this, I took data from past years on four levels of tobacco use: people who smoke regularly, people who smoke occasionally, people who never smoked, and people who quit smoking after a period of time.

Methods: I used state-level data for the United States from 1995 to 2010 and two visualization approaches, Tableau software and programming in Python, to give an explicit view of tobacco use among the US population. The data were extracted from data.cdc.gov, and the data focusing on tobacco use among males, females, and youth were extracted from http://www.americashealthrankings.org.
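As an illustration of the Python side of this approach, the following is a minimal sketch, not the project's actual script. It assumes a CSV export with one row per state and year and one percentage column per smoking level. The column names, the `yearly_means` and `plot_trends` helpers, and the sample rows are all hypothetical placeholders, and the sample values are made up only so the sketch runs. Plotting assumes matplotlib is installed.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names for the four smoking levels; the real export
# from data.cdc.gov may name and format these columns differently.
LEVELS = ["smoke_everyday", "smoke_some_days", "former_smoker", "never_smoked"]

# Made-up placeholder rows so the sketch is runnable. These are NOT survey values.
SAMPLE_CSV = """year,state,smoke_everyday,smoke_some_days,former_smoker,never_smoked
1995,AA,20.0,5.0,25.0,50.0
1995,BB,22.0,3.0,25.0,50.0
2010,AA,14.0,6.0,26.0,54.0
2010,BB,16.0,4.0,24.0,56.0
"""


def yearly_means(csv_file):
    """Average each smoking level across states for every year (unweighted)."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        year = int(row["year"])
        counts[year] += 1
        for level in LEVELS:
            sums[year][level] += float(row[level])
    return {
        year: {level: sums[year][level] / counts[year] for level in LEVELS}
        for year in sorted(counts)
    }


def plot_trends(means, path="smoking_trends.png"):
    """Draw one trend line per smoking level and save the figure to `path`."""
    import matplotlib.pyplot as plt  # imported here so the aggregation works without it

    years = list(means)
    fig, ax = plt.subplots()
    for level in LEVELS:
        ax.plot(years, [means[y][level] for y in years], marker="o", label=level)
    ax.set_xlabel("Year")
    ax.set_ylabel("Share of respondents (%)")
    ax.legend()
    fig.savefig(path)


trends = yearly_means(io.StringIO(SAMPLE_CSV))
```

Calling `plot_trends(trends)` would then write a line chart with one line per smoking level. The unweighted mean across states is a simplification; a population-weighted average would be a more faithful national figure.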

Results: Tobacco use can take any form, such as cigarettes, cigars, small cigars, or pipe smoking. Use varies by race/ethnicity, education, age, and gender, as well as across the United States, so it also depends on sociodemographic factors. The data show an increased prevalence of, and upward trend in, tobacco use in the US. In most states, women currently have a high percentage of smoking, followed by youth, when compared with males. The results indicate a need to expand evidence-based prevention methods in order to reduce the growing effects of tobacco and its related diseases on the US population.

Conclusion: The survey showed that life expectancy is reduced by 10 years in people who smoke tobacco compared with people who never smoked in the United States. The survey results also show that quitting smoking at an average age of 35-40 years reduces the risk of being affected by fatal diseases by 90%. They also showed that rural populations use tobacco more than urban populations.

Furthermore, I investigated how effective three visualization tools (Tableau, Python, and R) were for specific variables. Geospatial visualization had the most interactive features and was most user-friendly in Tableau when compared with Python and R visualizations. Python plotting of the different levels of smoking is interactive, as is its geospatial visualization, but users need to write their own code for the visualization. R visualizations are static, and the script needs to be documented. I conclude that for a small number of specific variables, the best interactive visualization can be done using Tableau.
