Directed reading: Data visualization

Taught: F23

Graduate
Author

Juan Tellez

Published

February 2, 2024

Overview

The goal of this directed reading is to get better at data visualization. We will:

  • Read books on effective data visualization
  • Practice making good data visualizations in R
  • Meet on Mondays at 9am in my office to discuss

Format

Every week you will:

  1. Read about data visualization
  2. Complete visualization tasks in Quarto notebooks (ahead of our meeting)
  3. Meet with me for 60ish minutes to discuss / present

What you will need

  • The books below
  • R and RStudio installed and updated to the latest version
  • Git installed and a GitHub account

Books

  • Alberto Cairo The Truthful Art: Data, Charts, and Maps for Communication (New Riders, 2016). (I have a copy)
  • Claus O. Wilke Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures (O’Reilly Media, 2019). (free online)
  • Kieran Healy Data Visualization: A Practical Introduction (Princeton University Press, 2018). (free online)

Deliverables

  • Each week you will turn in a Quarto notebook with your notes / visualizations
  • A final code-through analysis of a dataset from here (or whatever else you want): Full Archive

Schedule

W1: Oct2 - Orientation

Reading:

  • Ch1 in Healy, Data Visualization.

Tasks:

  • Make sure everything is installed and you have access to all the materials

W2: Oct9 - Grammar of graphics

Reading:

  • Ch 2 in Wilke, Fundamentals of Data Visualization.
  • Ch 5 in Cairo, The Truthful Art.
  • Ch 3 in Healy, Data Visualization.

Tasks:

  • make a plot that maps variables to the x-axis, the y-axis, and at least two other aesthetics, and that has axis labels, a title, and a theme
  • make the plot painfully slowly, adding one piece at a time and rendering after each addition
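One way to approach the W2 tasks is to store each step in its own object so you can render and compare them. This is a minimal sketch, assuming the mpg dataset that ships with ggplot2; the variable choices and the object names p1 to p4 are illustrative only.

```r
library(ggplot2)

# Step 1: data plus x and y mappings only (renders an empty panel)
p1 <- ggplot(mpg, aes(x = displ, y = hwy))

# Step 2: add a geometry with two further aesthetics (color and shape)
p2 <- p1 + geom_point(aes(color = class, shape = drv))

# Step 3: add axis labels, legend titles, and a plot title
p3 <- p2 + labs(
  title = "Larger engines tend to get fewer highway miles per gallon",
  x = "Engine displacement (liters)",
  y = "Highway miles per gallon",
  color = "Vehicle class",
  shape = "Drive train"
)

# Step 4: add a theme
p4 <- p3 + theme_minimal()

p4
```

Printing p1, p2, and p3 in turn shows what each addition contributes.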

W3: Oct16 - Amounts and proportions

Reading:

  • Ch 6, 10, 11 in Wilke, Fundamentals of Data Visualization.

Tasks:

  • make a barplot, stacked barplot, heat map, nested area plot
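A possible starting point for three of the W3 plots, assuming the mpg dataset from ggplot2. The nested area plot is left out because it usually needs an extra package; the object names here are illustrative.

```r
library(ggplot2)

# Bar plot: number of cars in each vehicle class
bar_plot <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# Stacked bar plot: the same counts, split by drive train
stacked_bar <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar()

# Heat map: tabulate class by drive train, then fill tiles by the count
counts <- as.data.frame(table(class = mpg$class, drv = mpg$drv))
heat_map <- ggplot(counts, aes(x = drv, y = class, fill = Freq)) +
  geom_tile()

heat_map
```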

W4: Oct23 - Distributions

Tasks:

  • make a histogram, density graph, grouped version of each (e.g., using fill), boxplot, violin plot
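The W4 plots can be sketched as follows, again assuming the mpg dataset from ggplot2. The bin width and transparency values are arbitrary choices to experiment with.

```r
library(ggplot2)

# Histogram of highway mileage
hist_plot <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

# Grouped histogram: fill by drive train
grouped_hist <- ggplot(mpg, aes(x = hwy, fill = drv)) +
  geom_histogram(binwidth = 2)

# Grouped density: semi-transparent fills keep overlapping curves visible
grouped_density <- ggplot(mpg, aes(x = hwy, fill = drv)) +
  geom_density(alpha = 0.5)

# Boxplot and violin plot of the same variable by drive train
box_plot <- ggplot(mpg, aes(x = drv, y = hwy)) +
  geom_boxplot()
violin_plot <- ggplot(mpg, aes(x = drv, y = hwy)) +
  geom_violin()

grouped_density
```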

W5: Oct30 - Relationships

Reading:

  • Ch 12, 13 in Wilke, Fundamentals of Data Visualization.

Tasks:

  • make a scatterplot, correlogram, time series, multiple time series
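A rough sketch of the W5 plots, assuming the mpg, economics, and economics_long datasets from ggplot2 and the built-in mtcars data. The correlogram here is a plain tile plot of a correlation matrix, which is only one of several ways to draw one.

```r
library(ggplot2)

# Scatterplot
scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Correlogram: reshape a correlation matrix to long form, then draw tiles
corr_long <- as.data.frame(as.table(cor(mtcars[, c("mpg", "wt", "hp", "disp")])))
correlogram <- ggplot(corr_long, aes(x = Var1, y = Var2, fill = Freq)) +
  geom_tile() +
  scale_fill_gradient2(limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, fill = "Correlation")

# Single time series
ts_plot <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()

# Multiple time series: one line per variable, rescaled to the 0-1 range
multi_ts <- ggplot(economics_long, aes(x = date, y = value01, color = variable)) +
  geom_line()

correlogram
```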

W6: Nov6 - Model output

Reading:

  • Ch 6 in Healy, Data Visualization.

Tasks:

  • make a scatterplot with a smoothing line, coefficient plot (pointrange), marginal effect plots, regression table with 3 models using the modelsummary package
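The smoothing line and the coefficient plot can be sketched with base R and ggplot2 alone. This assumes the built-in mtcars data and an arbitrary example model; the column names conf_low and conf_high are made up for this illustration.

```r
library(ggplot2)

# Scatterplot with a linear smoothing line
smooth_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm")

# Fit an example model and collect estimates with 95% confidence intervals
fit <- lm(mpg ~ wt + hp + factor(cyl), data = mtcars)
ci <- confint(fit)
coefs <- data.frame(
  term = names(coef(fit)),
  estimate = unname(coef(fit)),
  conf_low = ci[, 1],
  conf_high = ci[, 2],
  row.names = NULL
)
coefs <- subset(coefs, term != "(Intercept)")

# Coefficient plot: point estimate with its interval, one row per term
coef_plot <- ggplot(coefs, aes(x = term, y = estimate, ymin = conf_low, ymax = conf_high)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_pointrange() +
  coord_flip() +
  labs(x = NULL, y = "Estimate (95% confidence interval)")

coef_plot
```

Dropping the intercept keeps the remaining terms on a readable scale; whether that is appropriate depends on the model.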

W7: Nov13 - Annotations

Reading:

  • Ch 17, 18, 22 in Wilke, Fundamentals of Data Visualization.
  • Ch 5 in Healy, Data Visualization.

Tasks:

  • make a jitter plot, scatterplot with labeled points, barplot with text over the bars, use facet_wrap in a plot
  • give every plot a title, subtitle, and caption, and give every aesthetic a title
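A sketch that combines several of the W7 pieces in one plot (labeled points, facet_wrap, and full labeling), assuming the built-in mtcars data. The model column and the label text are illustrative.

```r
library(ggplot2)

# Move the row names into a column so they can be used as labels
cars <- mtcars
cars$model <- rownames(mtcars)

labeled_plot <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  # check_overlap drops labels that would sit on top of earlier ones
  geom_text(aes(label = model), vjust = -0.8, size = 3, check_overlap = TRUE) +
  facet_wrap(~ cyl) +
  labs(
    title = "Heavier cars get fewer miles per gallon",
    subtitle = "Panels show the number of cylinders",
    caption = "Data: mtcars (built into R)",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  )

labeled_plot
```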

W8: Nov20 - Maps

Tasks:

  • make a country-level world map or region map where countries are filled by some variable using rnaturalearth
  • make a county-level choropleth map using county_data from socviz
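For the county-level map, one possible approach is to join the county outlines to the county data and draw polygons. This sketch assumes the socviz and dplyr packages are installed, that socviz's county_map and county_data share an id column, and that county_data includes a pop_dens variable; swap in whichever fill variable you prefer.

```r
library(ggplot2)
library(dplyr)
library(socviz)

# left_join keeps the row order of county_map, which geom_polygon relies on
county_full <- left_join(county_map, county_data, by = "id")

county_plot <- ggplot(
  county_full,
  aes(x = long, y = lat, group = group, fill = pop_dens)
) +
  geom_polygon(color = "grey90", linewidth = 0.05) +
  coord_equal() +
  labs(fill = "Population density") +
  theme_void()

county_plot
```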

W9: Nov27 - Themes, color

Reading:

  • Ch 4, 19, 22 in Wilke, Fundamentals of Data Visualization.
  • Ch 8 in Healy, Data Visualization.

Tasks:

  • make a plot using a non-default color or fill scale, one using a sequential scale, and one using a discrete scale
  • make a plot where you use the fact that geometries can have local aesthetics to highlight specific points or bars
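Two of the W9 ideas can be sketched as follows, assuming the mpg dataset from ggplot2. The cutoff of 40 miles per gallon and the colors are arbitrary choices for illustration.

```r
library(ggplot2)

# Sequential scale: map a continuous variable to the viridis palette
sequential_plot <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point() +
  scale_color_viridis_c()

# Local aesthetics: draw every point in grey, then redraw a subset in a
# second layer that has its own data, color, and size
highlight <- subset(mpg, hwy > 40)
highlight_plot <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "grey70") +
  geom_point(data = highlight, color = "firebrick", size = 3)

highlight_plot
```

Because color is set outside aes() in both layers of the second plot, it is a fixed property of each layer and produces no legend.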

W10: Dec4 - Make own website

  • Make sure Quarto is installed on your computer (quarto --version in the Terminal should return a version)
  • If not, download and install: https://quarto.org/docs/download/
  • Create a user account on Netlify

Final Assignment

A final code-through analysis of a dataset from here (or whatever else you want): Full Archive. Use at least three plots to tell some story about the data or datasets. This should be a PDF with the images embedded and text to explain them and your overall approach. Due December 10th at midnight.