Teaching uncertainty

Tags: r, teaching, uncertainty

Author: Juan Tellez

library(tidyverse)
library(socviz)
theme_nice = function() {
  theme_minimal(base_family = "Fira Sans") +
    theme(panel.grid.minor = element_blank(),
          plot.background = element_rect(fill = "white", color = NA),
          plot.title = element_text(face = "bold"),
          axis.title = element_text(face = "bold"),
          strip.text = element_text(face = "bold", size = rel(0.8), hjust = 0),
          strip.background = element_rect(fill = "grey80", color = NA),
          legend.title = element_text(face = "bold"))
}

theme_set(theme_nice())

The way that I like to teach sampling uncertainty is to say: “imagine that we’d like to know how many kids the average American has, and that there are only 2,867 people in the US, all of them perfectly sampled in {socviz::gss_sm}.”

How many kids does the average American have? In this mini-America world we can find the exact answer:

gss_sm %>% 
  summarise(`Average number of kids` = mean(childs, na.rm = TRUE)) %>% 
  knitr::kable(digits = 2, align = "c")
Average number of kids
1.85
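In the real world we never see everyone, only a sample. As a minimal sketch of what a single small survey of this mini-America might look like, the code below draws 10 people at random and compares their average to the exact answer. The sample size of 10, the seed, and the object names (`truth`, `one_sample`, `estimate`) are arbitrary choices for illustration.

```r
library(dplyr)
library(socviz)

set.seed(1)  # arbitrary seed, only so the draw is reproducible

# The exact answer: the average over everyone in the mini-America
truth <- mean(gss_sm$childs, na.rm = TRUE)

# One survey: 10 people drawn at random, without replacement
one_sample <- slice_sample(gss_sm, n = 10)
estimate <- mean(one_sample$childs, na.rm = TRUE)

# The gap between the two is the sampling error for this particular draw
c(truth = truth, estimate = estimate, error = estimate - truth)
```

Changing the seed gives a different sample and therefore a different estimate, which is the uncertainty we are trying to teach.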
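# Draw 500 random samples of 10 rows each (without replacement) from
# dplyr::starwars, then compute the average mass within each sample
tibble(reps = 1:500) %>% 
  mutate(samples = map(reps, ~ sample_n(starwars, size = 10, replace = FALSE))) %>% 
  unnest(samples) %>% 
  group_by(reps) %>% 
  summarise(mass = mean(mass, na.rm = TRUE))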
# A tibble: 500 × 2
    reps  mass
   <int> <dbl>
 1     1  72.5
 2     2 103. 
 3     3  65.3
 4     4  53.8
 5     5  77  
 6     6  84.9
 7     7  66.6
 8     8  80.2
 9     9 335. 
10    10  70.4
# … with 490 more rows
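The same repeated-sampling idea can be pointed back at the mini-America question. The sketch below is an assumption about how that might look, not code from the original analysis: it draws 500 samples of 10 people from `gss_sm`, computes the average of `childs` in each, and summarises how much those averages move around. The names `sample_means` and `avg_kids` are hypothetical, and the seed is arbitrary.

```r
library(tidyverse)
library(socviz)

set.seed(1)  # arbitrary seed, only so the draws are reproducible

# 500 pretend surveys of 10 people each; keep the average number of kids per survey
sample_means <- tibble(reps = 1:500) %>%
  mutate(avg_kids = map_dbl(reps, ~ mean(sample(gss_sm$childs, size = 10), na.rm = TRUE)))

# How much do the survey averages vary from one sample to the next?
sample_means %>%
  summarise(mean_of_means = mean(avg_kids),
            sd_of_means = sd(avg_kids),
            lowest = min(avg_kids),
            highest = max(avg_kids))
```

The average of the sample averages should land near the exact answer, while `sd_of_means` describes how far a typical single survey of 10 people strays from it.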