library(tidyverse)
library(socviz)
theme_nice <- function() {
  theme_minimal(base_family = "Fira Sans") +
    theme(panel.grid.minor = element_blank(),
          plot.background = element_rect(fill = "white", color = NA),
          plot.title = element_text(face = "bold"),
          axis.title = element_text(face = "bold"),
          strip.text = element_text(face = "bold", size = rel(0.8), hjust = 0),
          strip.background = element_rect(fill = "grey80", color = NA),
          legend.title = element_text(face = "bold"))
}
theme_set(theme_nice())
Teaching uncertainty
The way I like to teach sampling uncertainty is to say: “Imagine that we’d like to know how many kids the average American has, and that there are only 2,867 people in the US, all of whom were perfectly sampled in {socviz::gss_sm}.”
How many kids does the average American have? In this mini-America world we can find the exact answer:
gss_sm %>%
  summarise(`Average number of kids` = mean(childs, na.rm = TRUE)) %>%
  knitr::kable(digits = 2, align = "c")
| Average number of kids |
|:---:|
| 1.85 |
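To make the idea of sampling error concrete in this mini-America, here is a minimal sketch that draws a single random sample and compares its mean to the exact population value. The sample size of 100, the seed, and the names `pop_mean`, `one_sample`, and `sample_mean` are assumptions chosen for illustration, not part of the original analysis.

```r
library(dplyr)
library(socviz)

set.seed(1234)  # arbitrary seed, only so the draw is reproducible

# The "true" population value in mini-America
pop_mean <- mean(gss_sm$childs, na.rm = TRUE)

# One hypothetical survey: 100 people drawn without replacement
one_sample <- gss_sm %>%
  slice_sample(n = 100, replace = FALSE)

sample_mean <- mean(one_sample$childs, na.rm = TRUE)

# The two numbers will generally differ; that gap is sampling error
c(population = pop_mean, sample = sample_mean)
```

Rerunning the block with a different seed gives a different sample mean while the population mean stays fixed, which is the whole point of the exercise.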
tibble(reps = 1:500) %>%
mutate(samples = map(reps, ~ sample_n(starwars, size = 10, replace = FALSE))) %>%
unnest(samples) %>%
group_by(reps) %>%
summarise(mass = mean(mass, na.rm = TRUE))
# A tibble: 500 × 2
reps mass
<int> <dbl>
1 1 72.5
2 2 103.
3 3 65.3
4 4 53.8
5 5 77
6 6 84.9
7 7 66.6
8 8 80.2
9 9 335.
10 10 70.4
# … with 490 more rows
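A table of 500 sample means is easier to reason about once it is collapsed into a few numbers. The sketch below repeats the simulation and then summarises how much the means vary from one repetition to the next. It assumes `slice_sample()`, which supersedes `sample_n()` in current versions of dplyr; the seed and the names `sample_means` and `spread` are hypothetical, added only for illustration.

```r
library(tidyverse)

set.seed(1234)  # arbitrary seed for reproducibility

# Same simulation as above: 500 samples of 10 characters each
sample_means <- tibble(reps = 1:500) %>%
  mutate(samples = map(reps, ~ slice_sample(starwars, n = 10, replace = FALSE))) %>%
  unnest(samples) %>%
  group_by(reps) %>%
  summarise(mass = mean(mass, na.rm = TRUE))

# Collapse the sampling distribution into its centre and spread
spread <- sample_means %>%
  summarise(
    mean_of_means = mean(mass, na.rm = TRUE),
    sd_of_means   = sd(mass, na.rm = TRUE),
    lowest        = min(mass, na.rm = TRUE),
    highest       = max(mass, na.rm = TRUE)
  )

spread
```

With samples this small, a single very heavy character can pull one sample's mean far above the rest, which is one plausible reading of the outlying value in the output above.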