=======================================================================================================================
DOCUMENTATION FOR US_VEP_TURNOUT_RATES

Version: 2.0
Date: 5/22/2025

Authors and Licenses:

These data are derived from the Census Bureau's Current Population Survey Voting and Registration Supplement (CPS) available at:
https://www.census.gov/data/datasets/time-series/demo/cps/cps-supp_cps-repwgt/cps-voting.html

These survey data are reweighted using a procedure proposed by Hur and Achen (2013), which reweights the CPS state-level turnout rates to equal the voting-eligible population turnout rates, available at:
https://academic.oup.com/poq/article/77/4/985/1843466

These weight-adjusted turnout rates are released under a Creative Commons 4.0 Attribution License. See:
https://creativecommons.org/licenses/by/4.0/
=======================================================================================================================

============
VARIABLES
============

============
YEAR
============
Year of election

============
CPS_ADJ_NHWHITE_RATE, CPS_ADJ_NHBLACK_RATE, CPS_ADJ_HISPANIC_RATE, CPS_ADJ_OTHER_RATE
============
Race and Hispanic ethnicity turnout RATEs calculated from various Current Population Survey Voting and Registration Supplements (CPS) using PTDTRACE (race) and PEHSPNON (Hispanic ethnicity), available at:
https://www.census.gov/data/datasets/time-series/demo/cps/cps-supp_cps-repwgt/cps-voting.html

These survey data are ADJusted, or reweighted, using a procedure proposed by Hur and Achen (2013), which reweights the CPS state-level turnout rates to equal the voting-eligible population turnout rates, described at:
https://academic.oup.com/poq/article/77/4/985/1843466

NHWHITE = Non-Hispanic White
NHBLACK = Non-Hispanic Black
HISPANIC = Hispanic
OTHER = All other non-Hispanic races (calculated this way primarily due to small sample sizes)

============
CPS_ADJ_NHWHITE_SHARE, CPS_ADJ_NHBLACK_SHARE, CPS_ADJ_HISPANIC_SHARE, CPS_ADJ_OTHER_SHARE
============
Same groups as the prior RATE variables, except the weight-adjusted SHARE of the electorate (each group's share of all voters) instead of the turnout rate.

============
CPS_UNADJ_NHWHITE_RATE, CPS_UNADJ_NHBLACK_RATE, CPS_UNADJ_HISPANIC_RATE, CPS_UNADJ_OTHER_RATE
============
Same as the prior turnout RATE, but using the UNADJusted Census Bureau weight (usually PWSSWGT, but may vary).
These statistics are provided primarily as a verification check against Census Bureau reports available at:
https://www.census.gov/data/datasets/time-series/demo/cps/cps-supp_cps-repwgt/cps-voting.html

============
CPS_UNADJ_NHWHITE_SHARE, CPS_UNADJ_NHBLACK_SHARE, CPS_UNADJ_HISPANIC_SHARE, CPS_UNADJ_OTHER_SHARE
============
Same as prior, except the unadjusted-weight SHARE of the electorate instead of the turnout rate.

============
CPS_ADJ_AGE1829_RATE, CPS_ADJ_AGE3044_RATE, CPS_ADJ_AGE4559_RATE, CPS_ADJ_AGE60_RATE
============
Age turnout RATEs calculated from various Current Population Survey Voting and Registration Supplements (CPS) using PRTAGE (age), available at:
https://www.census.gov/data/datasets/time-series/demo/cps/cps-supp_cps-repwgt/cps-voting.html

These survey data are ADJusted, or reweighted, using a procedure proposed by Hur and Achen (2013), which reweights the CPS state-level turnout rates to equal the voting-eligible population turnout rates, described at:
https://academic.oup.com/poq/article/77/4/985/1843466

AGE1829 = Age 18 to 29
AGE3044 = Age 30 to 44
AGE4559 = Age 45 to 59
AGE60 = Age 60+

============
CPS_ADJ_AGE1829_SHARE, CPS_ADJ_AGE3044_SHARE, CPS_ADJ_AGE4559_SHARE, CPS_ADJ_AGE60_SHARE
============
Same as prior, except the weight-adjusted SHARE of the electorate instead of the turnout rate.

============
CPS_UNADJ_AGE1829_RATE, CPS_UNADJ_AGE3044_RATE, CPS_UNADJ_AGE4559_RATE, CPS_UNADJ_AGE60_RATE
============
Same as the prior turnout RATE, but using the UNADJusted Census Bureau weight (usually PWSSWGT, but may vary).
These statistics are provided primarily as a verification check against Census Bureau reports available at:
https://www.census.gov/data/datasets/time-series/demo/cps/cps-supp_cps-repwgt/cps-voting.html

============
CPS_UNADJ_AGE1829_SHARE, CPS_UNADJ_AGE3044_SHARE, CPS_UNADJ_AGE4559_SHARE, CPS_UNADJ_AGE60_SHARE
============
Same as prior, except the unadjusted-weight SHARE of the electorate instead of the turnout rate.

============
CPS_ADJ_NOHS_RATE, CPS_ADJ_HS_RATE, CPS_ADJ_COLLEGE_RATE, CPS_ADJ_POST_RATE
============
Education turnout RATEs calculated from various Current Population Survey Voting and Registration Supplements (CPS) using PEEDUCA (education), available at:
https://www.census.gov/data/datasets/time-series/demo/cps/cps-supp_cps-repwgt/cps-voting.html

These survey data are ADJusted, or reweighted, using a procedure proposed by Hur and Achen (2013), which reweights the CPS state-level turnout rates to equal the voting-eligible population turnout rates, described at:
https://academic.oup.com/poq/article/77/4/985/1843466

NOHS = Not a high school graduate
HS = High school graduate
COLLEGE = Any college, including college graduates
POST = Post-undergraduate degree

============
CPS_ADJ_NOHS_SHARE, CPS_ADJ_HS_SHARE, CPS_ADJ_COLLEGE_SHARE, CPS_ADJ_POST_SHARE
============
Same as prior, except the weight-adjusted SHARE of the electorate instead of the turnout rate.

============
CPS_UNADJ_NOHS_RATE, CPS_UNADJ_HS_RATE, CPS_UNADJ_COLLEGE_RATE, CPS_UNADJ_POST_RATE
============
Same as the prior turnout RATE, but using the UNADJusted Census Bureau weight (usually PWSSWGT, but may vary).
These statistics are provided primarily as a verification check against Census Bureau reports available at:
https://www.census.gov/data/datasets/time-series/demo/cps/cps-supp_cps-repwgt/cps-voting.html

============
CPS_UNADJ_NOHS_SHARE, CPS_UNADJ_HS_SHARE, CPS_UNADJ_COLLEGE_SHARE, CPS_UNADJ_POST_SHARE
============
Same as prior, except the unadjusted-weight SHARE of the electorate instead of the turnout rate.
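
The ADJ variables rest on rescaling survey weights within each state so that the weighted CPS turnout rate matches the voting-eligible population (VEP) turnout rate. The sketch below illustrates that arithmetic in base R for a single hypothetical state. The data frame `toy`, its columns, the group labels, and the 0.60 VEP rate are all made up for illustration and are not CPS values. It is a simplified illustration of the weight-correction idea, not the full procedure.

```r
# Toy respondents from one hypothetical state (all values are invented).
toy <- data.frame(
  group = c("A", "A", "A", "B", "B", "B", "B", "B", "B", "B"),
  voted = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE),
  w     = rep(1, 10)  # stand-in for the survey weight (e.g. PWSSWGT)
)
vep_rate <- 0.60      # assumed VEP turnout rate for this state

# Turnout rate implied by the original survey weights (7 of 10 report voting).
cps_rate <- weighted.mean(toy$voted, toy$w)

# Scale voters by vep_rate / cps_rate and nonvoters by
# (1 - vep_rate) / (1 - cps_rate), so the weighted rate equals vep_rate.
toy$w_adj <- toy$w * ifelse(toy$voted,
                            vep_rate / cps_rate,
                            (1 - vep_rate) / (1 - cps_rate))

adj_rate <- weighted.mean(toy$voted, toy$w_adj)

# RATE: weighted turnout within each group.
rate_by_group <- sapply(split(toy, toy$group),
                        function(d) weighted.mean(d$voted, d$w_adj))

# SHARE: each group's weighted share of all voters.
voters <- toy[toy$voted, ]
share_by_group <- tapply(voters$w_adj, voters$group, sum) / sum(voters$w_adj)

print(c(cps_rate = cps_rate, adj_rate = adj_rate))
print(rate_by_group)
print(share_by_group)
```

RATE and SHARE answer different questions: RATE divides a group's weighted voters by that group's weighted voters plus nonvoters, while SHARE divides a group's weighted voters by all weighted voters, so the SHARE values sum to 1 across groups.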

============
REPLICATION R CODE
============

library(tidyverse)
library(survey)

turnout_2024 <- read_csv("https://election.lab.ufl.edu/data-downloads/turnoutdata/Turnout_2024G_v0.3.csv")

# remove US (not necessary given join below, but good practice) and convert
# VEP_TURNOUT_RATE from a percent string to a proportion
turnout_2024 <- turnout_2024 |>
  filter(STATE_ABV != "US") |>
  mutate(VEP_TURNOUT_RATE = as.numeric(gsub("%", "", VEP_TURNOUT_RATE)) / 100)

cps_2024 <- read_csv("")
fipscodes <- read_tsv("") # tab-delimited state/FIPS table, see below

cps_2024 <- left_join(cps_2024, fipscodes, by = "GESTFIPS")
cps_2024 <- left_join(cps_2024, turnout_2024, by = "STATE_ABV")

###################
# Demographic definitions
###################

# race_cat4: 1 = non-Hispanic White, 2 = non-Hispanic Black, 3 = Hispanic, 4 = other
# age4:      0 = 18 to 29, 1 = 30 to 44, 2 = 45 to 59, 3 = 60+
# educ:      1 = NOHS, 2 = HS, 3 = COLLEGE, 4 = POST
cps_2024 <- cps_2024 |>
  mutate(race_cat4 = case_when(PTDTRACE == 1 & PEHSPNON == 2 ~ 1,
                               (PTDTRACE %in% c(2,6,10,11,12,16,17,18,22,23)) & PEHSPNON == 2 ~ 2,
                               PEHSPNON == 1 ~ 3,
                               PTDTRACE == -1 ~ NA_real_,
                               TRUE ~ 4)) |>
  mutate(age4 = case_when(PRTAGE >= 18 & PRTAGE < 30 ~ 0,
                          PRTAGE >= 30 & PRTAGE < 45 ~ 1,
                          PRTAGE >= 45 & PRTAGE < 60 ~ 2,
                          PRTAGE >= 60 ~ 3,
                          TRUE ~ NA_real_)) |>
  mutate(educ = case_when(PEEDUCA == -1 ~ NA_real_,
                          PEEDUCA >= 31 & PEEDUCA < 39 ~ 1,
                          PEEDUCA == 39 ~ 2,
                          PEEDUCA >= 40 & PEEDUCA <= 43 ~ 3,
                          PEEDUCA >= 44 ~ 4))

# the CPS final weight has four implied decimals
cps_2024 <- cps_2024 |>
  mutate(PWSSWGT = PWSSWGT/10000)

# missing PES1 responses (-3, -2, -1) are counted as DID_NOT_VOTE here;
# the Hur and Achen section below sets them to NA instead
cps_2024 <- cps_2024 |>
  mutate(VOTE = case_when(PES1 %in% c(-3, -2, -1, 2) ~ "DID_NOT_VOTE",
                          PES1 == 1 ~ "VOTED",
                          TRUE ~ NA_character_))

###################
# Use Census weight (PWSSWGT) and calculate state turnout rates
# Run this to verify that the code reproduces the Census Bureau CPS figures
###################

cps_2024$VOTE <- factor(cps_2024$VOTE)
cps_2024$GESTFIPS <- factor(cps_2024$GESTFIPS)

cps_2024_clean <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(VOTE) & !is.na(GESTFIPS) & !is.na(PWSSWGT))

design <- svydesign(ids = ~1, weights = ~PWSSWGT, data = cps_2024_clean)

cps_state_turnout_rate <- svyby(~VOTE, ~GESTFIPS, design, svymean)

###################
# Hur and Achen CPS weight correction
###################

# recode PES1 to vote_missing such that PES1 missing data is NA, 2 is did not vote, 1 is voted
cps_2024 <- cps_2024 |>
  mutate(vote_missing = case_when(PES1 %in% c(-3, -2, -1) ~ NA_character_,
                                  PES1 == 1 ~ "VOTED",
                                  PES1 == 2 ~ "DID_NOT_VOTE",
                                  TRUE ~ NA_character_))

# get CPS state turnout rates excluding missing data and using CPS weight PWSSWGT
cps_2024_clean <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(vote_missing) & !is.na(GESTFIPS) & !is.na(PWSSWGT))

# set factors for survey calculations
cps_2024_clean$vote_missing <- factor(cps_2024_clean$vote_missing)
cps_2024_clean$GESTFIPS <- factor(cps_2024_clean$GESTFIPS)

design <- svydesign(ids = ~1, weights = ~PWSSWGT, data = cps_2024_clean)

cps_state_turnout_rate <- svyby(~vote_missing, ~GESTFIPS, design, svymean)

cps_state_turnout_rate <- cps_state_turnout_rate |>
  mutate(GESTFIPS = as.numeric(as.character(GESTFIPS)))

# merge CPS state turnout rates back to CPS.
# Needed for the re-weight calculation.
# GESTFIPS in cps_2024 was converted to a factor above; convert it back to numeric
# so that the join keys have matching types
cps_2024 <- cps_2024 |>
  mutate(GESTFIPS = as.numeric(as.character(GESTFIPS)))
cps_2024 <- left_join(cps_2024, cps_state_turnout_rate, by = "GESTFIPS")

# calculate Hur and Achen weights:
# voters are scaled by VEP rate / CPS rate, nonvoters by (1 - VEP rate) / (1 - CPS rate)
cps_2024_reweighted <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(vote_missing) & !is.na(GESTFIPS) & !is.na(PWSSWGT)) |>
  mutate(weight_correction = case_when(vote_missing == "VOTED" ~ VEP_TURNOUT_RATE/vote_missingVOTED,
                                       vote_missing == "DID_NOT_VOTE" ~ (1-VEP_TURNOUT_RATE)/(1-vote_missingVOTED))) |>
  mutate(weight_correction = weight_correction * PWSSWGT)

# set factors for survey calculations
cps_2024_reweighted$vote_missing <- factor(cps_2024_reweighted$vote_missing)
cps_2024_reweighted$GESTFIPS <- factor(cps_2024_reweighted$GESTFIPS)

design <- svydesign(ids = ~1, weights = ~weight_correction, data = cps_2024_reweighted)

cps_state_turnout_rate_reweighted <- svyby(~vote_missing, ~GESTFIPS, design, svymean)

###################
# Demographic calculations
###################

# Race
###################

# Hur and Achen weights

# turnout rate
cps_2024_reweighted_clean <- cps_2024_reweighted |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(vote_missing) & !is.na(weight_correction))

design <- svydesign(ids = ~1, weights = ~weight_correction, data = cps_2024_reweighted_clean)

cps_race_rate_reweighted <- svyby(~vote_missing, ~race_cat4, design, svymean)

# share of electorate
cps_2024_reweighted_clean <- cps_2024_reweighted |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(vote_missing) & !is.na(weight_correction)) |>
  filter(!is.na(race_cat4)) |>
  filter(vote_missing == "VOTED") |>
  mutate(race_cat4 = as.factor(race_cat4))

design <- svydesign(ids = ~1, weights = ~weight_correction, data = cps_2024_reweighted_clean)

cps_race_share_reweighted <- as.data.frame(prop.table(svytable(~factor(race_cat4), design)))

# Census weights

# turnout rate
cps_2024_clean <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(VOTE) & !is.na(PWSSWGT))

design <- svydesign(ids = ~1, weights = ~PWSSWGT, data = cps_2024_clean)

cps_race_rate <- svyby(~VOTE, ~race_cat4, design, svymean)

# share of electorate
cps_2024_clean <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(VOTE) & !is.na(PWSSWGT)) |>
  filter(!is.na(race_cat4)) |>
  filter(VOTE == "VOTED") |>
  mutate(race_cat4 = as.factor(race_cat4))

design <- svydesign(ids = ~1, weights = ~PWSSWGT, data = cps_2024_clean)

cps_race_share <- as.data.frame(prop.table(svytable(~factor(race_cat4), design)))

# Age
###################

# Hur and Achen weights

# turnout rate
cps_2024_reweighted_clean <- cps_2024_reweighted |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(vote_missing) & !is.na(weight_correction))

design <- svydesign(ids = ~1, weights = ~weight_correction, data = cps_2024_reweighted_clean)

cps_age_rate_reweighted <- svyby(~vote_missing, ~age4, design, svymean)

# share of electorate
cps_2024_reweighted_clean <- cps_2024_reweighted |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(vote_missing) & !is.na(weight_correction)) |>
  filter(!is.na(age4)) |>
  filter(vote_missing == "VOTED") |>
  mutate(age4 = as.factor(age4))

design <- svydesign(ids = ~1, weights = ~weight_correction, data = cps_2024_reweighted_clean)

cps_age_share_reweighted <- as.data.frame(prop.table(svytable(~factor(age4), design)))

# Census weights

# turnout rate
cps_2024_clean <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(VOTE) & !is.na(PWSSWGT))

design <- svydesign(ids = ~1, weights = ~PWSSWGT, data = cps_2024_clean)

cps_age_rate <- svyby(~VOTE, ~age4, design, svymean)

# share of electorate
cps_2024_clean <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(VOTE) & !is.na(PWSSWGT)) |>
  filter(!is.na(age4)) |>
  filter(VOTE == "VOTED") |>
  mutate(age4 = as.factor(age4))

design <- svydesign(ids = ~1, weights = ~PWSSWGT, data = cps_2024_clean)

cps_age_share <- as.data.frame(prop.table(svytable(~factor(age4), design)))

# Education
###################

# Hur and Achen weights

# turnout rate
cps_2024_reweighted_clean <- cps_2024_reweighted |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(vote_missing) & !is.na(weight_correction))

design <- svydesign(ids = ~1, weights = ~weight_correction, data = cps_2024_reweighted_clean)

cps_educ_rate_reweighted <- svyby(~vote_missing, ~educ, design, svymean)

# share of electorate
cps_2024_reweighted_clean <- cps_2024_reweighted |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(vote_missing) & !is.na(weight_correction)) |>
  filter(!is.na(educ)) |>
  filter(vote_missing == "VOTED") |>
  mutate(educ = as.factor(educ))

design <- svydesign(ids = ~1, weights = ~weight_correction, data = cps_2024_reweighted_clean)

cps_educ_share_reweighted <- as.data.frame(prop.table(svytable(~factor(educ), design)))

# Census weights

# turnout rate
cps_2024_clean <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(VOTE) & !is.na(PWSSWGT))

design <- svydesign(ids = ~1, weights = ~PWSSWGT, data = cps_2024_clean)

cps_educ_rate <- svyby(~VOTE, ~educ, design, svymean)

# share of electorate
cps_2024_clean <- cps_2024 |>
  filter(PRTAGE >= 18 & PRCITSHP %in% c(1,2,3,4)) |> # this filter not necessary, but keep for good practice
  filter(!is.na(VOTE) & !is.na(PWSSWGT)) |>
  filter(!is.na(educ)) |>
  filter(VOTE == "VOTED") |>
  mutate(educ = as.factor(educ))

design <- svydesign(ids = ~1, weights = ~PWSSWGT, data = cps_2024_clean)

cps_educ_share <- as.data.frame(prop.table(svytable(~factor(educ), design)))

============
STATE ABBREVIATIONS to FIPSCODES
============
Recommend cutting and pasting this table into a text file and reading it in as a tab-delimited file.

STATE_ABV	GESTFIPS
AL	1
AK	2
AZ	4
AR	5
CA	6
CO	8
CT	9
DE	10
DC	11
FL	12
GA	13
HI	15
ID	16
IL	17
IN	18
IA	19
KS	20
KY	21
LA	22
ME	23
MD	24
MA	25
MI	26
MN	27
MS	28
MO	29
MT	30
NE	31
NV	32
NH	33
NJ	34
NM	35
NY	36
NC	37
ND	38
OH	39
OK	40
OR	41
PA	42
RI	44
SC	45
SD	46
TN	47
TX	48
UT	49
VT	50
VA	51
WA	53
WV	54
WI	55
WY	56
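
As a minimal sketch of the recommendation above, the following base R snippet reads a tab-delimited state/FIPS table. To stay self-contained it writes only the first three rows to a temporary file; the temporary path and the name `fips_path` are assumptions for illustration, and in practice the path would point to the text file holding the full pasted table.

```r
# Write a few rows of the state/FIPS table to a temporary tab-delimited file.
# In practice, paste the full table into a text file and use that path instead.
fips_path <- tempfile(fileext = ".txt")
writeLines(c("STATE_ABV\tGESTFIPS",
             "AL\t1",
             "AK\t2",
             "AZ\t4"),
           fips_path)

# Read it back as tab-delimited; header = TRUE takes column names from the first row.
fipscodes <- read.delim(fips_path, header = TRUE, sep = "\t",
                        stringsAsFactors = FALSE)

# Inspect column names and types before joining on GESTFIPS.
str(fipscodes)
```

With the tidyverse loaded, `readr::read_tsv(fips_path)` is an equivalent way to read the same file. Either way, GESTFIPS should come in as a number so that it matches the numeric GESTFIPS column in the CPS file when joining.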