CPADS canonicalization and analysis

Overview

The Canadian Postsecondary Education Alcohol and Drug Use Survey (CPADS) is one of the Statistics Canada public use microdata files (PUMFs) that MORIE supports out of the box. Because variable names, value codes, and survey weights differ across cycles, MORIE provides a canonical column contract and a morie_canonicalize_cpads_data() helper that harmonises cycles into a single analysis-ready tibble.

The CPADS column contract

library(rmorie)
contract <- morie_cpads_contract()
str(contract, max.level = 2)

morie_cpads_contract() returns the canonical names, value-code maps, and survey-weight columns. Using the contract is opt-in (the estimators in MORIE do not require it), but it lets you write analysis code once and run it unchanged across cycles.
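
To make the idea of a column contract concrete, here is a minimal base-R sketch of renaming and recoding one cycle's raw columns into canonical form. Everything in it is hypothetical: the toy_contract structure, the raw names DRK_30 and WT_PUMF, the 1/2 value codes, and the apply_toy_contract() helper are illustrative assumptions, not the actual shape of what morie_cpads_contract() returns.

```r
# Hypothetical contract, for illustration only. The real object returned by
# morie_cpads_contract() may be structured differently.
toy_contract <- list(
  # raw (cycle-specific) name -> canonical name
  rename = c(DRK_30 = "heavy_drinking_30d", WT_PUMF = "pumf_weight"),
  # canonical column -> lookup from raw code to canonical value
  value_codes = list(heavy_drinking_30d = c(`1` = 1L, `2` = 0L))
)

# Made-up raw data: 1 = yes, 2 = no in this imaginary cycle.
raw <- data.frame(DRK_30 = c(1, 2, 2, 1),
                  WT_PUMF = c(120.5, 80, 95.2, 110))

apply_toy_contract <- function(data, contract) {
  # Rename only the columns the contract knows about.
  hits <- names(data) %in% names(contract$rename)
  names(data)[hits] <- unname(contract$rename[names(data)[hits]])
  # Recode each mapped column through its value-code lookup.
  for (col in names(contract$value_codes)) {
    map <- contract$value_codes[[col]]
    data[[col]] <- unname(map[as.character(data[[col]])])
  }
  data
}

canonical <- apply_toy_contract(raw, toy_contract)
names(canonical)
canonical$heavy_drinking_30d
```

Keeping the name map and the value-code maps in one object is what allows downstream analysis code to refer only to canonical names, whichever cycle the data came from.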

Loading and canonicalising

raw_2122 <- morie_load_dataset("cpads-2122")
df       <- morie_canonicalize_cpads_data(raw_2122)

# Validate that all canonical columns are present and correctly
# typed. Returns silently if OK, or stops with a clear message
# pointing at the offending column.
morie_validate_cpads_data(df)
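
The "silent on success, stop on the offending column" behaviour described in the comment can be sketched in base R as follows. This is not MORIE's implementation: validate_toy() and the expected list of type predicates are hypothetical stand-ins that only show the pattern.

```r
# Hypothetical validator, for illustration only.
# `expected` maps each required column name to a predicate that checks its type.
validate_toy <- function(data, expected) {
  missing_cols <- setdiff(names(expected), names(data))
  if (length(missing_cols) > 0) {
    stop("Missing canonical column(s): ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  for (col in names(expected)) {
    if (!isTRUE(expected[[col]](data[[col]]))) {
      stop(sprintf("Column '%s' failed its type check (class: %s).",
                   col, class(data[[col]])[1]), call. = FALSE)
    }
  }
  invisible(TRUE)  # nothing is printed when validation passes
}

expected <- list(heavy_drinking_30d = is.numeric, pumf_weight = is.numeric)

good <- data.frame(heavy_drinking_30d = c(1, 0), pumf_weight = c(100, 250))
bad  <- data.frame(heavy_drinking_30d = c(1, 0), pumf_weight = c("100", "250"))

validate_toy(good, expected)  # passes silently

# Capture the error message instead of halting the script.
msg <- tryCatch(validate_toy(bad, expected),
                error = function(e) conditionMessage(e))
msg
```

Failing early with the column name in the message tends to be easier to debug than letting a mistyped weight column surface later as a cryptic arithmetic error.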

A simple analysis with weights

# CPADS ships PUMF weights in a column the contract surfaces.
# weighted.mean() normalises by the sum of the weights.
weighted_freq <- weighted.mean(df$heavy_drinking_30d, df$pumf_weight,
                               na.rm = TRUE)
weighted_freq
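
A weighted proportion is the weighted sum divided by the total weight, sum(w * x) / sum(w). The toy numbers below (made up for illustration, not CPADS data) show how that differs from averaging the products x * w, which leaves the result on the scale of the weights rather than between 0 and 1.

```r
# Toy data: a 0/1 indicator and survey weights (illustrative values only).
x <- c(1, 0, 0, 1)
w <- c(100, 300, 300, 100)

by_hand <- sum(x * w) / sum(w)   # 200 / 800 = 0.25
builtin <- weighted.mean(x, w)   # same quantity via base R
unnormalised <- mean(x * w)      # 200 / 4 = 50, not a proportion

# weighted.mean(na.rm = TRUE) drops missing values of x (with their weights),
# but a missing weight still yields NA, so filter on both when needed.
x_na <- c(1, NA, 0, 1)
w_na <- c(100, 300, NA, 100)
keep <- !is.na(x_na) & !is.na(w_na)
clean <- weighted.mean(x_na[keep], w_na[keep])  # (100 + 100) / 200 = 1
```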

Survey-weighted causal estimate

# Estimate the ATE of the canonical treatment on the
# canonical outcome, passing the CPADS PUMF weights:
ate <- morie_estimate_ate(df,
                    outcome    = "heavy_drinking_30d",
                    treatment  = "treat_canonical",
                    covariates = c("age", "sex", "region"),
                    weights    = "pumf_weight")
ate$estimate
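
This page does not describe how morie_estimate_ate() uses the weights internally, so the following is only a generic illustration of one way survey weights can enter an ATE point estimate: weighted regression adjustment (g-computation) in base R. The data are simulated, and the names sim, treat, y, w, and ate_hat are hypothetical. It should not be read as MORIE's estimator.

```r
set.seed(42)
n <- 500

# Simulated data with a true treatment effect of 0.5 (illustrative only).
sim <- data.frame(
  age   = rnorm(n, mean = 21, sd = 2),
  treat = rbinom(n, size = 1, prob = 0.5),
  w     = runif(n, min = 50, max = 150)   # stand-in for survey weights
)
sim$y <- 0.5 * sim$treat + 0.1 * sim$age + rnorm(n)

# Weighted outcome model with a treatment-by-covariate interaction.
fit <- lm(y ~ treat * age, data = sim, weights = w)

# Predict both potential outcomes for every respondent...
p1 <- predict(fit, newdata = transform(sim, treat = 1))
p0 <- predict(fit, newdata = transform(sim, treat = 0))

# ...and average the individual contrasts using the survey weights.
ate_hat <- weighted.mean(p1 - p0, sim$w)
ate_hat
```

This yields a point estimate only. Valid standard errors for complex survey data need design-based variance estimation (for example, bootstrap or replicate weights), which the sketch omits.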

Where to go next

  • The survey-weighted vignette covers complex sample designs (stratified, cluster, PPS), bootstrap CIs, and design effects.
  • The causal-inference vignette covers the full ATE / ATT / ATC / AIPW / CATE / GATE estimator family.
  • For Statistics Canada citation requirements, see the README’s data-acknowledgment block.