IPW deep-dive (Hajek and Horvitz–Thompson)

Overview

Inverse-probability weighting (IPW) is the simplest of the singly robust causal estimators: it is consistent only when the propensity model is correctly specified. This vignette shows the building blocks that MORIE exposes: the Horvitz–Thompson and Hajek-stabilised IPW estimators, propensity-score modelling, and weight-trimming diagnostics.

Setting up

library(rmorie)
set.seed(2026)
n <- 500
X1 <- rnorm(n)
X2 <- rnorm(n)
ps_true <- plogis(0.4 * X1 - 0.3 * X2)
treat   <- as.integer(ps_true > runif(n))
y       <- 1.0 * treat + 0.6 * X1 - 0.2 * X2 + rnorm(n, sd = 0.5)
df <- data.frame(y = y, treat = treat, X1 = X1, X2 = X2)
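
The simulated outcome has a true ATE of 1.0 (the coefficient on treat), and both X1 and X2 drive treatment as well as the outcome. As a rough illustration of why some adjustment is needed, the base-R sketch below reruns the same data-generating process at a larger sample size (n = 1e5 is an assumption chosen only to make the bias visible) and compares the unadjusted difference in means with a regression-adjusted contrast:

```r
set.seed(2026)
n  <- 1e5   # large n so the confounding bias stands out from sampling noise
X1 <- rnorm(n); X2 <- rnorm(n)
treat <- as.integer(plogis(0.4 * X1 - 0.3 * X2) > runif(n))
y  <- 1.0 * treat + 0.6 * X1 - 0.2 * X2 + rnorm(n, sd = 0.5)

# Unadjusted contrast: absorbs the X1 and X2 imbalance between the arms.
naive <- mean(y[treat == 1]) - mean(y[treat == 0])

# Regression adjustment with the correct outcome model, for comparison.
adjusted <- unname(coef(lm(y ~ treat + X1 + X2))["treat"])

c(naive = naive, adjusted = adjusted)
```

Under this design the naive contrast is biased upward, because treated units tend to have higher X1 and lower X2, and both shifts raise y.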

Estimating propensities

morie_estimate_ate() fits a logistic propensity model internally and returns the IPW estimate by default. To inspect the propensities, fit the model yourself, store the fitted values in a column, and pass that column's name via propensity_col:

ps_fit <- glm(treat ~ X1 + X2, family = binomial(), data = df)
df$ps  <- predict(ps_fit, type = "response")
summary(df$ps)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>  0.1604  0.3727  0.4549  0.4580  0.5374  0.7422

Hajek-stabilised IPW

morie_estimate_ate() defaults to the Hajek estimator, which divides each arm's weighted sum of outcomes by that arm's sum of weights rather than by n. This normalisation usually lowers finite-sample variance when a few weights are large, although it does not remove the need for adequate overlap:

ate_hajek <- morie_estimate_ate(
  df,
  treatment = "treat", outcome = "y",
  covariates = c("X1", "X2"),
  propensity_col = "ps"
)
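# A NULL here most likely means the result has no element called "estimate";
# names(ate_hajek) shows which fields the object actually carries.
ate_hajek$estimate
#> NULL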
ate_hajek$se
#> [1] 0.08807045
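
To make the difference between the two estimators concrete, here is a minimal base-R sketch of both. The helper ipw_ate() is hypothetical and is not part of rmorie; whether morie_estimate_ate() computes exactly these quantities is an assumption. The data are regenerated inside the block so that it runs on its own.

```r
# Hand-rolled IPW estimators of the ATE (illustrative; not rmorie code).
ipw_ate <- function(y, treat, ps) {
  n  <- length(y)
  w1 <- treat / ps              # weights for treated units, 0 for controls
  w0 <- (1 - treat) / (1 - ps)  # weights for control units, 0 for treated
  c(
    # Horvitz-Thompson: weighted sums divided by the sample size n
    ht    = sum(w1 * y) / n - sum(w0 * y) / n,
    # Hajek: weighted sums divided by the sum of weights in each arm
    hajek = sum(w1 * y) / sum(w1) - sum(w0 * y) / sum(w0)
  )
}

# Same data-generating process as in the setup section; the true ATE is 1.
set.seed(2026)
n  <- 500
X1 <- rnorm(n); X2 <- rnorm(n)
treat <- as.integer(plogis(0.4 * X1 - 0.3 * X2) > runif(n))
y  <- 1.0 * treat + 0.6 * X1 - 0.2 * X2 + rnorm(n, sd = 0.5)
ps <- fitted(glm(treat ~ X1 + X2, family = binomial()))

est <- ipw_ate(y, treat, ps)
est

# Shifting every outcome by a constant leaves Hajek unchanged but moves HT.
shift <- ipw_ate(y + 100, treat, ps) - est
shift
```

The shift check shows one practical reason to prefer the Hajek form: it is invariant to adding a constant to the outcome, whereas the Horvitz–Thompson estimate changes whenever the weights in an arm do not sum exactly to n.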

Weight diagnostics

IPW is sensitive to propensities close to 0 or 1, because they translate into very large weights. Two common diagnostics:

# Effective sample size under the weights 1 / ps
# (these are the treated-arm weights; controls are weighted by 1 / (1 - ps))
ess <- morie_effective_sample_size(1 / df$ps)
ess
#> [1] 458.4318

# Range of the weights; extreme values are a signal to consider trimming
range(1 / df$ps)
#> [1] 1.347363 6.234294
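
For an ATE analysis it can also help to look at each arm separately, with 1 / ps for treated units and 1 / (1 - ps) for controls. The sketch below uses the Kish formula for the effective sample size; kish_ess() is a hypothetical helper, and whether morie_effective_sample_size() uses the same formula is an assumption. The propensities are stand-ins generated inside the block so that it runs on its own.

```r
# Kish effective sample size: (sum of weights)^2 / sum of squared weights.
kish_ess <- function(w) sum(w)^2 / sum(w^2)

set.seed(1)
n     <- 500
ps    <- plogis(rnorm(n, sd = 0.5))   # stand-in propensities
treat <- rbinom(n, 1, ps)

# ATE weights: 1 / ps for treated units, 1 / (1 - ps) for controls.
w <- ifelse(treat == 1, 1 / ps, 1 / (1 - ps))

# ESS within each arm, next to the raw arm sizes.
arm_n   <- tapply(w, treat, length)
arm_ess <- tapply(w, treat, kish_ess)
rbind(n = arm_n, ess = arm_ess)

# Largest share of an arm's total weight carried by a single unit.
tapply(w, treat, function(x) max(x) / sum(x))
```

An arm whose ESS is far below its raw size, or where one unit carries a large share of the total weight, is the arm driving the instability.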

If the effective sample size falls to a small fraction of n, consider:

  • Trimming propensities to a sensible interval (e.g. [0.05, 0.95])
  • Switching to a doubly-robust estimator (morie_estimate_aipw())
  • Revisiting the propensity model: covariates that predict treatment but not the outcome inflate the weights without reducing confounding bias
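
"Trimming to an interval" can be read in two ways: clipping extreme propensities to the bounds, or dropping the units that fall outside them. The base-R sketch below shows both; clip_ps() and keep_ps() are hypothetical helpers, not rmorie functions, and the bounds are the example values from the list above.

```r
# Two readings of "trimming to [lo, hi]" (illustrative helpers).
clip_ps <- function(ps, lo = 0.05, hi = 0.95) pmin(pmax(ps, lo), hi)
keep_ps <- function(ps, lo = 0.05, hi = 0.95) ps >= lo & ps <= hi

ps <- c(0.01, 0.04, 0.30, 0.50, 0.97)

clipped <- clip_ps(ps)   # extreme values pulled to the bounds, all units kept
kept    <- keep_ps(ps)   # logical mask for dropping units outside the bounds

range(1 / ps)        # largest weight before clipping: 1 / 0.01
range(1 / clipped)   # largest weight after clipping:  1 / 0.05
sum(kept)            # units retained if dropping instead
```

The two options are not equivalent. Clipping keeps every unit but biases the weights for the clipped ones, while dropping units changes the target population to the region of overlap.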

AIPW for protection against IPW failure

aipw <- morie_estimate_aipw(df, treatment = "treat", outcome = "y", covariates = c("X1", "X2"))
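# A NULL here most likely means the result has no element called "estimate";
# names(aipw) shows which fields the object actually carries.
aipw$estimate
#> NULL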

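When propensities are well-behaved and both working models are reasonable, IPW and AIPW should agree to within sampling error, so judge the gap between them against the reported standard errors. Disagreement is informative: it suggests misspecification of the propensity or outcome model, or weights unstable enough to dominate the IPW estimate.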
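
For readers who want to see what a doubly-robust estimate involves, here is a minimal base-R sketch of a textbook AIPW estimator with linear outcome models fitted per arm and a standard error from the influence-function values. aipw_ate() is a hypothetical helper; the models and variance formula that morie_estimate_aipw() uses internally are not confirmed here.

```r
# Hand-rolled AIPW for the ATE (illustrative; not rmorie's implementation).
aipw_ate <- function(data) {
  ps <- fitted(glm(treat ~ X1 + X2, family = binomial(), data = data))
  # Separate outcome regressions per arm, each predicted for every unit.
  m1 <- predict(lm(y ~ X1 + X2, data = data[data$treat == 1, ]), newdata = data)
  m0 <- predict(lm(y ~ X1 + X2, data = data[data$treat == 0, ]), newdata = data)
  # Outcome-model contrast plus inverse-probability-weighted residuals.
  phi <- (m1 - m0) +
    data$treat * (data$y - m1) / ps -
    (1 - data$treat) * (data$y - m0) / (1 - ps)
  c(estimate = mean(phi), se = sd(phi) / sqrt(nrow(data)))
}

# Same data-generating process as in the setup section; the true ATE is 1.
set.seed(2026)
n  <- 500
X1 <- rnorm(n); X2 <- rnorm(n)
treat <- as.integer(plogis(0.4 * X1 - 0.3 * X2) > runif(n))
y  <- 1.0 * treat + 0.6 * X1 - 0.2 * X2 + rnorm(n, sd = 0.5)
sim <- data.frame(y = y, treat = treat, X1 = X1, X2 = X2)

res <- aipw_ate(sim)
res
```

The estimate stays consistent if either the propensity model or the outcome models are correctly specified, which is the protection the section heading refers to.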

Where to go next

  • The causal-inference vignette covers ATT / ATC / CATE / GATE.
  • The survey-weighted vignette covers IPW under complex-sample designs (when survey weights and propensities both apply).