Getting started with MORIE in R

Overview

MORIE is a multi-domain scientific-computing toolkit with parallel Python and R packages. The R package mirrors a substantial subset of the Python package, focused on the surfaces that are most useful from within an R workflow: dataset loading, causal estimators, survey sampling and weighting, basic spectral analysis, and helpers for the MRM (McNamara–Ruhela–Medina) framework that is MORIE’s primary sociolegal-data application.

This vignette walks through a minimal end-to-end session: load the package, look at the bundled dataset catalogue, load one dataset, and run an average-treatment-effect estimator on a small synthetic example. A second vignette (mrm-otis-walkthrough) covers the MRM ten-estimator ensemble on OTIS provincial data.

Loading the package

library(rmorie)

The dataset catalogue

morie_dataset_catalog() returns a data frame summarising every dataset bundled with the package or accessible via the package’s loaders. This is the easiest way to discover what’s available without leaving the R session.

catalog <- morie_dataset_catalog()
head(catalog)

For details on a single dataset (variables, source, citation), use morie_dataset_info():

morie_dataset_info("cpads-2122")

Loading a dataset

morie_load_dataset() returns a tibble (or data frame) for any dataset in the catalogue. Public-use datasets that ship inside the package require no further configuration; for datasets backed by remote SQLite mirrors, configure MORIE_LOCAL_DB_DIR (local directory of .sqlite files) or MORIE_REMOTE_URL (HTTP endpoint).

df <- morie_load_dataset("cpads-2122")
dim(df)
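As a minimal sketch of the configuration step, and assuming MORIE_LOCAL_DB_DIR and MORIE_REMOTE_URL are ordinary environment variables read by the loaders (the text above names them but does not say when they are read), they can be set from within the session using base R. The directory and URL below are placeholders, not real MORIE locations.

```r
# Hypothetical local directory that would hold the .sqlite files.
db_dir <- file.path(tempdir(), "morie-sqlite")
dir.create(db_dir, showWarnings = FALSE, recursive = TRUE)

# Point the loaders at the local directory (assumed to be an environment variable).
Sys.setenv(MORIE_LOCAL_DB_DIR = db_dir)

# Alternative: an HTTP endpoint. The URL is a placeholder, so it is left commented out.
# Sys.setenv(MORIE_REMOTE_URL = "https://example.org/morie")

# Confirm what the session currently sees.
Sys.getenv("MORIE_LOCAL_DB_DIR")
nzchar(Sys.getenv("MORIE_REMOTE_URL"))  # FALSE unless the variable is set
```

Setting the variables before the first call to morie_load_dataset() is the cautious ordering. For a setting that persists across sessions, the same name-value pairs can go in a user-level .Renviron file.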

A simple ATE estimate

If you already have a dataset with treatment, outcome, and covariate columns, the estimators work on any tibble or data frame. Column names are passed as arguments, so there is no hard-coded naming convention. The example below is fully synthetic and runs without any external data.

set.seed(2026)
n <- 500
X1 <- rnorm(n)
X2 <- rnorm(n)
# Confounded treatment assignment.
treat <- as.integer(plogis(0.5 * X1 - 0.3 * X2) > runif(n))
# Outcome with a true ATE of +1.0 plus covariate effects.
y <- 1.0 * treat + 0.7 * X1 - 0.2 * X2 + rnorm(n, sd = 0.5)

df_synth <- data.frame(y = y, treat = treat, X1 = X1, X2 = X2)
result <- morie_estimate_ate(
  data       = df_synth,
  outcome    = "y",
  treatment  = "treat",
  covariates = c("X1", "X2")
)
print(result)
#> $ate
#> [1] 0.9526445
#> 
#> $se
#> [1] 0.09478822
#> 
#> $ci_lower
#> [1] 0.7668596
#> 
#> $ci_upper
#> [1] 1.138429
#> 
#> $n
#> [1] 500
#> 
#> $ess
#> [1] 462.4943

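The returned object is a list. As printed above, it holds the point estimate (ate), its standard error (se), the confidence-interval bounds (ci_lower, ci_upper), the sample size (n), and the effective sample size (ess). It follows the RichResult-compatible structure described in the Python package paper.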
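The printed interval can be checked by hand. The sketch below assumes a symmetric normal-approximation interval with multiplier 1.96. That assumption is inferred from the printed numbers rather than stated by the package. The values of ate and se are copied from the output above.

```r
# Values copied from the printed result above.
ate <- 0.9526445
se  <- 0.09478822

# Assumed multiplier for a 95% normal-approximation interval.
z <- 1.96

ci <- c(lower = ate - z * se, upper = ate + z * se)
printed <- c(lower = 0.7668596, upper = 1.138429)

# Largest absolute gap between the recomputed and printed bounds.
max_gap <- max(abs(ci - printed))
print(ci)
print(max_gap)
```

The recomputed bounds agree with the printed ci_lower and ci_upper to within the rounding of the displayed digits, which is consistent with the bounds being ate ± 1.96 × se.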

Companion estimators

morie_estimate_att(), morie_estimate_atc(), and morie_estimate_aipw() follow the same calling convention. The augmented inverse-probability-weighting (AIPW) estimator, morie_estimate_aipw(), is doubly robust: it remains consistent if either the propensity model or the outcome model is correctly specified, even when the other is not.

result_aipw <- morie_estimate_aipw(
  data       = df_synth,
  outcome    = "y",
  treatment  = "treat",
  covariates = c("X1", "X2")
)
print(result_aipw)
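To make the doubly robust construction concrete, here is a base-R sketch of a textbook AIPW estimate on the same synthetic design. It is not necessarily what morie_estimate_aipw() does internally. The logistic propensity model, the per-arm linear outcome models, and every object name below are assumptions chosen for illustration.

```r
# Recreate the synthetic data so the block is self-contained.
set.seed(2026)
n  <- 500
X1 <- rnorm(n)
X2 <- rnorm(n)
treat <- as.integer(plogis(0.5 * X1 - 0.3 * X2) > runif(n))
y  <- 1.0 * treat + 0.7 * X1 - 0.2 * X2 + rnorm(n, sd = 0.5)
df_synth <- data.frame(y = y, treat = treat, X1 = X1, X2 = X2)

# Unadjusted difference in means, shown only for comparison;
# it ignores the confounding through X1 and X2.
naive <- mean(df_synth$y[df_synth$treat == 1]) -
  mean(df_synth$y[df_synth$treat == 0])

# Propensity model: P(treat = 1 | X1, X2), assumed logistic.
ps_fit <- glm(treat ~ X1 + X2, data = df_synth, family = binomial())
e_hat  <- fitted(ps_fit)

# Outcome models fitted separately within each arm, assumed linear.
fit1 <- lm(y ~ X1 + X2, data = df_synth, subset = treat == 1)
fit0 <- lm(y ~ X1 + X2, data = df_synth, subset = treat == 0)
m1 <- predict(fit1, newdata = df_synth)
m0 <- predict(fit0, newdata = df_synth)

# AIPW score: outcome-model contrast plus inverse-probability-weighted residuals.
phi <- (m1 - m0) +
  df_synth$treat * (df_synth$y - m1) / e_hat -
  (1 - df_synth$treat) * (df_synth$y - m0) / (1 - e_hat)

ate_aipw <- mean(phi)
se_aipw  <- sd(phi) / sqrt(n)  # standard error from the empirical score variance

print(c(naive = naive, aipw = ate_aipw, se = se_aipw))
```

Because both working models match the data-generating process here, the AIPW estimate should land near the true effect of 1.0, while the unadjusted difference in means would typically be pulled away from it by the confounding through X1 and X2.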

Where to go next

  • The mrm-otis-walkthrough vignette demonstrates the ten-estimator MRM ensemble on Ontario OTIS provincial restrictive-confinement microdata.
  • The MORIE package paper describes the wider scope of the toolkit beyond R: signal processing, cryptography, spatial statistics, statistical-physics-of-crime models, psychometrics, and the full Python interface.
  • Citation: see citation("rmorie").