---
title: "MRM walkthrough on OTIS provincial data"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{MRM walkthrough on OTIS provincial data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>",
  eval     = requireNamespace("morie", quietly = TRUE)
)
```

# What this vignette covers

The **MRM (Multilevel Reconciliation Methodology)** framework is a coordinated set
of ten causal estimators paired with a multi-source data layer for
Canadian carceral, police, and oversight data. This vignette takes the
restrictive-confinement microdata from the provincial Offender Tracking
Information System (OTIS; published by the Ontario Ministry of the
Solicitor General) as its example. It applies the ten-estimator ensemble
to a binary-treatment design on dataset `a01` and shows how to read the
resulting summary.

The mathematical foundations are developed in the companion paper
(Ruhela 2026, *The MRM Framework*; see `citation("morie")` for the
full bibentry).

# Loading OTIS

The OTIS data ship with the package. The `morie_load_dataset()` loader
hides the SQLite-backed indirection, so a dataset is requested by its
identifier alone.

```{r load-otis, eval = FALSE}
library(morie)
otis <- morie_load_dataset("otis-2025-a01")
str(otis)
```

# The canonical a01 design

For dataset `a01` the canonical design estimates the effect of
`T_high_ac` (a binary treatment derived from administrative-classification
flags) on `Y_vm_count` (a count of a specific in-confinement
observation), adjusting for the standard demographic covariate set.
The per-row MRM modules implement this design.

```{r design, eval = FALSE}
# Full ten-estimator ensemble on the canonical a01 design:
result <- morie_estimate_ate(
  data       = otis,
  outcome    = "Y_vm_count",
  treatment  = "T_high_ac",
  covariates = c("age", "sex", "region", "fiscal_year")
)
print(result)
```

The returned object summarises the ensemble's estimates, including IPW
(Hajek), AIPW (Robins--Rotnitzky--Zhao), g-computation,
propensity-score matching (1:1 nearest-neighbour and five-strata
subclassification), IRM-DML (Chernozhukov *et al.* 2018), PLR-DML, and
SuperLearner-stacked AIPW. A multi-SE comparison (pooled, clustered on
fiscal year, clustered on individual ID, and two-way clustered) is
reported alongside the primary IRM-DML estimate.
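
To make the first entry in that list concrete, the following is a minimal base-R sketch of a Hajek (normalised) IPW estimate of the average treatment effect. It does not reproduce the internals of `morie_estimate_ate()`, which this vignette does not show. The data are simulated, only `age` is used as a covariate, and the column names are borrowed from the `a01` design purely for readability.

```r
# Simulated data only: values are synthetic, names mirror the a01 design.
set.seed(1)
n <- 2000
age <- rnorm(n, mean = 35, sd = 10)

# Treatment assignment depends on age, which creates confounding.
T_high_ac <- rbinom(n, 1, plogis(-0.5 + 0.03 * (age - 35)))

# Count outcome with a treatment effect and an age effect.
Y_vm_count <- rpois(n, lambda = exp(0.2 + 0.4 * T_high_ac + 0.01 * (age - 35)))
sim <- data.frame(age, T_high_ac, Y_vm_count)

# Step 1: estimate propensity scores with a logistic regression.
ps_fit <- glm(T_high_ac ~ age, family = binomial(), data = sim)
e_hat  <- fitted(ps_fit)

# Step 2: inverse-probability weights for each arm.
w1 <- sim$T_high_ac / e_hat
w0 <- (1 - sim$T_high_ac) / (1 - e_hat)

# Step 3: Hajek estimator, i.e. weighted means normalised by the
# sum of the weights within each arm.
mu1 <- sum(w1 * sim$Y_vm_count) / sum(w1)
mu0 <- sum(w0 * sim$Y_vm_count) / sum(w0)

ate_hajek <- mu1 - mu0
ate_hajek
```

Normalising by the sum of the weights, rather than by `n`, keeps each arm's estimate inside the observed range of the outcome, which tends to stabilise the estimate when some propensity scores are close to 0 or 1.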

# Augmented IPW

```{r aipw, eval = FALSE}
result_aipw <- morie_estimate_aipw(
  data       = otis,
  outcome    = "Y_vm_count",
  treatment  = "T_high_ac",
  covariates = c("age", "sex", "region", "fiscal_year")
)
print(result_aipw)
```
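
For readers who want to see what an AIPW estimate combines, here is a minimal base-R sketch on simulated data. It is an illustration under simplifying assumptions (parametric nuisance models, no cross-fitting, a single covariate), not the implementation behind `morie_estimate_aipw()`. The variable names `trt`, `y`, and `sim` are hypothetical.

```r
# Simulated data only: values and variable names are illustrative.
set.seed(2)
n   <- 2000
age <- rnorm(n, mean = 35, sd = 10)
trt <- rbinom(n, 1, plogis(-0.5 + 0.03 * (age - 35)))
y   <- rpois(n, lambda = exp(0.2 + 0.4 * trt + 0.01 * (age - 35)))
sim <- data.frame(age, trt, y)

# Nuisance model 1: propensity score.
e_hat <- fitted(glm(trt ~ age, family = binomial(), data = sim))

# Nuisance model 2: outcome regressions fitted separately in each arm,
# then predicted for every row.
fit1 <- glm(y ~ age, family = poisson(), data = sim[sim$trt == 1, ])
fit0 <- glm(y ~ age, family = poisson(), data = sim[sim$trt == 0, ])
m1 <- predict(fit1, newdata = sim, type = "response")
m0 <- predict(fit0, newdata = sim, type = "response")

# AIPW score: outcome-model contrast plus inverse-probability-weighted
# residual corrections for each arm.
psi <- (m1 - m0) +
  sim$trt * (sim$y - m1) / e_hat -
  (1 - sim$trt) * (sim$y - m0) / (1 - e_hat)

ate_aipw <- mean(psi)
se_aipw  <- sd(psi) / sqrt(n)  # plug-in standard error from the score
c(estimate = ate_aipw, se = se_aipw)
```

The estimator is doubly robust: it remains consistent if either the propensity model or the outcome model is correctly specified, which is the usual motivation for preferring it over plain IPW or plain g-computation.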

# Aggregate companion: incidence-rate ratios

For aggregate (year-level) outcomes the analog is a Poisson or
negative-binomial GLM with cluster-robust standard errors. The MRM
framework reports both the per-row individual-level estimate (above)
and the aggregate IRR family in parallel; see the companion paper for
the formal aggregate-IRR notation.
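
As a rough illustration of the aggregate side, the sketch below fits a Poisson GLM with an exposure offset to invented year-level counts and reads off the incidence-rate ratio. The data frame `agg`, its `events` and `person_days` columns, and all numbers are hypothetical. The standard errors here are the model-based ones from `glm()`; cluster-robust errors would need an additional step, for example `sandwich::vcovCL()` if the sandwich package is available.

```r
# Synthetic year-level aggregates: every number is invented for illustration.
agg <- data.frame(
  fiscal_year = rep(2019:2024, times = 2),
  T_high_ac   = rep(c(0, 1), each = 6),
  events      = c(40, 44, 38, 47, 42, 45,
                  66, 70, 61, 74, 69, 72),
  person_days = c(1000, 1050,  980, 1100, 1020, 1060,
                  1000, 1040,  960, 1090, 1030, 1050)
)

# Poisson GLM for event counts; the log exposure enters as an offset so
# the coefficients describe rates per person-day.
fit <- glm(events ~ T_high_ac + offset(log(person_days)),
           family = poisson(), data = agg)

# The exponentiated treatment coefficient is the incidence-rate ratio.
irr <- exp(coef(fit)[["T_high_ac"]])

# Wald interval on the log scale, back-transformed (model-based SEs only).
irr_ci <- exp(confint.default(fit)["T_high_ac", ])

irr
irr_ci
```

With a single binary regressor and an offset, the fitted IRR coincides with the ratio of the two crude rates, which gives a quick sanity check on the model output.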

# Mandela classification

A separate Mandela-Rules classifier (UN Mandela Rules 43 and 44) is
applied at both the federal and provincial levels. The provincial
implementation uses a duration-only proxy, which is documented
explicitly in the framework paper. The federal counterpart analyses (SIU
IAP, Sprott--Doob--Iftene) live in the companion Python modules
`morie.tps_csi` and `morie.siu_iap`.
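
A duration-only proxy can be pictured with a few lines of base R. Rule 44 defines prolonged solitary confinement as confinement exceeding 15 consecutive days, so the sketch below flags placements on that threshold alone. The `placements` data frame and its columns are hypothetical, and the exact rule used by the provincial implementation is the one documented in the framework paper, which may differ from this sketch.

```r
# Hypothetical placement records: IDs and durations are invented.
placements <- data.frame(
  placement_id     = 1:6,
  consecutive_days = c(2, 15, 16, 30, 7, 45)
)

# Rule 44: "prolonged" means in excess of 15 consecutive days.
# Assumption: this sketch applies that threshold and nothing else.
prolonged_threshold_days <- 15

# Duration-only flag; conditions of confinement (hours per day,
# meaningful human contact) are not observed and not used.
placements$mandela_prolonged <-
  placements$consecutive_days > prolonged_threshold_days

table(placements$mandela_prolonged)
```

The comparison is strict, so a placement of exactly 15 days is not flagged. Because the proxy ignores the conditions of confinement, it should be read as an approximation to the Rule 44 definition rather than a full classification.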

# Where to go next

- The full MRM framework paper, including all ten estimators,
  multi-SE comparison, propensity calibration, and the
  Sprott--Doob--Iftene replication tables, is in the companion
  publication set (see `citation("morie")`).
- The MORIE package paper covers the broader toolkit.
- To cite the package or either paper, see `citation("morie")`.