---
title: "MRM empirical callables (OTIS / TPS / SIU)"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{MRM empirical callables (OTIS / TPS / SIU)}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = requireNamespace("morie", quietly = TRUE)
)
```

# Overview

This vignette documents the `mrm_otis_*()`, `mrm_tps_*()`, and `mrm_siu_*()` empirical callables. Each function is a one-line entry point to a verified analysis used in the *MRM empirical paper* (Ruhela 2026, in preparation).

Every evaluated example below runs on the small reference samples bundled with the package, so the vignette is network-free. For the full datasets:

* OTIS → `morie_load_dataset("otisb01")` (downloads via CKAN on first call; subsequent calls hit the local SQLite cache)
* TPS → `morie_fetch_tps("Assault")` (ArcGIS REST)
* SIU → `morie_fetch_siu()` (on-demand scrape of public reports)

See `vignette("mrm-dataset-fetchers")` for the dataset side.

```{r load}
library(morie)
b01 <- morie_sample("otis_b01")
b09 <- morie_sample("otis_b09")
tps <- morie_sample("tps_assault")
```

# OTIS suite

## Placement-count concentration on `b09`

The b09 long-format file publishes counts of individuals in segregation for each (fiscal year × placement-count band × gender) cell. The callable expands the banded counts using band midpoints and returns the Hill-MLE Pareto exponent, the Gini coefficient, the mean placements per individual, and the top-k% concentration share.

```{r b09}
mrm_otis_placement_concentration(b09)
```

The values are computed *within fiscal year*: the OTIS `UniqueIndividual_ID` has the format `YYYY-XXXXX-SG` and is randomly reassigned every fiscal year, so cross-year tracking is invalid by design.

## Segregation-duration KM on `b01`

`NumberConsecutiveDays_Segregation` is the duration in days of each placement (no censoring: all durations are observed).
The callable reports the per-stratum mean, median, first quartile (q25), and the fraction above the UN Mandela 15-day cutoff.

```{r b01-duration}
mrm_otis_seg_duration_km(b01)
mrm_otis_seg_duration_km(b01, group_cols = "MentalHealth_Alert")
```

This callable replaces the misreading of `YYYY-XXXXX-SG` as a persistent person identifier, which produces a spurious cross-year "time-to-readmission" artifact.

## Mortification co-occurrence (alert columns)

The three b01 alert flags (`MentalHealth_Alert`, `SuicideRisk_Alert`, `SuicideWatch_Alert`) co-occur to a degree well above independence. The substantive figure is the Cramér's V for `MentalHealth × SuicideRisk`.

```{r mortification}
mrm_otis_mortification_cooccurrence(b01)
```

## Region locality

Ontario provincial seg/RC placement is overwhelmingly locality-preserving: over 95% of placements remain within the same region in the full b01.

```{r region, eval = FALSE}
# Region columns are present only in the full b01, not the bundled sample,
# so this chunk is not evaluated. To run it, first replace `b01` with the
# full dataset from morie_load_dataset("otisb01").
res <- mrm_otis_region_locality(b01)
print(res$table)
cat("diagonal share:", res$diagonal_share, " V:", res$morie_cramers_v, "\n")
```

## Mandela classification

`mrm_classify_mandela()` shipped in v0.1.14 and remains the canonical Mandela classifier in v0.2.0. It supports three operationalisations:

```{r mandela}
mrm_classify_mandela(b01, denominator = "row")             # per-placement
mrm_classify_mandela(b01, denominator = "individual_any")  # per-person
mrm_classify_mandela(b01, denominator = "individual_cumulative")
```

The provincial-canonical 12.5/16.5/20.6% torture rates require the `c11` aggregate (loaded via `morie_sample("otis_c11")`); see the MRM empirical paper §6.

# TPS suite

## Lévy-flight Hill exponent on inter-event step lengths

The callable treats all events, sorted chronologically, as a single stream and computes the haversine step length (km) between consecutive events.
It returns the Hill-MLE exponent restricted to steps above `min_step_km`.

```{r tps-levy}
mrm_tps_levy_scaling(tps)
```

## Moran's I + DBSCAN clustering

The callable grids the WGS84 extent into a coarse raster, counts events per cell, and computes the global Moran's I using a rook-contiguity weights matrix. It also runs DBSCAN on the raw latitude/longitude points (rescaled to km) to obtain cluster counts.

```{r tps-moran}
mrm_tps_moran_clustering(tps, grid_resolution = 20L)
```

For the high-precision computation on the full 254,378-event Assault file, use the `morie` Python `tps_spatial_advanced` pipeline; the R version is for quick interactive auditing.

## Neighbourhood inter-event recurrence

For each `HOOD_158` neighbourhood, the callable sorts events chronologically and computes the gap (in days) between consecutive events.

```{r tps-recur}
head(mrm_tps_neighbourhood_recurrence_km(tps))
```

## Hawkes manifest loader

`mrm_tps_load_hawkes_refit(path)` reads `paper_hawkes_refit.json` (the per-category Hawkes refit table from the MRM empirical paper §7.1-7.2) and returns it as a tidy `data.frame`. The reference manifest ships with the package and the loader defaults to it, so no `path` argument is needed.

# SIU suite

The SIU callables operate on the `SIU.csv` file produced by `morie_fetch_siu()` (an on-demand scraper of the public Director's Reports). The scraped corpus is not shipped, but the callables themselves do not depend on shipped data.

```{r siu, eval = FALSE}
siu_path <- morie_fetch_siu()
siu <- read.csv(siu_path)

res <- mrm_siu_case_to_decision_km(siu)
print(res$pooled)
head(res$by_service[order(-res$by_service$n), ])

mrm_siu_per_service_rate(siu)
mrm_siu_outcome_classifier(siu)
```

The verified pooled median in our test snapshot is **120 days from incident to Director's decision** (n = 1,711 cases). Per-service medians cluster tightly around 120 days, which points to a system-wide processing cadence rather than a per-jurisdiction effect.

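
The pooled and per-service medians described above can be illustrated with a minimal base-R sketch. This is not the implementation of `mrm_siu_case_to_decision_km()`: the toy data frame and its column names (`service`, `incident_date`, `decision_date`) are assumptions made for illustration and may not match the columns written by `morie_fetch_siu()`. The sketch also assumes every case has an observed decision date, so a plain empirical median stands in for any survival estimator.

```r
# Toy data only: the column names below are hypothetical and are not
# guaranteed to match the columns written by morie_fetch_siu().
incident_date <- as.Date("2021-01-01") + c(0, 10, 20, 30, 40, 50)

# Days from incident to decision, fixed by construction for this example.
offset_days <- c(100, 120, 140, 110, 118, 130)

siu_toy <- data.frame(
  service = rep(c("A", "B"), each = 3),
  incident_date = incident_date,
  decision_date = incident_date + offset_days
)

# Duration in days between the incident and the decision.
siu_toy$days <- as.numeric(
  difftime(siu_toy$decision_date, siu_toy$incident_date, units = "days")
)

# Pooled median across all cases, then one median per service.
pooled_median <- median(siu_toy$days)
by_service <- tapply(siu_toy$days, siu_toy$service, median)

pooled_median
by_service
```

Comparing `by_service` against `pooled_median` is the kind of check the paragraph above relies on: when the per-service medians sit close to the pooled value, the delay looks system-wide rather than service-specific.
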
# References

* MRM theoretical paper: Ruhela (2026), *MRM: Multilevel Reconciliation Methodology --- A Multi-Source Statistical Foundation for Canadian Carceral, Police, and Oversight Data*.
* MRM empirical paper: Ruhela (2026), *Solitary Confinement, Self-Excitation, and Institutional Churn: Empirical Applications of MRM to Canadian Carceral and Police Data*.
* OTIS data dictionary: `data.ontario.ca/dataset/data-on-inmates-in-ontario`.
* Toronto Police Open Data: `data.torontopolice.on.ca/`.
* SIU public Director's Reports: `siu.on.ca/en/case_directors_reports.php`.