--- title: "Auditing a system for group disparity" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Auditing a system for group disparity} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set(collapse = TRUE, comment = "#>") ``` ## What this vignette covers morie 0.8.0 adds a **fairness & disparity-audit** subsystem. It does not build or deploy a predictive-policing or risk-assessment system — it *measures* whether an existing one treats demographic groups differently, so the system can be held accountable. This vignette walks through the R surface: the group-disparity metrics and the predictive-policing calibration audit. ```{r} library(morie) ``` ## Group-disparity metrics Suppose a risk-flagging system has flagged a set of individuals. Group `"A"` is flagged at 100%, group `"B"` at only 60%. ```{r} flagged <- c(rep(1, 5), 1, 1, 1, 0, 0) # 10 individuals race <- c(rep("A", 5), rep("B", 5)) di <- morie_fairness_disparate_impact(flagged, race, privileged = "A") di$value # disparate-impact ratio di$adverse_impact # below the 0.80 four-fifths threshold? ``` A disparate-impact ratio below 0.80 is the standard legal indicator of *adverse impact*. The demographic-parity gap reports the same disparity as an additive difference: ```{r} morie_fairness_demographic_parity(flagged, race, privileged = "A")$value ``` When ground-truth outcomes are available, `morie_fairness_equalized_odds` audits the system's *error rates*, not just its flagging rates: ```{r} truth <- c(1, 0, 1, 0, 1, 0, 1, 0) pred <- c(1, 0, 1, 0, 1, 1, 0, 1) grp <- c(rep("A", 4), rep("B", 4)) morie_fairness_equalized_odds(truth, pred, grp, privileged = "A")$violation ``` ## Predictive-policing calibration audit `morie_predpol_calibration_audit` generalises the SciencesPo *Predictive-policing-Chicago* district analysis: it ranks areas by the risk an algorithm *predicts* and by their *realised* outcome rate, then tests whether the disagreement tracks the areas' demographics. ```{r} areas <- c("d1", "d2", "d3", "d4", "d5", "d6") mean_risk <- c(90, 80, 70, 30, 20, 10) # the algorithm's ranking outcome_rate <- c(10, 20, 30, 70, 80, 90) # realised — the opposite area_group <- c("X", "X", "X", "Y", "Y", "Y") audit <- morie_predpol_calibration_audit(areas, mean_risk, outcome_rate, area_group) audit$group_rank_gap # per-group mean rank gap audit$interpretation ``` A positive rank gap means the algorithm ranks that group's areas *more* dangerous than their realised outcomes warrant — the signature of disparate over-policing. ## Further reading * `?frns_metrics` — all six disparity metrics. * `?frns_predpol` — the predictive-policing calibration audit. * `?frns_temporal` — the multi-city temporal audit. The methods are clean-room reimplementations from published descriptions (IBM AIF360; the SciencesPo Predictive-policing-Chicago project; Barman & Barman, arXiv:2603.18987; the COMPAS *XAI Stories* audit).