---
title: "Auditing a system for group disparity"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Auditing a system for group disparity}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

## What this vignette covers

morie 0.8.0 adds a **fairness & disparity-audit** subsystem. It does not
build or deploy a predictive-policing or risk-assessment system — it
*measures* whether an existing one treats demographic groups
differently, so the system can be held accountable.

This vignette walks through the R surface: the group-disparity metrics
and the predictive-policing calibration audit.

```{r}
library(morie)
```

## Group-disparity metrics

Suppose a risk-flagging system has flagged a set of individuals. Group
`"A"` is flagged at 100%, group `"B"` at only 60%.

```{r}
flagged <- c(rep(1, 5), 1, 1, 1, 0, 0)        # 10 individuals
race    <- c(rep("A", 5), rep("B", 5))

di <- morie_fairness_disparate_impact(flagged, race, privileged = "A")
di$value            # disparate-impact ratio
di$adverse_impact   # below the 0.80 four-fifths threshold?
```

A disparate-impact ratio below 0.80 is the standard legal indicator of
*adverse impact*. The demographic-parity gap reports the same disparity
as an additive difference:

```{r}
morie_fairness_demographic_parity(flagged, race, privileged = "A")$value
```

When ground-truth outcomes are available, `morie_fairness_equalized_odds`
audits the system's *error rates*, not just its flagging rates:

```{r}
truth <- c(1, 0, 1, 0, 1, 0, 1, 0)
pred  <- c(1, 0, 1, 0, 1, 1, 0, 1)
grp   <- c(rep("A", 4), rep("B", 4))
morie_fairness_equalized_odds(truth, pred, grp, privileged = "A")$violation
```

## Predictive-policing calibration audit

`morie_predpol_calibration_audit` generalises the SciencesPo
*Predictive-policing-Chicago* district analysis: it ranks areas by the
risk an algorithm *predicts* and by their *realised* outcome rate, then
tests whether the disagreement tracks the areas' demographics.

```{r}
areas        <- c("d1", "d2", "d3", "d4", "d5", "d6")
mean_risk    <- c(90, 80, 70, 30, 20, 10)   # the algorithm's ranking
outcome_rate <- c(10, 20, 30, 70, 80, 90)   # realised — the opposite
area_group   <- c("X", "X", "X", "Y", "Y", "Y")

audit <- morie_predpol_calibration_audit(areas, mean_risk, outcome_rate,
                                   area_group)
audit$group_rank_gap     # per-group mean rank gap
audit$interpretation
```

A positive rank gap means the algorithm ranks that group's areas *more*
dangerous than their realised outcomes warrant — the signature of
disparate over-policing.

## Further reading

* `?frns_metrics` — all six disparity metrics.
* `?frns_predpol` — the predictive-policing calibration audit.
* `?frns_temporal` — the multi-city temporal audit.

The methods are clean-room reimplementations from published
descriptions (IBM AIF360; the SciencesPo Predictive-policing-Chicago
project; Barman & Barman, arXiv:2603.18987; the COMPAS *XAI Stories*
audit).