---
title: "Dataset catalogue"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Dataset catalogue}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = requireNamespace("morie", quietly = TRUE)
)
```

# Overview

MORIE ships a portable SQLite layer with dozens of built-in datasets
covering Canadian carceral, police, oversight, and public-health
surveillance corpora. This vignette shows how to discover and load
them from R.

# Browsing the catalogue

```{r catalog, eval = FALSE}
library(morie)
catalog <- morie_dataset_catalog()
head(catalog)
```

Each row of the returned data frame describes one dataset: identifier
(e.g. `otis-2025`, `cpads-2122`), source, year, number of rows, and a
short description.

# Per-dataset detail

```{r info, eval = FALSE}
morie_dataset_info("cpads-2122")
```

`morie_dataset_info()` returns a list with the variable names, labels,
value codes, citation, and any data-acknowledgment disclaimer required
by the original publisher.

# Loading data

```{r load, eval = FALSE}
df <- morie_load_dataset("cpads-2122")
dim(df)
```

`morie_load_dataset()` returns a tibble. Public-use datasets shipped
inside the package require no further configuration.

# Configuring local + remote backends

For datasets backed by external SQLite mirrors:

- Set `MORIE_LOCAL_DB_DIR` to a directory of `.sqlite` files for fast
  offline access.
- Set `MORIE_REMOTE_URL` to an HTTP SQL-over-REST endpoint for
  network-only exploration.

# Statistics Canada / Health Canada data acknowledgment

Several datasets in the catalogue are derived from Statistics Canada
and Health Canada PUMFs (CCS, CSADS, CSUS, CADS, CPADS). The standard
disclaimer applies: although the analyses use Statistics Canada /
Health Canada data, the analyses, interpretations, and conclusions are
those of the analyst and do not represent the views of either agency.

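
The two environment variables named under "Configuring local + remote backends" can be set from inside an R session. The chunk below is a minimal sketch using only base R; the directory name and the URL are hypothetical placeholders, not locations that MORIE provides, and both should be replaced with your own values.

```{r backends, eval = FALSE}
# Hypothetical locations: replace both values with your own mirror
# directory and endpoint.
Sys.setenv(
  MORIE_LOCAL_DB_DIR = file.path(path.expand("~"), "morie-sqlite"),
  MORIE_REMOTE_URL   = "https://example.org/morie/sql"
)

# Confirm what the current session will see.
Sys.getenv(c("MORIE_LOCAL_DB_DIR", "MORIE_REMOTE_URL"))
```

To keep the settings across sessions, the same `NAME=value` pairs can go in `~/.Renviron`. This vignette does not say whether the package reads the variables when it is attached or on each call, so setting them before `library(morie)` is the cautious order.
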

# Where to go next

- The `intro` vignette uses `morie_load_dataset()` end-to-end.
- The `cpads-canonicalization` vignette covers the CPADS column
  contract and `morie_canonicalize_cpads_data()` helpers.