Changes in version 2026-05-25 (2026-05-25)

CI: drop fwildclusterboot (pak recursive Remotes unreliable) (3MMM.40c)

- Removed fwildclusterboot from Suggests and removed the .morie_did_have_fwildboot() helper + the if (.morie_did_have_fwildboot()) { fwildclusterboot::boottest(...) } branch in morie_did_wild_cluster_bootstrap(). The function now goes straight to the base-R Rademacher/Webb wild-cluster bootstrap (which already existed as the fallback and mirrors the Python implementation; no math change for any caller).
- Reason: pak's resolver does not reliably recurse through a Remote's own Remotes. 3MMM.40 added s3alfisc/fwildclusterboot and 3MMM.40b added s3alfisc/summclust, but the resolver still reported summclust: Can't find package called summclust -- so the recursive-Remote pattern is structurally fragile. This follows the same "drop optional CRAN-archived/GitHub-only deps" pattern used for rdd in 3MMM.40.
- Remotes: now lists only synth-inference/synthdid, which has no GitHub-only transitive Imports.

CI: pak resolver -- transitive Remote for summclust (3MMM.40b)

- Added s3alfisc/summclust to DESCRIPTION Remotes:. fwildclusterboot Imports summclust, which is also GitHub-only (never on CRAN). 3MMM.40 added the fwildclusterboot Remote but pak's recursive resolver still failed one level deeper because it does not auto-recurse through a Remote's own DESCRIPTION. summclust's Imports (utils, dreamerr, MASS, collapse, generics, cli, rlang) are all on CRAN, so the chain terminates here.

R CMD check ERROR fixes (3MMM.39)

- R/datasets.R (6 sites): dropped the invalid n = 2L argument from strsplit(). Base R's strsplit() has no n=; the call silently ignored it on most R versions but errors on R-devel.
- R/dataset_load_by_key.R: removed the spurious max_features = max_features argument from the morie_datasets_ontario_ckan_by_key() dispatch.
  That function's formals are only (dataset_key, offline, resource_id) -- passing the unsupported argument caused a hard error in the dispatcher example.
- R/ingest_statcan.R: replaced the non-existent cansim::set_cansim_api_key(api_key) call with the documented mechanism. cansim has no such helper in any current CRAN release; it reads CANSIM_API_KEY from the environment. morie_ingest_statcan_cansim() now mirrors a user-supplied STATCAN_API_KEY into CANSIM_API_KEY when only the morie alias is set.
- R/spatial_voting.R::mlsmu6: added is.finite(prev_stress) guard for the convergence check. prev_stress starts as Inf, so iter 1's abs(Inf - stress) / max(Inf, 1e-12) = NaN triggered "missing value where TRUE/FALSE needed" and broke the \examples{} block. The first iteration now skips the convergence check cleanly; iter 2+ uses real values.
- Added a proper roxygen block for morie_dataset_portal_catalog() (only the @export tag was present; the docstring upstream was attached to the sibling _clear_cache helper).
- man/morie_dataset_portal_catalog.Rd and man/morie_entheo_clone_dmt_imaging.Rd regenerated.

CI: setup-r-dependencies pak resolver unblocked (3MMM.40)

- Dropped rdd from Suggests. CRAN archived it in 2024 and pak could no longer resolve it. The only morie callsite (morie_rdd_mccrary()) used it as a fallback when rddensity wasn't installed; rddensity itself is on CRAN and in Suggests, so the rdd branch was effectively dead code in any realistic configuration.
- Added Remotes: s3alfisc/fwildclusterboot, synth-inference/synthdid so pak can fetch the two remaining GitHub-only Suggests when building the lockfile. Both upstream repositories are verified live (HTTP 200 from api.github.com/repos/...).

Changes in version 2026-05-24 (2026-05-24)

Correctness recovery: math typesetting restored

Phase 3LLL reverses the destructive \eqn{LATEX} -> \code{LATEX} swap shipped in commit f399ec41a (Phase 3KKK1+2).
That swap eliminated the "Lost-braces" warning but at the cost of stripping LaTeX math typesetting from the PDF/HTML manual and turning every Greek letter, \hat, \sum, \frac, etc. into an "unknown macro" warning at R CMD check. The proper Rd-compliant fix is the two-argument form:

    \eqn{LATEX}{ASCII fallback}     # inline
    \deqn{LATEX}{ASCII fallback}    # display

Every affected line (104 R files) now uses this form, preserving PDF math while satisfying the Rd parser. Driven by fix_rd_math.py, a LaTeX->ASCII transformer covering the common Greek alphabet, operators (\sum, \int, \hat, \bar, \frac, \sqrt), and relation symbols.

Auto-install helper for optional dependencies

New morie_install_extras() lets users install the ~50 optional Suggests: packages in one call. CRAN policy forbids install.packages() at .onLoad() time, so morie ships an opt-in helper instead. Three modes:

    morie_install_extras()                      # missing only (default)
    morie_install_extras("all", ask = FALSE)    # everything, CI-safe
    morie_install_extras(c("hawkes", "sf"))     # named subset

The helper also probes for the C system libraries libcurl, libsodium, and liboqs and prints platform-specific install hints when any are missing. System libraries must be installed BEFORE re-installing morie so the configure-time probes link the C/C++ backends against them.

Bulk open-data catalog explosion

Cross-portal morie_dataset_portal_catalog() grows from ~1,044 rows to 9,242 rows across 14 portals. Every Socrata / CKAN / ArcGIS Hub / Opendatasoft portal morie touches now has its full public catalog bundled offline.

Phase 3GGG -- 6-portal bulk harvest

- 3GGG1: NYC OpenData -- 2851 entities (2395 datasets + 294 maps + 162 filters/charts/hrefs/stories).
- 3GGG2: Chicago Open Data -- 1856 entities.
- 3GGG3: Toronto Open Data CKAN -- 540 packages.
- 3GGG4: Calgary (933) + Edmonton (2027) Socrata.
- 3GGG5: Ottawa Open Data Hub -- 287 datasets (via OGC startindex= pagination, not Socrata offset=).
- Replaced the per-portal crime-adjacent subset catalogs from 3EEE2/3FFF3 with the bulk variants (no API change -- the small curated catalogs are still callable via the older loader names for backwards compat).
- New generic Socrata-by-id wrappers: morie_datasets_nyc_socrata_by_id() + morie_datasets_chicago_socrata_by_id() (mirror the 3FFF3 Calgary/Edmonton pattern). morie_datasets_load_by_key() routes chicago + nyc_opendata sources through them; max_features now threads as the SODA $limit.

Phase 3HHH -- full catalogs for the last two portals

- 3HHH1: Montreal Open Data CKAN full bulk -- 401 packages (up from the 23-row Loi/Justice/Securite subset from 3EEE1).
- 3HHH2: Vancouver Opendatasoft v2.1 full bulk -- 190 datasets with enriched schema (publisher, theme, license, records_count added to the 3CCC4 fixture).

Catalog totals

    calgary_opendata      933    nyc_opendata       2861
    chicago              1864    ontario_ckan         38
    edmonton_opendata    2027    ottawa_opendata     287
    montreal_opendata     401    statcan_ccjs         10
    nyc_nypd                8    toronto_opendata    540
    tps_arcgis_hub         71    tps_psdp             11
    vancouver_opendata    190    vpd_geodash           1
                                                 -------
                                                    9242

Bundled fixture footprint: ~3.4 MB of catalog metadata; per-row unwound this is the metadata equivalent of every NYC dataset descriptor + every CKAN package summary + every Hub item -- offline queryable via morie_datasets_browse(keyword=...).

Cross-portal open-data infrastructure

Major sprint adding 14 open-data portals + a unified browse/load interface. The cross-portal morie_dataset_portal_catalog() now spans 9 cities + 1 federal source + ~800 dataset entries across 4 different API protocols.

Phase 3CCC -- NYC + TPS deep coverage

- 3CCC1: NYPD law_code resolver. New morie_datasets_nyc_nypd_law_books() (46-row statute book -> human name + jurisdiction dict; PL, VTL, CPL, ABC, AC, COR, AM, PHL, ED, GB, GCI, HTH, PAR, LOC, FOA, RR, TAX, RPA, RP, PRL, TWN, ...) + morie_parse_nypd_law_code() vectorised regex parser. Added as 4th resolver in morie_datasets_nyc_nypd_resolved().
- 3CCC2: NYC multi-boundary loader bundle -- 5 new fixtures (school districts / council districts / community districts / NTAs 2020 / ZCTAs) + morie_datasets_nyc_boundaries_catalog() unified index.
- 3CCC3: TPS Hub resolved-joins analyzer (morie_datasets_tps_psdp_resolved()) -- division + hood158 + hood140 + NIA + psdp_class 5-way join, mirrors the Chicago / NYPD _resolved() patterns. Plus morie_datasets_tps_police_divisions() (16 post-amalgamation TPS divisions).
- 3CCC4: cross-portal morie_dataset_portal_catalog() -- 7 initial portals, 336 datasets, uniform schema (dataset_key, source, id, api_modes, loader, dict_url, n_rows_bundled). Added Vancouver Open Data (Opendatasoft v2.1, 190 datasets). Folded SODA3-auth note into the SODA3 helper docstring per Socrata support article 34730618169623.

Phase 3DDD -- Canadian municipal + federal coverage

- 3DDD1: 5 Vancouver crime-adjacent civic loaders -- graffiti (100 / 7683), noise control areas (3), homeless shelters (17), property use inspection districts (23), fire halls (20).
- 3DDD2: VPD GeoDASH crime loader. T&Cs gate auto-download, so morie ships a stratified 550-row sample (50 x 11 TYPE categories) + bundled legal disclaimer + user-zip_path = mode for the full 915k-row feed.
- 3DDD3: Statistics Canada CCJS / CODR WDS REST API. 10-cube registry covering federal crime + corrections; morie_datasets_statcan_cube_metadata() + morie_datasets_statcan_vectors() + morie_datasets_statcan_full_csv_url() wrappers.
- 3DDD4: morie_datasets_browse() + morie_datasets_summary() -- filter the cross-portal catalog by keyword / portal / api_mode / loader regex with AND-composable predicates.

Phase 3EEE -- Montreal + expanded Toronto/Vancouver + dispatcher

- 3EEE1: Montreal Open Data CKAN -- 23-row Loi/Justice/Securite catalog + SIM (fire/EMS) interventions flagship loader with 349-row stratified bundled sample + 170-row INCIDENT_TYPE_DESC dict + generic CKAN dispatcher.
- 3EEE2: Toronto Open Data CKAN beyond TPS Hub -- 208-row crime-adjacent catalog + ambulance stations + TPS ASR misc aggregates + generic CKAN dispatcher.
- 3EEE3: Vancouver Open Data deeper coverage -- 4 more fixtures (community centres, food markets, disability parking, public art).
- 3EEE4: morie_datasets_load_by_key() -- single dispatcher resolving any catalog dataset_key to its loader across all portals.

Phase 3FFF -- dispatcher hardening + prairie cities

- 3FFF1: CKAN package_show -> first-CSV resource auto-resolution. MTL + TO generic CKAN keys now Just Work through morie_datasets_load_by_key().
- 3FFF2: mode = c("auto","soda2","soda3","odata") + app_token args on the dispatcher; routes through SODA3 for Socrata-backed sources, silently ignored elsewhere.
- 3FFF3: Calgary + Edmonton + Ottawa loaders. Calgary + Edmonton are Socrata (data.calgary.ca, data.edmonton.ca); Ottawa is ArcGIS Hub (open.ottawa.ca, dispatches through the existing 3SS+ generic ArcGIS pipeline). Crime-adjacent catalogs - per-dataset bundled fixtures + generic Socrata-by-id dispatchers.

Catalog totals (across 14 portals)

    chicago                 8    ontario_ckan          38
    nyc_nypd                8    vancouver_opendata   190
    nyc_opendata           10    vpd_geodash            1
    tps_arcgis_hub         71    statcan_ccjs          10
    tps_psdp               11    montreal_opendata     23
    toronto_opendata      208    calgary_opendata     157
    edmonton_opendata     195    ottawa_opendata      106

Total ~ 1044 catalog rows.

Changes in version 2026-05-23 (2026-05-23)

Formula corrections (affect Python AND R sibling identically):

- iv.morie_iv_wald / iv.wald_estimator Wald-LATE delta-method SE previously omitted the Cov(num, den) term, biasing the SE under realistic Y-D correlation. Now includes the term -2*(num/den^3) * cov(y, d) / n, with per-stratum aggregation.
- dsp_waveform.morie_dsp_higuchi_fd / _waveform.higuchi_fd fractal dimension previously summed M-1 differences instead of M (Higuchi 1988 eq 1 specifies floor((N-m)/k) summands). Fixed by using M+1 indices so diff() yields M terms.
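
The Wald-LATE correction above is the usual delta-method variance of a ratio num/den. The following is a minimal sketch of where the covariance term enters, not morie's implementation: the function name wald_late_se, its single binary-instrument signature, and the choice to treat the two instrument arms as the strata are all assumptions made for illustration.

```r
# Hypothetical helper (not part of morie): delta-method SE for the Wald ratio
#   num / den, with num = mean(y | z = 1) - mean(y | z = 0)
#               and  den = mean(d | z = 1) - mean(d | z = 0).
wald_late_se <- function(y, d, z) {
  stopifnot(length(y) == length(d), length(d) == length(z),
            all(z %in% c(0, 1)))
  arm <- function(v, g) v[z == g]

  num <- mean(arm(y, 1)) - mean(arm(y, 0))
  den <- mean(arm(d, 1)) - mean(arm(d, 0))

  # Var(num), Var(den) and Cov(num, den), each summed over the two arms.
  v_num <- 0
  v_den <- 0
  c_nd <- 0
  for (g in c(0, 1)) {
    n_g <- sum(z == g)
    v_num <- v_num + var(arm(y, g)) / n_g
    v_den <- v_den + var(arm(d, g)) / n_g
    c_nd <- c_nd + cov(arm(y, g), arm(d, g)) / n_g
  }

  # Ratio variance without and with the -2 * num / den^3 * Cov term.
  var_no_cov <- v_num / den^2 + num^2 * v_den / den^4
  var_ratio <- var_no_cov - 2 * num * c_nd / den^3

  list(estimate = num / den,
       se = sqrt(var_ratio),
       se_no_cov = sqrt(var_no_cov))
}
```

When the outcome and the treatment are positively correlated within each arm and the ratio is positive, se comes out smaller than se_no_cov, which is the direction of the bias the entry describes.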
R-side feature additions:

- 4 new RcppArmadillo C++ kernel files (src/morie_hawkes.cpp, morie_dsp.cpp, morie_matching.cpp, morie_spatial.cpp) exposing 14 // [[Rcpp::export]] symbols.
- R wrappers in R/{tps_hawkes_advanced,dsp_filters,matching,spatial_voting}.R now dispatch to the C++ kernels when the compiled .so is loaded, falling back to pure-R otherwise.
- DESCRIPTION: LinkingTo: Rcpp, RcppArmadillo (was: Rcpp).

Other fixes carried from the 5-layer review on 2026-05-22 (all Python-parity-verified before applying):

- R/survival.R .validate_te now returns ok mask; KM/HR/concordance callers re-align group/risk_score by mask instead of seq_along.
- R/iv.R JIVE projects only the endogenous columns (was: every column including intercept and exogenous controls), matching src/morie/iv.py:1604-1613.
- R/did.R morie_did_aggregate_gt_att SE uses k = cell count (was: nrow(g), equivalent only when nrow(g)==1).
- R/did.R morie_did_test_parallel_trends returns joint_chi2 + joint_df, keeps joint_f_stat as alias.
- R/inference.R Clopper-Pearson exact CI handles successes==0 and successes==n edges instead of calling qbeta(., 0, .).
- R/weights.R morie_weights_brr warns on odd-size strata.
- R/spatial_voting.R Hare 2018 + King 2003 citation corrections.
- R/tps_statphysics.R Helbing 2010 venue corrected (NJP not PNAS).

Earlier from 2026-05-22 marathon (already in 0.9.5.6 in tree):

- Cox-Snell residuals use per-row y[,"status"] not scalar nevent.
- JKn replicate weights rewritten to Wolter 2007 form (one PSU per replicate, scale survivors by n_h/(n_h-1)); aggregator uses ((n_h-1)/n_h)*sum_{i in h} diffs_sq_i.
- Mann-Whitney effect size r = Z/sqrt(n1+n2) (was: n1*n2).
- Li-Ji n_effective_tests sums fractional part for all eigenvalues.
- Sampling proportional alloc keeps stratum names so weights aren't NA.
- Abadie-Imbens SE splits by treatment, denom is n_treated^2.
- tps_statphysics inspection-game payoff matrix transposed back to match Python convention.
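
The Clopper-Pearson edge handling listed above can be sketched in a few lines of base R. This is an illustration under assumptions, not the code in R/inference.R: the function name clopper_pearson_ci and its arguments are hypothetical. The bounds are set directly at the two edges so that qbeta() is never called with a zero shape parameter.

```r
# Hypothetical helper (not part of morie): exact Clopper-Pearson interval
# for a binomial proportion, with the two edge cases handled explicitly.
clopper_pearson_ci <- function(successes, n, conf_level = 0.95) {
  stopifnot(n > 0, successes >= 0, successes <= n)
  alpha <- 1 - conf_level

  # successes == 0: the lower bound is 0 by definition.
  lower <- if (successes == 0) {
    0
  } else {
    qbeta(alpha / 2, successes, n - successes + 1)
  }

  # successes == n: the upper bound is 1 by definition.
  upper <- if (successes == n) {
    1
  } else {
    qbeta(1 - alpha / 2, successes + 1, n - successes)
  }

  c(lower = lower, upper = upper)
}

clopper_pearson_ci(0, 10)   # lower bound pinned at 0
clopper_pearson_ci(10, 10)  # upper bound pinned at 1
```

Base R's binom.test() reports the same interval in its conf.int element, which makes it a convenient cross-check for interior values.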
Changes in version 2026-05-22 (2026-05-22)

R-side describe() parity closure. Patch release that closes one of the two parity gaps named in v0.9.5.4: the pedagogical narratives that the Python sibling exposes via morie.describe() are now available on the R side via morie_describe() and the string-only variant morie_describe_by_name().

R API additions:

- morie_describe(callable) -- takes a function object OR a character scalar (with or without the morie_ prefix). Prints the pedagogical narrative for the named callable.
- morie_describe_by_name(name) -- string-only variant.

Bundled data:

- inst/extdata/describe_corpus.Rds -- a single xz-compressed Rds (~1.6 MB on disk) containing 36,433 named character entries. Names are the callable mnemonics (the 4-7 character forms); values are the markdown narrative bodies sourced from src/morie/fn/describe_*.md. The Rds is loaded once per session and cached in a package-private environment.

Build tooling:

- tools/bundle-describe-files.R -- re-runs the Python-to-R sync when src/morie/fn/describe_*.md changes. Run from the repo root with Rscript tools/bundle-describe-files.R.

Tests:

- tests/testthat/test-describe.R -- 17 tests covering lookup, prefix stripping, .md extension stripping, unknown-name diagnostics, type-rejection, function-object capture via substitute(), and cache identity across calls. All pass on the development build.

Remaining parity gap:

- morie.crypto educational primitives ship on the Python side only; a native R + Rcpp port (ML-KEM, Dilithium, NTRU, McEliece, ECC, hybrid PQC) is planned for v1.0.0. Calling into the Python side via reticulate is not added in v0.9.5.5; the scope was set at the native-R port path, which is a larger arc and the natural place for a v1.0.0 milestone.

Changes in version 2026-05-21 (2026-05-21)

Doob → MRM chi-square rename. Patch release with deprecation aliases; no breaking changes for existing user code.
Naming:

- The internal name 'Doob chi-square family' is renamed 'MRM chi-square family' across all morie code, Sphinx docs (architecture, mrm_modules, siuiap), and the rootcoder007 profile README. The Sprott-Doob-Iftene author-pair citation in papers/ is preserved; the src/morie/sprott_doob.py and src/morie/doob_trends.py author-named modules are also preserved.

Python API (with deprecation aliases):

- morie.otis_all_analyze.analyze_c_doob_chi2() -> analyze_c_chi2()
- morie.otis_all_analyze.analyze_d_doob_chi2() -> analyze_d_chi2()

Old names still work but emit DeprecationWarning. They will be removed in a future release; update callers at your convenience.

R side: no R API changes; the R chi-square family was already renamed in v0.9.5 (vignette chi-square-and-anova.Rmd).

Patch release over 0.9.5.2.

- Declare pkgload in Suggests:. The pkgload skip-guard added in 0.9.5.2's test-cov-fallbacks.R used pkgload::dev_packages() without declaring the package in DESCRIPTION's Suggests:, producing a '::' or ':::' import not declared from: 'pkgload' WARNING under R CMD check. No user-visible functional change; the warning is informational, but it should not have shipped in 0.9.5.2.
- 0.9.5.2 has been yanked from PyPI as a consequence of the above WARNING and to keep the public release record clean.
- HTML validation fix. morie_siu_sanity_check's description used date_*_iso and number_of_* as inline text, which roxygen2's markdown mode rendered as nested \emph{\emph{...}} in the generated Rd and as nested emphasis tags in the HTML manual. win-builder flagged this as an HTML validation NOTE. Wrapping the identifiers in backticks (now rendered as \verb{...}) resolves it.
- All other fixes are inherited from 0.9.5.1: see entry below.

CRAN Policy: full cache-leak fix (supersedes 0.9.5 which was uploaded to win-builder with incomplete cache-isolation).

- morie_db_connect() default cache-dir flipped from tools::R_user_dir("morie", "cache") to a session-scoped tempdir() subdirectory; matches the convention already set for morie_fetch_siu() and morie_fetch_tps() in 0.9.5. Now no morie function writes outside tempdir() unless the user explicitly opts in by passing db_path = morie_cache_dir(...) or cache_dir = morie_cache_dir(...).
- New morie_cache_clear(subdir, confirm) user-facing function for actively managing the persistent cache (CRAN Policy requirement for R_user_dir caches).
- morie_cache_dir(subdir) is now exported with a subdir argument so users can compose per-subsystem persistent paths.
- 3 morie_cache_* examples (store, load, list) now use explicit db_path = tempfile() so R CMD check never writes outside tempdir().
- morie_check_plugin_license error-path example moved from \donttest{} to \dontrun{} (intentionally errors when passed an incompatible SPDX).
- morie_fetch placeholder-URL example moved from \donttest{} to \dontrun{} (example.org doesn't host CSV; the URL is a documentation placeholder).
- Two crimsl.utoronto.ca references in R/mandela.R and R/rmorie-package.R rewritten as plain-text references; the U of T web server returns 403 to win-builder's IP even though the URLs are publicly reachable from browsers.
- New inst/WORDLIST listing real technical terms (AIPW, ATC, ATT, CATE, Hawkes, MRM, etc.) so the win-builder spell-checker no longer flags them.

Documentation + CI hardening (added 2026-05-21 to the v0.9.5 release branch alongside the SIU + rename work):

- New SIU vignette (vignettes/siu-pipeline.Rmd) -- end-to-end walkthrough of morie_fetch_siu(), morie_siu_audit_case(), morie_siu_anomaly_check(), morie_siu_compare(), morie_siu_llm_extract(), morie_siu_translate(), and the canonical-override system. 14 vignettes total now.
- Chi-square vignette correction.
  vignettes/chi-square-and-anova.Rmd previously called the MRM chi-square family the "Doob $\chi^{2}$ family", which incorrectly singled out one of the three named authors (Sprott, Doob, Iftene) of the source contingency tables. Renamed to "MRM chi-square family". The Sprott / Doob / Iftene author citation to the source tables is unchanged.
- _pkgdown.yml shipped -- a minimal pkgdown configuration so contributors can build a local documentation site with pkgdown::build_site(). The file is .Rbuildignored so it doesn't ship in the CRAN tarball.
- README rewrite (top-level + R-package) to reflect v0.9.5 reality: 559 morie-prefixed exports (not 87), the SIU subsystem, free-first AI helpers (Ollama default), language-aware DRID manifest, canonical-override system, polite-by-default fetcher, and the green 6-cell R CMD check matrix.
- pkgcheck workflow: inconsolata LaTeX font installed. pkgcheck's internal rcmdcheck builds the PDF manual, which needs inconsolata.sty. Without it pkgcheck reported a spurious "R CMD check found 1 warning" against a package that has 0 warnings in the dedicated r-cmd-check.yml matrix. The pkgcheck job now installs tinytex + inconsolata before running.

lintr / goodpractice cleanups:

- The Hawkes C++ likelihood functions now use T_horizon instead of T for the time-horizon parameter, so the auto-generated R/RcppExports.R no longer trips R linters that flag T as a potential TRUE shadow. The math convention is preserved in the C++ docstrings; only the parameter NAME changed.
- setwd() in morie_run_workflow_step() replaced with withr::local_dir() (goodpractice no-setwd linter).
- 352+ exported functions renamed to the morie_* prefix so they no longer collide with same-named functions in other CRAN packages. Examples: chi_square_test → morie_chi_square_test, kmeans_clustering → morie_kmeans_clustering, etc. Names that were already morie-specific cryptic abbreviations (agset, brdgr, fzhdc, …) are unchanged.
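
The setwd() replacement in the cleanups above follows a common scoped-directory pattern. The sketch below is an illustration only, not the body of morie_run_workflow_step(): the helper name run_in_dir is hypothetical, and the base-R on.exit() branch is an assumption added so the example still runs when withr is not installed.

```r
# Hypothetical helper (not part of morie): evaluate `code` with `dir` as the
# working directory, restoring the previous directory when the call exits.
run_in_dir <- function(dir, code) {
  if (requireNamespace("withr", quietly = TRUE)) {
    # Scoped change: undone automatically when run_in_dir() returns or errors.
    withr::local_dir(dir)
  } else {
    # Base-R equivalent of the same idea.
    old <- setwd(dir)
    on.exit(setwd(old), add = TRUE)
  }
  # `code` is a lazily evaluated argument, so it runs after the change.
  force(code)
}

before <- getwd()
inside <- run_in_dir(tempdir(), getwd())
identical(getwd(), before)  # the caller's working directory is untouched
```

Either branch restores the directory on error as well as on normal return, which a bare setwd() call does not.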
SIU harvester: polite by default, manifest-aware, retry-aware, and auditable against the original published reports.

- Persistent HTML cache + per-case audit. morie_fetch_siu(cache_html = TRUE) saves every fetched report and news-release page under /html/ (gzipped, ~80-100 MB for a full sweep). The saved HTML is the canonical ground truth for every row in the emitted CSV: any later question of the form "did the parser get this field right?" is decidable by reading the cached page for that case. morie_siu_audit_case(case_number) returns the parser's 1-row data frame, the raw report and news HTML, and HTML-stripped plain text for both, all from cache when available.
- morie_siu_compare() -- line up the parser's output for a case against a user-supplied external table (column map and case key are caller-controlled) and show the surrounding report HTML excerpt for each disagreement. No external source is treated as authoritative; the function exists so the user can adjudicate parser-vs-external mismatches against the actual published report. The published report HTML is the only ground truth morie recognises for SIU fields.
- Free by default. The LLM helpers now default to \code{model = c("ollama", "gemini")} -- a free local Ollama model first, with paid Gemini as fallback only if Ollama is unavailable. Users who install Ollama and pull a free Gemma / Qwen / Llama / Functiongemma variant (\code{ollama pull gemma3:4b}) get the full second-coder / audit / anomaly-check stack at $0 ongoing cost. \code{OLLAMA_HOST} defaults to \code{http://localhost:11434} when unset, so the zero-config path is just "install ollama, pull a model, done".
- AI second-coder (Gemini / Claude / Ollama). morie_siu_llm_extract(case_number, model = "gemini") sends the cached report HTML through a large-language-model endpoint and returns the same 64-column row format as the C++ parser, so it drops straight into morie_siu_compare(external = ...) for an independent diff.
  model accepts a character vector for fail-over, e.g. c("gemini", "ollama") uses the paid Gemini endpoint when available and silently falls back to a local / free Ollama-compatible model otherwise. Credentials are read from GOOGLE_API_KEY / ANTHROPIC_API_KEY / OLLAMA_HOST; nothing is hard-coded.
- morie_siu_translate_fr_to_en() -- self-improving SIU. For SIU cases that exist only in French (no English-language paired drid; ~1-2 per year of SIU output), translate the narrative_summary, news_release_summary, news_release_title and relevant_legislation into English via a local Ollama model (default $0 cost, no API key needed) and persist each translation as a canonical override via \code{morie_siu_record_correction()}. Idempotent (skips already-translated cases) and self-improving (every run leaves morie better at returning English content for French-only reports). Maintainers can promote the resulting overrides into the shipped \code{inst/extdata/siu_canonical_overrides.csv.gz} so all users get the English text on the next package update.
- French police-service acronyms. The modal-service detector now also recognises SPT (Service de Police de Toronto), PPO (Police provinciale de l'Ontario), SPRH (Halton), SPRY (York), SPRP (Peel), SPRD (Durham), SPRN (Niagara), SPRW (Waterloo), SPO (Ottawa), SPL (London), SPH (Hamilton), SPW (Windsor), SPG (Guelph), SPK (Kingston) and maps each to the canonical English name. Closes the remaining French-only-case gap; 12-TFD-104 in the 2012 corpus now reports \code{Toronto Police Service} correctly.
- 99.955% format-clean on the full 2,218-case corpus. Empirical measurement via morie_siu_sanity_check() on the freshly-harvested SIU.csv: 2,217 / 2,218 rows have zero format issues; the lone remaining case is a 2012 French-only report (12-TFD-104) without an English-paired drid.
  The earlier 95.45% baseline took four further fixes: (a) Unicode apostrophe / quote / dash normalisation in lower_ascii() so the title-finder matches "Director's report" (U+2019) cleanly, (b) "Overview" as a section_4 fallback for 2014 reports that retitled "The Investigation", (c) French "L'enquête" / "Aperçu" fallbacks for French-only reports, (d) full SIU police-service acronym table (OPP, TPS, HRPS, NRPS, PRP, YRP, DRPS, WRPS, OPS, LPS, WPS, GPS, KPS, BPS, BPPS, CKPS, PRPS, GSPS, SSMPS, SLPS, SPS, TBPS, BPSB) -- old reports use the acronym throughout and never spell out "Ontario Provincial Police", and the modal-service detector now picks up "OPP" → "Ontario Provincial Police" automatically.
- Interleaved report + news fetch. morie_fetch_siu() no longer walks the corpus in two strict phases (fetch all reports, then fetch all news). It now uses a rolling-window batched fetcher: each batch of 250 reports fires in the same rate-limited pool as the previous batch's news pages. While the next 250 reports are downloading, the news pages for the nrids we just parsed are downloading alongside. Roughly halves cold-start corpus wall time (~30 min instead of ~58 min on the full 4,700-drid sweep) without changing the per-second rate the SIU site sees.
- Canonical overrides -- the parser LEARNS from corrections. Every verified \code{(case_number, field, value)} tuple recorded via \code{morie_siu_record_correction()} is applied to \code{morie_fetch_siu()}'s output on subsequent runs. The shipped \code{inst/extdata/siu_canonical_overrides.csv.gz} holds the maintainer-confirmed table (starts empty in v0.9.5, populated by the LLM-audit + human-review workflow over time). The user-side \code{/canonical_overrides.csv} merges in too -- users can fix their local copy without touching the package source.
  This is morie's "memory": wrong cells get found via \code{morie_siu_sanity_check()} or \code{morie_siu_audit_columns()}, corrected once, and the fix propagates to all users on the next package update -- no C++ rebuild needed.
- morie_siu_sanity_check() -- fast format-validity pass over every row of an emitted SIU table. Flags case_number that doesn't look like an SIU id, date_*_iso that isn't ISO 8601, number_of_* that isn't a positive integer, charges_recommended that isn't "Yes"/"No", page-chrome strings leaked into narrative_summary or other content fields, etc. Returns a data frame ordered worst-first so maintainers can pop the cached HTML for any flagged row and adjudicate. Runs in milliseconds, no network, no LLM, no API key required.
- morie_siu_audit_columns() -- closed-loop per-column accuracy audit. Runs the anomaly check across many cases and aggregates by field, returning a data frame sorted by agreement rate (worst first) so maintainers can prioritise which regex extraction pattern to fix next. Concrete disagreement examples for each field are attached as the \code{"examples"} attribute. With \code{model = "ollama"} pointed at a local Gemma / Qwen / DeepSeek instance the audit costs zero API spend; chain \code{c("gemini", "ollama")} for paid-first / free-fallback.
- morie_siu_anomaly_check() -- per-field "does the report support this extraction?" audit. Sends one API call per case (all populated fields batched into a single prompt) and returns a data frame with field, parser_value, verdict (\code{"agree"} / \code{"disagree"} / \code{"unclear"}), and a one-sentence reason. Not authoritative -- the cached HTML is the ground truth -- but a fast way to triage which rows a human should re-read against the report.
- Section-text terminator fix (parser correctness). The section_text() helper used to stop only at the next section-opening element, so the LAST such block on a page (typically section_8 -- analysis / decision) silently captured everything to end-of-document, including the site's left-nav and footer. This leaked phrases like "First Nations, Inuit and Métis Liaison Program" and Twitter follow links into every report's narrative_summary, supplemental_materials, and mental_health_or_race_indications -- the latter would have tagged every case in Ontario as "First Nation" regardless of the report's actual content. The terminator now also stops at

R full parity: adds Python morie.mrm_classify_mandela() as the dual of the R-side rmorie::mrm_classify_mandela() (which had shipped in v0.1.14). All 25 v0.2.0-era callables now exist on both language sides.

- Version bumped from 0.1.15 to 0.2.0 to mark the cumulative significance of the empirical-workflow work shipped since v0.1.3: 12 mrm_* callables, ArcGIS REST + on-demand SIU scraper + OTIS CKAN fetchers, four bundled reference samples, the longitudinal-panel simulator, the animated demo entrypoint, the GPL-2.0-only signaling layer with optional kernel module and LSM-style userspace audit daemon, the §"Empirical workflow callables" companion-paper sections, all five companion papers built clean against this release.
- Project tracking artefacts added:
  - VERSION_INVENTORY.csv -- every file that carries a version string, its category (CURRENT vs HISTORICAL), and the exact match.
  - DEPENDENCIES.csv -- every Python and R dependency with name, version pin, license, and GPL-2.0-only compatibility.
- Adds the MRM empirical-paper callables: mrm_otis_* (5 fns, OTIS), mrm_tps_* (4 fns, TPS), mrm_siu_* (3 fns, SIU), plus mrm_tps_kulldorff_scan (space-time scan with MC permutations). All have R + Python parity.
- Adds dataset fetchers: fetch_tps_category (ArcGIS REST) and fetch_siu_cases (on-demand scraper for the Ontario SIU public Director's Reports). OTIS CKAN resource IDs registered for a01/b01/b09/c11; loadable via morie_load_dataset().
- Adds 4 bundled reference samples in inst/extdata/ (random 1000-row b01 + b09 + c11 + tps_assault, ~420 KB total) so the examples run offline.
- Adds simulate_longitudinal_panel() -- clean-room VAR(L) panel simulator with structured covariance kernels.
- Adds a GPL-2.0-only signaling layer: SPDX headers on every new source file, check_plugin_license() runtime guard, optional out-of-tree kernel module (kernel-module/morie.c), optional userspace audit daemon (daemon/morie_lsm.py).
- Adds an animated demo: python -m morie.demo showcases every new callable end-to-end on the bundled samples with rich-based spinners + progress bars (DoubleML / Optuna style).
- 5 companion papers updated and verified against the new callables: morie-empirical-paper §6 + §7.1-§7.11 every numeric claim verified (15 verification text files in results/). Corrections shipped: Hill α 1.62 → 2.08; SDB 22% → 57%; Hawkes Gamma → Weibull (hawkes-paper abstract typo); KM TTR 210 days → flagged as ID-misreading artefact (actual SIU TTR is 120 days); LISA Assault 2024 quadrants 47/5/4/44 → verified 19/13/17/52.
- License declarations harmonised to GPL-2.0-only SPDX (matching the Linux kernel convention) across CITATION.cff, pyproject.toml, both DESCRIPTION files, LICENSING.md, README, kernel module.
- Removed "Auto-generated" wording from 6 Sphinx documentation pages per user preference; python -m sphinx rebuilds with cleaner intro prose for the API reference pages.

Changes in version 0.9.6

Citation cleanup: remove Zenodo references

- README.md: removed the "(DOIs will be re-added once we re-deposit on Zenodo.)" promise. The Zenodo deposits for the morie publication set were taken down; we are not committing to re-depositing.
- inst/CITATION: removed the stale "also the R package source on Zenodo" comment.
- NEWS.md: removed Zenodo DOI references from the historical 0.9.5.7 changelog entry.
- No live DOI strings ever shipped in rmorie; this just removes the language that implied otherwise.
Phase 1 hotfix: drop wrappers for CRAN-archived packages - Removed morie_anchors_analyze (Phase 1.l) -- upstream anchors was archived from CRAN on 2022-03-06 (check problems not corrected). pak resolver could not solve the dependency, blocking CI. - Removed morie_causal_mediation (Phase 1.h) -- upstream causalweight was archived from CRAN on 2026-05-18 because its dependency LARF was archived. Same pak resolver failure. - DESCRIPTION: Suggests -= anchors, causalweight (35 wrapper extenders remain across Phases 1.k-1.n). - Both wrappers can be restored from git history (a9469ec, 4d78188) if the upstream packages return to CRAN. Phase 1.n: FDR/nonparam extenders -- locfdr / fdrtool / quantreg / np / dirichletprocess / lcmm - New file R/extenders_nonparam.R with 6 wrappers (morie_locfdr_estimate, morie_fdr_qvalues, morie_quantile_reg, morie_np_kernel_reg, morie_dp_gaussian_mixture, morie_lcmm_latent_class) - DESCRIPTION: Suggests += dirichletprocess, fdrtool, lcmm, locfdr, np, quantreg - Tests in tests/testthat/test-extenders-nonparam.R Phase 1.m: spatial/multivariate extenders -- gstat / copula / kernlab / metafor / mvtnorm - New file R/extenders_spatial.R with 9 wrappers covering variograms, kriging, copulas, kernel PCA, spectral clustering, meta-analysis, multivariate normal sampling and CDF (morie_geostat_variogram, morie_geostat_krige, morie_copula_fit, morie_copula_sample, morie_kernel_pca, morie_spectral_cluster, morie_meta_rma, morie_mvnorm_sample, morie_mvnorm_pmv) - DESCRIPTION: Suggests += copula, gstat, kernlab, metafor, mvtnorm - Tests in tests/testthat/test-extenders-spatial.R Phase 1.l: RDD/IRT extenders -- rddensity / rdlocrand / rdpower / anchors / anominate - New file R/extenders_rdd.R with 5 wrappers (morie_rdd_density_test, morie_rdd_local_randinf, morie_rdd_power_calc, morie_anchors_analyze, morie_anominate_ideal_points) - DESCRIPTION: Suggests += anchors, anominate, rdlocrand, rdpower (rddensity was already listed) - Tests in 
tests/testthat/test-extenders-rdd.R - Note: the rdpower wrapper is exported as morie_rdd_power_calc because morie_rdd_power already exists in R/rdd.R as a closed-form analytical power formula taking (n, tau, sigma); the simulation-based rdpower::rdpower surface is preserved alongside. Phase 1.k: stats extenders -- DescTools / performance / ppcor / coin / randtests A new file R/extenders_stats.R adds 17 wrapper-as-extender entry points under the canonical morie_* prefix that delegate to five CRAN statistics packages. Each function follows the requireNamespace-guarded hard-error pattern used by the other 1.g/1.h/1.i/1.j extenders and returns a thin two-slot list with $method (qualified upstream name) and $raw (upstream object): - morie_desc_cramers_v, morie_desc_kappa, morie_desc_winsorize, morie_desc_gini, morie_desc_atkinson -> DescTools::{CramerV, CohenKappa/KappaM, Winsorize, Gini, Atkinson}. - morie_performance_check_model, morie_performance_r2, morie_performance_check_collinearity, morie_performance_check_outliers -> performance::{check_model, r2, check_collinearity, check_outliers}. - morie_ppcor_partial, morie_ppcor_semipartial -> ppcor::{pcor, pcor.test, spcor, spcor.test} (matrix-wise when y/z are omitted, single-triple test otherwise). - morie_coin_independence, morie_coin_wilcoxon, morie_coin_oneway -> coin::{independence_test, wilcox_test, oneway_test}. - morie_randtests_runs, morie_randtests_turning_point, morie_randtests_bartels -> randtests::{runs.test, turning.point.test, bartels.rank.test}. DESCRIPTION: adds DescTools, ppcor, randtests to Suggests (alphabetised). coin and performance were already listed. Tests: tests/testthat/test-extenders-stats.R covers one happy path per function, each gated by skip_if_not_installed(). Phase 1.g gap: TwoWayFEWeights + synthdid extender Two new wrapper-as-extender entry points have been added to R/did.R to close the Phase 1.e gap.
Both follow the requireNamespace-guarded hard-error pattern of the existing DiD wrappers and ship under the canonical morie_did_* namespace: - morie_did_twoway_fe_weights(panel, group, time, treatment, outcome, type = "feTR", ...) -- thin interface to TwoWayFEWeights::twowayfeweights (de Chaisemartin & D'Haultfoeuille, 2020). Returns an morie_did_twfe_diagnostics S3 list with the negative-weight count, the sum of weights, and the full twowayfeweights object as $raw. Complements morie_did_panel_fe (which estimates the TWFE coefficient itself) and morie_did_chaisemartin_dhaultfoeuille (which uses the same identification argument to deliver the DID-M estimator). - morie_did_synthdid_estimate(panel, unit, time, treatment, outcome, vcov_method = "placebo", ...) -- thin interface to synthdid::synthdid_estimate (Arkhangelsky et al., 2021) under the synthdid-canonical name. Parallel to the existing morie_did_synthetic, which keeps the rmorie result-list shape; this extender surfaces the full synthdid object and the requested variance estimator (placebo, bootstrap, jackknife). DESCRIPTION: adds TwoWayFEWeights to Suggests. synthdid was already listed in 0.9.5.12. Tests: tests/testthat/test-did-extender.R covers both the happy path (skipped when the optional package is not installed) and the missing-package error path. Phase 1.g sensitivity rewrite - extend EValue / tipr / sensemakr / konfound The sensitivity subsystem (R/sensitivity.R, ~733 LOC) keeps its existing inline math as a fallback arm but cross-references the canonical CRAN packages (rbounds, tipr, sensemakr, specr, episensr) in the Rd files, and four new wrapper-as-extender entry points have been added so MRM / paper callers can reach the full surface of these packages from inside rmorie: - morie_sensitivity_evalue(estimate, se, sd, type = "OLS", ...) -- thin interface to the EValue::evalues.* dispatch family (evalues.OLS, evalues.RR, evalues.OR, evalues.HR, evalues.MD). 
Pairs with the typed e_value_rr/_or/_hr/_d wrappers, which call the same backend but with a fixed scale. - morie_sensitivity_tipping_point(estimate, smd, r2, ...) -- thin interface to tipr::tip for unmeasured-confounder tipping points. Pairs with tipping_point_analysis (which targets missing-data sensitivity rather than unmeasured confounders). - morie_sensitivity_omitted_var_bias(model, treatment, benchmark_covariates, ...) -- thin interface to sensemakr::sensemakr on a fitted lm. Pairs with omitted_variable_bias (closed-form Cinelli-Hazlett robustness value when only estimate + se + dof are available). - morie_sensitivity_konfound(estimate, se, n, n_covariates, ...) -- thin interface to konfound::pkonfound for the Frank et al. (2013) percent-bias-to-invalidate and impact-threshold-of-a-confounding-variable (ITCV). All four hard-error with a clear install.packages(...) message if the optional dependency is missing, matching the Phase 1.e / 1.f pattern. rosenbaum_bounds, tipping_point_analysis, omitted_variable_bias, specification_curve, and probabilistic_bias_analysis are kept in-house (the rmorie result shapes are part of the public API and the inline math remains the reference fallback). manski_bounds, bias_adjusted_estimate, and sensitivity_summary are novel / aggregator code with no clean CRAN counterpart. DESCRIPTION: adds tipr, sensemakr, konfound to Suggests. Tests: tests/testthat/test-sensitivity.R extended with eight new test_that() blocks (one happy path + one missing-package error path per new extender, all gated by skip_if_not_installed). R/causal.R rewrite + new causal extenders The causal-inference subsystem (~876 LOC) has been thin-wrapped over the canonical CRAN causal-inference packages while preserving every inline math fallback for CRAN-only installs (dual-arm pattern). - morie_estimate_propensity_scores() delegates to WeightIt::weightit(method = "glm", estimand = "ATE") when available; falls back to stats::glm(family = binomial()).
- morie_estimate_ate() / morie_estimate_att() / morie_estimate_atc() inherit the WeightIt delegation through the propensity-score helper; Hajek difference and influence-function SE remain inline to preserve the closed-form result shape. - morie_estimate_aipw() keeps the AIPW score inline; the propensity arm picks up WeightIt automatically. Rd cross-references AIPW::AIPW as the canonical SuperLearner-based alternative. - morie_estimate_g_computation() delegates the standardisation step to stdReg::stdGlm() when available; falls back to the inline glm() + treatment-flipped counterfactual contrast. - morie_estimate_late() already used ivreg::ivreg(); now also recognises AER::ivreg() as a second delegation arm, falling back to manual 2SLS otherwise. - morie_estimate_double_ml() / morie_estimate_irm() continue to delegate to DoubleML::DoubleMLPLR / DoubleMLIRM when available; fall back to the cross-fit ridge implementation. - morie_e_value() delegates to EValue::evalue() when available; falls back to the inline closed-form RR -> E-value formula. - morie_sensitivity_rosenbaum() delegates to rbounds::psens() when available; falls back to the inline sign-score bound. Four new extender functions are introduced for CRAN dependencies that previously had no morie_* entry point: - morie_causal_impact(data, pre_period, post_period, model_args) -> CausalImpact::CausalImpact() (Brodersen et al. 2015 Bayesian structural time-series intervention analysis). - morie_causal_weighting(data, treatment, covariates, method, estimand, ...) -> WeightIt::weightit() (full method palette: glm / cbps / ebal / ps / energy / optweight). - morie_causal_robust_se(model, type, cluster, ...) -> sandwich::vcovHC() / vcovHAC() / vcovCL() (HC0-HC5, HAC, one-way cluster-robust variance). - morie_causal_mediation(y, d, m, x, trim, boot, ...) -> causalweight::medweight() (semiparametric IPW direct / indirect effect decomposition). 
The four extenders hard-error on missing packages (no inline fallback) since each upstream implementation is too large to re-implement compactly. New tests live in tests/testthat/test-causal-extenders.R and are guarded with skip_if_not_installed(). DESCRIPTION Suggests now lists AIPW, CausalImpact, causalweight, rbounds, sensitivitymv, and stdReg so the extenders and the delegation arms can find their upstream packages when available. R/bootstrap_methods.R rewrite - delegate to boot / bootstrap / resample / rsample / simpleboot / coin / ipred / sandwich The bootstrap / resampling subsystem (~867 LOC) has been re-routed through the canonical CRAN packages while preserving the rmorie API and the morie_bootstrap_result / morie_jackknife_result / morie_permutation_test_result / morie_cv_result S3 return shapes so that stat_commands, the print methods, and MRM analyses keep working unchanged. Thin-wrapped (with requireNamespace-guarded delegation arm + inline fallback so the wrapper keeps working on minimal installs): - bootstrap() -- nonparametric bootstrap now delegates to boot::boot (stratification via strata=) and boot::boot.ci for percentile / basic / normal / bca CIs. The studentized CI path and the cluster resampling path remain inline because boot::boot.ci(type = "stud") requires a precomputed variance estimator and boot::boot does not expose cluster-of-clusters resampling for the rmorie statistic(data) -> scalar signature. - parametric_bootstrap() -- delegates to boot::boot(sim = "parametric") with an MLE / ran.gen pair built from the requested distribution. - block_bootstrap() -- moving / stationary blocks delegate to boot::tsboot(sim = "fixed" / "geom"); the circular-block path stays inline because boot::tsboot does not expose a circular sim mode. - jackknife() -- delegates to bootstrap::jackknife when installed (rmorie reconstructs pseudovalues / influence values around the returned jack.values). 
- permutation_test() / paired_permutation_test() -- inline shuffle loops retained (the rmorie API returns the full null distribution which downstream MRM code consumes); coin's oneway_test(distribution = "approximate") and symmetry_test(distribution = "approximate") are cross-referenced as the canonical CRAN equivalents. - bootstrap_632() -- inline .632 / .632+ retained because the rmorie API takes naked model_fn / score_fn callables; ipred::errorest(estimator = "632plus") is cross-referenced. - repeated_cv() / leave_one_out_cv() -- delegate to rsample::vfold_cv for fold construction when no stratification or grouping is requested; caret::trainControl and rsample::loo_cv cross-referenced. Kept as in-house implementations (no clean CRAN drop-in for the rmorie API): - subsampling() -- Politis-Romano-Wolf rate scaling over a user-supplied statistic(data); np::npsubsample is closest but is kernel-specific. - delete_d_jackknife() -- generalised jackknife with custom max_subsets enumeration cap; resample::jackknife is cross-referenced. - wild_bootstrap() -- returns the full resampled coefficient distribution; sandwich::vcovBS (variance only) and fwildclusterboot::boottest (p-value only) cross-referenced. Added four new extender entry points (thin pass-through to the canonical packages): - morie_boot_run(data, statistic, R, strata, ...) -- direct boot::boot bridge with rmorie-style statistic(data) -> scalar adapter. - morie_boot_basic_ci(boot_obj, type, conf) -- direct boot::boot.ci bridge returning tidy list(perc = c(lo, hi), bca = c(lo, hi), ...). - morie_rsample_bootstraps(data, times, ...) -- direct rsample::bootstraps bridge returning an rset. - morie_simpleboot_two(x, y, statistic, R, ...) -- direct simpleboot::two.boot bridge for two-sample bootstrap of a scalar statistic. DESCRIPTION: adds boot, bootstrap, coin, ipred, resample, rsample, simpleboot to Suggests (sandwich was already present). 
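The statistic(data) -> scalar adapter described for morie_boot_run can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the rmorie source: boot_run_sketch is a hypothetical name, the input is assumed to be a data frame resampled by row, and the only upstream calls used are boot::boot and boot::boot.ci from the boot package (assumed to be installed).

```r
# Hypothetical sketch of a statistic(data) -> scalar bridge onto boot::boot.
# boot::boot calls statistic(data, indices), so the adapter subsets rows first
# and then hands the resampled data frame to the user's one-argument function.
boot_run_sketch <- function(data, statistic, R = 200L, strata = NULL) {
  stopifnot(is.data.frame(data), is.function(statistic))
  if (is.null(strata)) strata <- rep(1, nrow(data))  # single stratum, as in boot's default
  adapter <- function(d, idx) statistic(d[idx, , drop = FALSE])
  boot::boot(data = data, statistic = adapter, R = R, strata = strata)
}

set.seed(42)
b <- boot_run_sketch(mtcars, function(d) mean(d$mpg), R = 200L)

# Percentile interval; the last two entries of the row are the lower and upper limits.
ci <- boot::boot.ci(b, conf = 0.95, type = "perc")
tidy_ci <- list(perc = unname(ci$percent[1, 4:5]))
```

Keeping the adapter inside the bridge means callers never see the (data, indices) convention, which is the design choice the changelog entry describes.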
Function inventory is preserved: 12 prior exports kept (bootstrap, parametric_bootstrap, wild_bootstrap, block_bootstrap, jackknife, delete_d_jackknife, permutation_test, paired_permutation_test, subsampling, bootstrap_632, repeated_cv, leave_one_out_cv) plus 4 new (morie_boot_run, morie_boot_basic_ci, morie_rsample_bootstraps, morie_simpleboot_two). All 20 (15 prior + 5 new) tests/testthat/test-bootstrap_methods.R test_that blocks pass (73 expectations, 2 conditional skips when rsample or simpleboot is not installed). R/effects.R rewrite - thin-wrap over emmeans / marginaleffects / broom / stdReg / rbounds / EValue (phase 1.j) The treatment-effect / marginal-effects module has been thin-wrapped around its canonical CRAN extender packages and gained a new family of morie_effects_* wrappers over Vincent Arel-Bundock's marginaleffects API plus the emmeans and broom ecosystems. Thin-wrapped (existing API preserved; requireNamespace-guarded delegation arm + inline fallback so the wrapper keeps working on CRAN-only installs): - estimate_ate() -- guarded HC3 SE via sandwich::vcovHC + lmtest::coeftest; falls back to naive model SE with a warning. - estimate_plr(), estimate_pliv() -- already DoubleML-aware; retained as is with style cleanups (snake-case locals, stats::qnorm / stats::pnorm qualification, brace style). - estimate_ate_gcomputation() -- delegates to stdReg::stdGlm() for the canonical regression-standardisation backend (analytic SE on the contrast scale, no bootstrap needed); falls back to the legacy inline bootstrap implementation when stdReg is absent. - sensitivity_rosenbaum() -- delegates to rbounds::psens() when rbounds is installed and the matched-pair count is sufficient; falls back to the inline normal-approximation Wilcoxon signed-rank bounds otherwise.
- e_value() -- delegates to EValue::evalues.OLS() when both EValue is installed AND the caller supplies an outcome standard deviation via the new sd_y argument; falls back to the closed-form continuous-scale RR proxy so the Python and R ports stay numerically aligned by default. New extender wrappers (each a thin pass-through; the underlying package's native object is returned verbatim so downstream code keeps working with the canonical API): - morie_effects_emmeans(model, specs, ...) -> emmeans::emmeans(). Use emmeans::pairs() / emmeans::contrast() on the returned emmGrid for pairwise or custom contrasts. - morie_effects_predictions(model, newdata, ...) -> marginaleffects::predictions(). - morie_effects_comparisons(model, variables, ...) -> marginaleffects::comparisons(). - morie_effects_slopes(model, variables, ...) -> marginaleffects::slopes(). - morie_effects_tidy(model, ...) -> broom::tidy() (with a summary()-based fallback frame for the lm / glm classes rmorie's MRM pipeline ships, so downstream tidy callers work even on minimal-Suggests CI runs). The effects and margins packages are added to Suggests for the cross-reference path -- users who want Fox's effect() / predictorEffects() or Leeper's Stata-style margins::margins() can call them directly on a model fitted via rmorie, without an intervening wrapper. DESCRIPTION: adds broom, effects, emmeans, marginaleffects, margins, performance, rbounds, stdReg to Suggests. Net: R/effects.R grows from 461 to ~640 LOC because the legacy treatment-effect functions now carry both a CRAN-delegation arm AND an inline fallback (CRAN policy does not allow Suggests to be hard required), plus 5 new extender wrappers + 1 shared helper. All 6 legacy exports preserved (estimate_ate, estimate_plr, estimate_pliv, estimate_ate_gcomputation, sensitivity_rosenbaum, e_value); 5 new exports added (morie_effects_emmeans, morie_effects_predictions, morie_effects_comparisons, morie_effects_slopes, morie_effects_tidy). 
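The dual-arm idea behind morie_effects_tidy (delegate to broom::tidy() when it is available, otherwise build a frame from summary()) might look something like the sketch below. The names tidy_sketch and summary_frame are hypothetical, and the column layout simply mirrors the term / estimate / std.error / statistic / p.value columns that broom::tidy() returns for lm fits; the actual rmorie fallback may differ.

```r
# Fallback arm: build a broom-like frame from the coefficient table of summary().
summary_frame <- function(model) {
  stopifnot(inherits(model, "lm"))  # glm objects also inherit from "lm"
  cf <- stats::coef(summary(model))
  data.frame(
    term      = rownames(cf),
    estimate  = cf[, 1],
    std.error = cf[, 2],
    statistic = cf[, 3],
    p.value   = cf[, 4],
    row.names = NULL
  )
}

# Dual-arm wrapper (hypothetical): delegation arm first, inline fallback second.
tidy_sketch <- function(model, ...) {
  if (requireNamespace("broom", quietly = TRUE)) {
    return(broom::tidy(model, ...))
  }
  summary_frame(model)
}

fit <- stats::lm(mpg ~ wt, data = mtcars)
fallback <- summary_frame(fit)  # exercised directly so the result does not depend on broom
```

Because both arms expose the same column names, downstream code can read estimate and std.error without knowing which arm produced the frame.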
R/multiple_testing.R rewrite - delegate to poolr / qvalue / harmonicmeanp / gMCP / mutoss The multiple-testing-correction subsystem (~916 LOC) has been rewritten to forward to the canonical CRAN / Bioconductor packages where one exists. Every wrapper preserves the rmorie API and the morie_multiple_testing_result / morie_rich_result S3 shape so that the stat_commands dispatcher, the print.morie_multiple_testing_result method, and MRM analyses keep working unchanged. Thin-wrapped (with requireNamespace-guarded delegation arm + inline fallback so the wrapper keeps working on CRAN-only installs): - bonferroni(), holm(), hochberg(), hommel(), benjamini_hochberg(), benjamini_yekutieli() -- already stats::p.adjust wrappers, retained as-is. - sidak(), holm_sidak() -- inline closed-form math; mutoss cross-referenced in Rd as the canonical step-down equivalent. - storey_q() -- delegates to qvalue::qvalue (Bioconductor) when installed; falls back to the inline Storey cutoff otherwise. - estimate_pi0() -- delegates to qvalue::pi0est for storey / bootstrap methods when installed; falls back to inline estimators. - fisher_combined() -- delegates to poolr::fisher when installed. - stouffer_combined() -- delegates to poolr::stouffer when installed and no weights are supplied (weighted Stouffer stays inline because poolr does not take per-test weights). - tippett_combined() -- delegates to poolr::tippett when installed. - harmonic_mean_p() -- delegates to harmonicmeanp::p.hmp when installed. - n_effective_tests() -- delegates to poolr::meff (Galwey / Li-Ji / Nyholt) when installed. - fixed_sequence(), fallback_procedure() -- gMCP cross-referenced in Rd as the canonical graphical-MCP equivalent; inline implementations retained to preserve the rmorie stage-list return shape. Kept as in-house implementations (no clean CRAN drop-in for the rmorie API): - cauchy_combination() -- Liu and Xie 2020; ACAT is GitHub-only.
- hierarchical_bonferroni() -- rmorie-specific stage-list return shape; gMCP covers the concept with a different graphical API. - local_fdr() -- Efron empirical-Bayes KDE shape that locfdr does not match in return structure. - permutation_fwer(), permutation_fdr() -- step-down max-T and empirical-null-p FDR over user-supplied null matrices; no CRAN function exposes the same API. - adjust_p_values() -- front-end dispatcher across the rmorie wrappers; consumed by stat_commands. DESCRIPTION: adds poolr, qvalue, gMCP, harmonicmeanp, multcomp, mutoss to Suggests. Net: R/multiple_testing.R grows from 916 to 1118 LOC because each thin-wrap function now carries both a CRAN-delegation arm AND an inline fallback (CRAN policy does not allow Suggests to be hard required, and qvalue is Bioconductor only). Function inventory is unchanged: 25 exports preserved (bonferroni, sidak, holm, hochberg, hommel, holm_sidak, benjamini_hochberg, bh, benjamini_yekutieli, by_fdr, storey_q, fisher_combined, stouffer_combined, tippett_combined, simes_combined, harmonic_mean_p, cauchy_combination, fixed_sequence, fallback_procedure, hierarchical_bonferroni, estimate_pi0, adjust_p_values, n_effective_tests, local_fdr, permutation_fwer, permutation_fdr, plus the print.morie_multiple_testing_result S3 method). All 85 tests/testthat/test-multiple_testing.R assertions pass on a fallback-only install. R/did.R rewrite - delegate to did / DRDID / fixest / HonestDiD / bacondecomp / DIDmultiplegt The DiD subsystem (~1,719 LOC) has been rewritten to forward to the canonical CRAN packages instead of carrying ~700 LOC of base-R fallback code: - morie_did_panel_fe() is now a thin wrapper over fixest::feols(y ~ d | unit + time) with cluster-robust SE and hard-errors if fixest is not installed (the base-R two-way within-transform fallback has been removed). 
- morie_did_event_study() is a thin wrapper over fixest::feols with fixest::i(rel_time, ref = -1) relative-time dummies plus unit and time fixed effects; hard-errors if fixest is missing. - morie_did_group_time_att() is a thin wrapper over did::att_gt for the Callaway-Sant'Anna group-time ATTs and hard-errors if did is not installed (the base-R bootstrap fallback has been removed). - morie_did_doubly_robust() is a thin wrapper over DRDID::drdid_rc (Sant'Anna-Zhao 2020 repeated-cross-section doubly-robust DiD) and hard-errors if DRDID is missing (the hand-written GBM-or-logistic + linear-or-GBM bootstrap fallback has been removed). The ps_model / or_model arguments are retained for back-compat but ignored; DRDID uses logistic propensity and linear outcome regression internally. - morie_did_bacon_decomposition() is a thin wrapper over bacondecomp::bacon and hard-errors if bacondecomp is not installed (the base-R timing-pair enumeration fallback has been removed). - morie_did_chaisemartin_dhaultfoeuille() is a thin wrapper over DIDmultiplegt::did_multiplegt and hard-errors if DIDmultiplegt is not installed (the base-R switcher-comparison bootstrap fallback has been removed). - morie_did_synthetic() continues to delegate to synthdid::synthdid_estimate (no change; synthdid is the only R implementation). - morie_did_sensitivity_analysis() keeps its delta-bound CI sweep but the Rd now cross-references HonestDiD::createSensitivityResults_relativeMagnitudes as the reference implementation of Rambachan-Roth 2023 for event-study estimates. The OLS-based wrappers (morie_did_2x2, morie_did_repeated_cross_section, morie_did_triple_difference, morie_did_continuous_treatment, morie_did_fuzzy) continue to use the in-package .morie_did_ols_robust_se helper because the specs are simple OLS / 2SLS regressions and a CRAN dependency for trivially-short OLS would be a regression. 
The same helper is reused by morie_did_wild_cluster_bootstrap, which remains base-R by design (fwildclusterboot is GitHub-only; see 0.9.5.12 NEWS). Aggregators that consume DiD output and produce rmorie-specific tables (morie_did_aggregate_gt_att, morie_did_staggered, morie_did_parallel_trends_data, morie_did_test_parallel_trends, morie_did_placebo_test_*, morie_did_heterogeneous, morie_did_diagnostics) are unchanged. Net: R/did.R shrinks from 1,719 to 1,463 LOC (-256, -15%); all 22 morie_did_* exports preserved; result-list shape is unchanged so downstream callers see the same fields. DESCRIPTION: adds DRDID, HonestDiD, DIDmultiplegt to Suggests (fixest, did, bacondecomp, synthdid were already listed). R/matching.R rewrite - delegate to MatchIt / cobalt / WeightIt The matching subsystem (~2,183 LOC) has been rewritten to forward to the canonical CRAN packages instead of carrying ~950 LOC of base-R fallback code: - morie_matching_nearest_neighbor(), morie_matching_exact(), morie_matching_cem(), morie_matching_mahalanobis(), morie_matching_optimal_pair(), morie_matching_full(), morie_matching_subclassify(), morie_matching_variable_ratio() are now thin wrappers over MatchIt::matchit() and now hard-error if MatchIt (or its optional optmatch back end) is not installed. - morie_matching_genetic() is a thin wrapper over Matching::GenMatch() + Matching::Match() and hard-errors if Matching is missing. - morie_matching_entropy_balance() is a thin wrapper over WeightIt::weightit(method = "ebal") (or ebal::ebalance as a fallback) and hard-errors if neither is installed. - morie_matching_balance(), morie_matching_balance_table(), morie_matching_love_plot_data() are kept (they return the morie_balance_result shape downstream MRM code consumes) but the Rd cross-references now point users at cobalt::bal.tab() / cobalt::love.plot() for richer balance reporting. 
- morie_matching_cardinality() keeps its iterative-caliper heuristic; the Rd now cross-references designmatch::cardmatch() for the exact mixed-integer-programming alternative. The carceral-domain helpers (morie_matching_att_matched / ate_matched / atc_matched, morie_matching_abadie_imbens_se, morie_matching_rosenbaum_bounds, morie_matching_doubly_robust, morie_matching_multi_treatment, morie_matching_longitudinal, morie_matching_quality, morie_matching_overlap, morie_matching_estimate_propensity / _trim_propensity / _common_support) are unchanged - they encode rmorie-specific output shapes (morie_match_result, morie_te_result) that the MRM / SIU / OTIS code paths depend on. Net: R/matching.R shrinks from 2,183 to 1,586 LOC (-597, -28%); all 27 morie_matching_* exports preserved; behaviour-compatible for callers that already have MatchIt installed (which is the case for all matching tests in the rmorie suite). DESCRIPTION: adds cobalt, designmatch to Suggests. Breaking - CRAN-equivalent functions removed To reduce code duplication with established CRAN packages and address rOpenSci feedback on fn_call_network_size, the following functions have been removed in favour of their well-maintained CRAN equivalents: | Removed from rmorie | Use instead | |---|---| | cohens_d / morie_cohens_d | effectsize::cohens_d | | cramers_v / morie_cramers_v | effectsize::cramers_v | | eta_squared / morie_eta_squared | effectsize::eta_squared | | hedges_g / morie_hedges_g | effectsize::hedges_g | | morie_effective_sample_size | posterior::ess_basic or coda::effectiveSize | | morie_find_project_root | here::here() or rprojroot::find_root() | | fleiss_kappa | irr::kappam.fleiss | | kruskal_wallis | stats::kruskal.test (base R) | | shapiro_wilk | stats::shapiro.test (base R) | | anderson_darling | nortest::ad.test | | jarque_bera | tseries::jarque.bera.test | Install the replacements with install.packages(c("effectsize", "irr", "nortest", "tseries", "here")). 
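As a rough migration sketch for callers of the removed helpers, the base-R replacements from the table can be called directly, and the effectsize replacement only when that package is present. The mtcars data and the variable names here are illustrative assumptions, not taken from the rmorie test suite.

```r
# Replacement for the removed kruskal_wallis(): base R's stats::kruskal.test.
kw <- stats::kruskal.test(mpg ~ cyl, data = mtcars)

# Replacement for the removed shapiro_wilk(): base R's stats::shapiro.test.
sw <- stats::shapiro.test(mtcars$mpg)

# Replacement for the removed cohens_d(): effectsize::cohens_d, used only if installed.
d <- if (requireNamespace("effectsize", quietly = TRUE)) {
  effectsize::cohens_d(mtcars$mpg[mtcars$am == 1], mtcars$mpg[mtcars$am == 0])
} else {
  NULL  # effectsize not installed; install it as described above
}

# Both base-R tests return "htest" objects, so p-values are read the same way.
p_values <- c(kruskal = kw$p.value, shapiro = sw$p.value)
```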
morie_two_sample_t_test() / morie_chi_square_test() / morie_anova_one_way() now return list fields named cohens_d / cramers_v / eta_squared (was morie_cohens_d / morie_cramers_v / morie_eta_squared); the computation is inlined and unchanged. Internal callers of morie_find_project_root() now go through a private .morie_project_root() wrapper around here::here(). DESCRIPTION: adds here to Imports; adds effectsize, irr, tseries to Suggests. Changes in version 0.1.2 - Initial CRAN submission. - Twelve new R wrappers bring the curated public API to functional parity with the Python sibling: calculate_ebac(), is_over_legal_limit(), calculate_ipw_weights(), estimate_irm() (DoubleML wrapper), infer_measurement_level(), profile_dataset(), suggest_analysis_plan(), compare_nested_logistic_models(), run_treatment_effects_analysis(), run_weighted_logistic_analysis(), inspect_output(), verify_statistical_output(). Changes in version 0.1.0-4 - 99 exported functions across causal inference (ATE/ATT/ATC/GATE/CATE/LATE, AIPW, G-computation, IRM via DoubleML, IPW, Rosenbaum bounds, E-value), survey sampling (stratified/cluster/PPS/bootstrap/jackknife, calibration weights, design effects), psychometric and effect-size helpers (Cohen's d, Hedges' g, η², ω², Cramér's V, Kendall's τ, Spearman's ρ), classical statistical tests (one-/two-sample/paired t, Wilcoxon, Mann-Whitney, Kruskal-Wallis, Levene, Shapiro-Wilk, χ², Fisher exact), confidence intervals (risk-difference, risk-ratio, odds-ratio, proportion), power and sample-size (morie_power_t_test, morie_power_prop_test, sample_size_logistic), signal-processing primitives (Butterworth filters, Higuchi fractal dimension, Hurst exponent), dataset profiling, OTIS correctional-data analysis, and the MRM (McNamara-Ruhela-Medina) framework. - Python parity: this package is the R sibling of the Python morie package on PyPI.
Both expose the same conceptual public API; each uses its native language's idioms and ML ecosystem (R: mlr3 + DoubleML; Python: scikit-learn + DoubleML). - estimate_irm() is a thin R wrapper around DoubleML::DoubleMLIRM from the CRAN DoubleML package; DoubleML, mlr3, and mlr3learners are in Suggests and the function gates them with requireNamespace().
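The requireNamespace() gate mentioned for estimate_irm() can be illustrated with a small guard like the one below. This is a sketch only: require_suggested is a hypothetical helper, and the message wording is an assumption rather than the text rmorie emits.

```r
# Hypothetical guard for packages listed in Suggests: stop with an actionable
# message when any of them is missing, otherwise return TRUE invisibly.
require_suggested <- function(pkgs, caller) {
  present <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  absent <- pkgs[!present]
  if (length(absent) > 0L) {
    quoted <- paste(sprintf('"%s"', absent), collapse = ", ")
    stop(
      sprintf("%s() needs optional package(s) %s; try install.packages(c(%s)).",
              caller, paste(absent, collapse = ", "), quoted),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# "stats" ships with R, so this gate passes.
ok <- require_suggested("stats", "demo")

# A package name that is assumed not to exist triggers the hard error.
err <- tryCatch(
  require_suggested("notARealPkg12345", "estimate_irm_sketch"),
  error = function(e) conditionMessage(e)
)
```

A wrapper would call such a guard on its first line, before touching any DoubleML or mlr3 symbol, so that users without the Suggests packages get an install hint instead of a namespace error.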