---
title: "Related software"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Related software}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

Software for error detection goes well beyond scrutiny. This vignette presents broadly similar packages and apps, with no claim to completeness. Please contact me if you know about relevant software that isn't listed here (email: jung-lukas\@gmx.net).

The tools are heuristically categorized by how much they have been used in the course of error detection.

## Established methods

*Techniques that come up often in discussions of error checking.*

- For good reason, [statcheck](https://michelenuijten.shinyapps.io/statcheck-web/) by Sacha Epskamp and Michèle Nuijten is the best-known error detection software. It reconstructs *p*-values and tests them for consistency with their respective statistic, such as *t* or *F*. Even better, it operates on PDF files automatically, enabling users to scan [massive amounts](https://link.springer.com/article/10.3758/s13428-015-0664-2) of published articles. Steve Haroz built a [simple edition](https://steveharoz.shinyapps.io/statchecksimple) of the statcheck web app.

- James Heathers' [SPRITE](https://peerj.com/preprints/26968v1/) algorithm reconstructs possible distributions of raw data from summary statistics. For R users, it was implemented in [rsprite2](https://lukaswallrich.github.io/rsprite2/) by Lukas Wallrich, building on code by Nick Brown. Jordan Anaya developed a [Python-based SPRITE app](http://www.prepubmed.org/sprite/).

## Up and coming

*Recent and ongoing development in forensic metascience methods.*

- The [unsum](https://lhdjung.github.io/unsum/) package implements CLOSURE, a technique by Nathanael Larigaldie that generalizes SPRITE to find all possible samples, not just a few.
  It is extremely fast because the core algorithm is written in Rust, which makes CLOSURE suitable even for moderately large samples.

- Ian Hussey's raft of checks for summary statistics:

    - TIDES, an [R package](https://github.com/ianhussey/tides) and a [Shiny app](https://errors.shinyapps.io/TIDES/) for a consistency test of summary statistics of data with a known minimum and maximum (e.g., Likert scales, or where the empirical min and max are known or can be well estimated). TIDES also assesses the magnitude of variability given the scale bounds.
    - The [ANCHOR app](https://errors.shinyapps.io/ANCHOR/) to check for consistency between the whole sample and its subgroups.
    - The [PORT app](https://errors.shinyapps.io/PORT/) to test correlation tables for consistency.

- The [ellipse of insignificance app](https://drg85.shinyapps.io/EOIROAR/) by David Robert Grimes tests the robustness of dichotomous outcome trials.

- [ScrutiPy](https://github.com/nrposner/scrutipy) by Nicolas Roman Posner provides a Python interface to some of scrutiny's functionality. It also features CLOSURE and methods to recalculate confusion matrices. Like unsum, it relies on Rust implementations, so it runs very fast.

## Possibly helpful

*These software projects are less well known and have not been widely employed, at least not in forensic metascience. They are listed here because they might have some potential for error checking. However, their forensic utility has not been thoroughly examined.*

- The R package [validate](https://cran.r-project.org/package=validate) by Mark P.J. van der Loo provides numerous tools for data checking.

- The delta-F test for linearity, a.k.a. the "Förster test", was implemented in Dale J. Barr's R package [forsterUVA](https://github.com/dalejbarr/forsterUVA).

- Several R packages leverage the Benford distribution of naturally occurring numbers to assess whether reported numbers are, in fact, natural.
  These packages include:

    - [benford.analysis](https://github.com/carloscinelli/benford.analysis) by Carlos Cinelli contains various sophisticated tools for inspecting data using the Benford distribution.
    - [jfa](https://koenderks.github.io/jfa/index.html) by Koen Derks offers a full statistical auditing suite (including Benford analysis).

- [XLTest](http://www.sysmod.com/xltest/index.htm) is a tool for auditing Excel files (but it is not free).

- The Rust crate [SeaCanal](https://github.com/saghm/sea-canal#how-does-seacanal-work) analyzes numeric sequences, uncovering patterns of operations that might have generated them.

- Emerging from the Pruitt investigations, there is now R software for analyzing sequences:

    - The package [twopointzerothree](https://github.com/Sorbus-torminalis/twopointzerothree/) (by an anonymous developer) checks data for sequences of perfectly correlated numbers. These numbers are either duplicates of each other or they are duplicates offset by some constant amount; hence the name.
    - Similarly, the [sequenceSniffer](https://github.com/alrutten/sequenceSniffer) app by Anne Rutten detects repetitions in sequences.
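
The idea of "duplicates offset by some constant amount" can be illustrated in a few lines of base R. This is only a rough sketch of the general principle, not the method used by twopointzerothree or sequenceSniffer: the function `find_offset_runs()`, its `min_len` argument, and the example vectors are all hypothetical and made up for this illustration. It assumes two numeric vectors of equal length and looks for stretches where their difference stays the same.

```r
# Hypothetical helper for illustration only; it is not part of
# twopointzerothree, sequenceSniffer, or scrutiny.
# It finds stretches of at least `min_len` consecutive positions where
# `y - x` is constant, i.e., where `y` repeats `x` up to a fixed offset
# (an offset of 0 means plain duplicates).
find_offset_runs <- function(x, y, min_len = 3) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  # Rounding guards against floating-point noise in the differences
  offset <- round(y - x, 10)
  runs <- rle(offset)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  keep <- runs$lengths >= min_len
  data.frame(
    start  = starts[keep],
    end    = ends[keep],
    offset = runs$values[keep]
  )
}

# Made-up data: the first four values of `y` are those of `x` plus 2.03
x <- c(4.1, 5.3, 6.0, 7.2, 3.3, 9.8, 2.5)
y <- c(6.13, 7.33, 8.03, 9.23, 1.0, 4.4, 2.5)
find_offset_runs(x, y)
```

Each row of the result describes one suspicious stretch by its first and last position and the shared offset. Real data would call for more care, such as handling missing values and choosing `min_len` so that short coincidental runs are not flagged.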