In the last month, Michael Scharkow and I have worked on a new R-package called
specr. The goal was to facilitate specification curve analyses (also called multiverse analyses). The idea behind a specification curve analysis stems from the observation that a researchers has many degrees of freedom when conducting a quantitative analysis of a data set.
It starts with the question of how to operationalize the dependent and the independent variable, what model to estimate, which control variables to include, what transformation to conduct and ends with which cases to exclude or include (obviously there are even many more researcher’s degrees of freedom!). The problem is that the options in many of these decisions are somewhat similarly defensible. For example, we often have good arguments for both excluding and including a certain variable when estimating a latent variable. Or we might find good reasons for both excluding as well as including outliers. According to Gelman and Loken1, researchers literally find themselves in a “garden of forking paths” in which a multitude of possibilities could yield a multiverse of results (see Fig 1).
Fig 1. A visualization of the garden of forking paths.
The proposed solution to this problem is to run all theoretically defensible combinations of analytical choices (specifications) in order to obtain all possible results from the data. Simonsohn, Simmons, and Nelson2 thus proposed the specification curve analysis: a statistical approach that aims at running all reasonable specifications and investigating all obtained results in using a curve plot. So far, only a few papers have used such an approach3, revealing that in some research areas, almost any results is obtainable even within the limits of just one data sets.
The idea behind
So far, most researchers used loop-based functions in R to estimate all the models. They set up a results frame which gets filled with the results from each specification through a sequential analysis of all models. Although this makes a lot of sense, it has some limitations such as slow computation (due to the sequential nature) or a somewhat complex coding procedure.
specr, we wanted to create convenient functions that standardize this procedure and allow for faster computation. Although not implement yet, this approach will allow to use parallelization of the model estimations. We are currently investigating the implementation of the great package
furrr (check our github for updates on this!). In what follows, I will exemplify the use of
specr using a simple simulated data set (taken from the documentation of
For more information about the various functions , visit the package documentation. You can find comprehensive examples and vignettes on how to use the various functions of the package on the following pages:
- Getting started: A comprehensive example. This vignette illustrates the major functions of the package.
- Customizing specification curve plots: This vignette exemplifies various ways to plot the specification curve.
- Decomposing the variance of the specification curve: An example of how to investigate variance components of the specification curve.
- Visualizing progress during estimation: This vignette explains how to create a customizable progress bar for longer computations.
A simple example
How to install the R-package
You can install the package from CRAN like this:
You can also install the most recent development version from the respective GitHub repository with:
# install.packages("devtools") devtools::install_github("masurp/specr")
Check the data
In order to understand what type of analytical choices exists, we need to understand our data set.
# Load library library(specr) # The example data included in the package head(example_data) ## x1 x2 c1 c2 y1 y2 group1 ## 1 1.533913 1.3697122 0.5424902 3.8924435 23.500543 10.4269278 0 ## 2 1.680639 1.5163745 -1.2415868 3.3377268 17.017955 0.5733467 1 ## 3 1.223941 -0.2381044 0.1405891 0.8911959 -3.678272 4.2303190 0 ## 4 1.765276 0.9524049 4.0397943 1.8567454 21.668684 14.8865252 1 ## 5 1.907134 0.6282816 3.1002518 5.5840574 32.713106 20.5251920 0 ## 6 1.710695 1.1898467 0.4648824 4.0239483 20.422171 4.3471236 1 ## group2 ## 1 A ## 2 C ## 3 B ## 4 B ## 5 A ## 6 C
In this example, we assume that
x refers to independent variables,
y represents dependent variables,
c refers to control variables, and
group denotes potential grouping variables.
Define analytical choices and run the analyses
The main function of the package is
run_specs(). We need to provide the data set, and all analytical choices that we deem resonable. In this cases, we investigate the influence of different dependent and independent variables, various control variables, as well as how subsets of
group1 affect the results.
# Run specification curve analysis results <- run_specs(df = example_data, y = c("y1", "y2"), x = c("x1", "x2"), model = c("lm"), controls = c("c1", "c2"), subsets = list(group1 = unique(example_data$group1))) # Check results ## # A tibble: 48 x 12 ## x y model controls estimate std.error statistic p.value ## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> ## 1 x1 y1 lm c1 + c2 4.95 0.525 9.43 3.11e-18 ## 2 x2 y1 lm c1 + c2 6.83 0.321 21.3 1.20e-57 ## 3 x1 y2 lm c1 + c2 -0.227 0.373 -0.607 5.44e- 1 ## 4 x2 y2 lm c1 + c2 0.985 0.324 3.04 2.62e- 3 ## 5 x1 y1 lm c1 5.53 0.794 6.97 2.95e-11 ## 6 x2 y1 lm c1 8.07 0.557 14.5 6.90e-35 ## 7 x1 y2 lm c1 0.0461 0.466 0.0989 9.21e- 1 ## 8 x2 y2 lm c1 1.61 0.394 4.10 5.72e- 5 ## 9 x1 y1 lm c2 5.15 0.625 8.24 9.95e-15 ## 10 x2 y1 lm c2 6.50 0.466 13.9 5.38e-33 ## # … with 38 more rows, and 4 more variables: conf.low <dbl>, ## # conf.high <dbl>, obs <int>, subsets <chr>
The resulting data frame includes relevant statistics for each of the estiamted models and the number of observations that have been used in each specification.
We can plot a simple visualization of the garden of forking paths to understand how our analytical choices led to the 48 specifications.
plot_decisiontree(results, legend = TRUE, label = T)
Fig 2. How the analytical choices lead to a large number of specifications.
Investigate the specifications
The package includes a simple function
summarise_specs() that allows to get a first summary of the results.
# basic summary of the entire specification curve summarise_specs(results) ## # A tibble: 1 x 7 ## median mad min max q25 q75 obs ## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 3.55 4.73 -0.227 9.03 0.727 7.62 250 # summary by specific groups and statistics summarise_specs(results, stats = lst(mean, median, min, max), group = c("x", "y")) ## # A tibble: 4 x 7 ## # Groups: x  ## x y mean median min max obs ## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 x1 y1 6.63 6.52 4.95 8.27 250 ## 2 x1 y2 0.237 0.157 -0.227 0.727 250 ## 3 x2 y1 7.80 7.84 6.50 9.03 250 ## 4 x2 y2 1.36 1.34 0.653 2.14 250
However, in order to grasp how the different analytical choices affect the outcome of interest (in this case, the estimate refers to the unstandardized regression coefficient b), most researchers plot a specification curve. The function
plot_specs() produces such a typical visualization.
# Plot specification curve analysis plot_specs(results)
Fig 3. Visualization of the specification curve analyis.
Here, we can clearly see that the different dependent variables exert a big influence on the obtained regression coefficient.
We do see a lot of value in investigating how analytical choices affect a statistical outcome of interest. However, we strongly caution against using
specr as a tool to somehow arrive at a better estimate. Running a specification curve analysis does not make findings any more reliable, valid or generalizable than a single analyis. The method is only meant to inform about the effects of analytical choices on results, and not a better way to estimate a correlation or effect.
We hope that our package can help in this process. If you find any bugs or would like to propose changes, please do so via the respective Github page: https://github.com/masurp/specr/. We are looking forward to feedback and comments.
- Gelman, A. & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf↩︎
- Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2019). Specification Curve: Descriptive and Inferential Statistics for all Plausible Specifications Available at: http://urisohn.com/sohn_files/wp/wordpress/wp-content/uploads/Paper-Specification-curve-2019-11-16.pdf↩︎
- E.g., Steegen, S. Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing Transparency Through a Multiverse Analysis. Perspectives on Psychological Science, 11(5), 702-712. https://doi.org/10.1177/1745691616658637; Rohrer, J., Egloff, B., Schmuckle, C. (2017). Probing Birth-Order Effects on Narrow Traits Using Specification-Curve Analysis. Psychological Science, 28(12), 1821–1832. https://doi.org/10.1177/0956797617723726; Orben, A., Przybylski, A. K. (2019). The association between adolescent well-being and digital technology use. Nature Human Behaviour, https://doi.org/10.1038/s41562-018-0506-1; Orben, A., Dienlin, T., & Przybylski, A. K. (2019). Social media’s enduring effect on adolescent life satisfaction. Proceedings of the National Academy of Sciences of the United States of America. https:///doi.org/10.1073/pnas.1902058116↩︎