Models as Prediction Machines: How to Convert Confusing Coefficients Into Clear Quantities

Psychological researchers usually make sense of regression models by interpreting coefficient estimates directly. This works well enough for simple linear models, but it becomes challenging for more complex models with, for example, categorical variables, interactions, nonlinearities, or hierarchical structures. Here, we introduce an alternative approach to making sense of statistical models. The central idea is to abstract away from the mechanics of estimation and to treat models as “counterfactual prediction machines,” which are subsequently queried to estimate quantities and conduct tests that matter substantively. This workflow is model-agnostic; it can be applied in a consistent fashion to draw inferences from a wide range of models. We illustrate how to implement this workflow with the marginaleffects package, which supports more than 100 different classes of models in R and Python, and present two worked examples. These examples show how the workflow can be applied across designs (e.g., observational studies, randomized experiments) to answer different research questions (e.g., about associations, causal effects, effect heterogeneity) while facing various challenges (e.g., controlling for confounders in a flexible manner, modeling ordinal outcomes, and interpreting nonlinear models).
Figure: Flowchart of the modeling workflow. First, select and fit a model suited to the research question; next, compute estimands (predictions, comparisons, slopes), making choices about unit-level vs. average estimates and about the scale; finally, perform statistical testing (uncertainty, null and equivalence tests). A side note contrasts this with a standard workflow that focuses directly on model coefficients.
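To make the "counterfactual prediction machine" idea concrete, here is a minimal sketch in plain Python. It is not the marginaleffects API: the logistic-model coefficients, covariate values, and function names are all made up for illustration. The point is the query pattern from the flowchart: take a fitted model, predict the outcome for every unit under two counterfactual values of a variable, and average the unit-level differences (an "average comparison" on the probability scale).

```python
import math

# Hypothetical "fitted" logistic model: coefficients are invented for
# illustration, not estimated from any real dataset.
b_intercept, b_treat, b_age = -1.0, 0.8, 0.03

def predict(treat, age):
    """Predicted probability of the outcome under the logistic model."""
    eta = b_intercept + b_treat * treat + b_age * age
    return 1 / (1 + math.exp(-eta))

ages = [23, 35, 41, 58, 64]  # hypothetical observed covariate values

# Counterfactual queries: predict for every unit as if treated (treat=1),
# then as if untreated (treat=0), and average the unit-level differences.
diffs = [predict(1, age) - predict(0, age) for age in ages]
avg_comparison = sum(diffs) / len(diffs)
print(f"Average comparison (risk difference): {avg_comparison:.3f}")
```

Note how the coefficient b_treat = 0.8 is a log-odds ratio that is hard to interpret directly, while the averaged prediction contrast is a plain difference in probabilities; the same querying logic carries over unchanged to ordinal, hierarchical, or other nonlinear models.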
I really enjoyed writing this one because marginaleffects just makes so much sense! It's a bit like with causal graphs -- a great tool to make sense of stuff, so why not teach it early on to make life a bit easier?