My NHB paper is literally about an article where problems with IHS transformations revealed data irregularities that (partly) resulted in that article's retraction. That NHB paper doesn't take any stance on IHS specifications, let alone a *pro* stance. 2/
doi.org/10.1038/s415...
Posts by Jack Fitzgerald
Given my stance on log-like specifications, I was surprised to learn that there's a 'news' article on my paper in Nature Human Behaviour, claiming that it actually advocates for the use of IHS specifications. This is categorically untrue. 1/
t.co/ZGjAmHZhLv
But for full details, nothing will beat reading the paper. Give it a look! 18/
doi.org/10.31222/osf...
For the on-the-go researchers out there, we’ve made teaching slides to make the paper’s findings more digestible. 17/
jack-fitzgerald.github.io/files/Log-Li...
Huge shoutout to the rest of the research team who made this possible: @jopieboy.bsky.social, @fialalenka.bsky.social @essieconomist.bsky.social, and @davidvalenta.bsky.social. 16/
We have a couple of recommendations in the paper on how to deal with the logs-with-zeros problem. But our biggest piece of advice is this: 🛑stop🛑 using log-like specifications. They are actively polluting the literature with spuriously significant results. 15/
Consequently, we find that log-like specifications in our replication sample are statistically significant 40-49% more frequently than in the general causal economics literature, and published test statistics are *really* likely to be just beyond 5% significance thresholds. 14/
You don’t need p-hacking for this to cause problems. If researchers file-drawer statistically insignificant results, or if journals select for statistically significant ones, then the most spuriously significant log-like specifications can become overrepresented in the literature. 13/
This happens because messing with unit scale (or the c in ln(Z+c)) allows you to overfit the data. In sample-split simulation data, the log-like specifications that yield the most spuriously significant results within-sample have the worst out-of-sample predictability. 12/
We show this in simulation evidence: even with a placebo treatment and an outcome made of random noise, you get a >30% increase in rejection rates by mining over unit scalings. We also observe sweet spots in ~21% of our simulation draws. 11/
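A toy version of that exercise (my own sketch, not the paper's code, data, or numbers): regress pure-noise outcomes on a placebo treatment under ln(Z·s + 1), once at a fixed unit scale and once keeping the best |t| across a grid of scalings. Since the mined statistic is the max over the grid, mining can only push the rejection rate up.

```python
import numpy as np

rng = np.random.default_rng(1)
n, draws = 200, 500
scales = 10.0 ** np.arange(-3, 4)   # candidate unit scalings, 0.001x to 1000x

def abs_t(y, d):
    """|t| on the slope from an OLS of y on a constant and d."""
    X = np.column_stack([np.ones_like(d), d])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return abs(beta[1]) / np.sqrt(cov[1, 1])

fixed = mined = 0
for _ in range(draws):
    d = rng.integers(0, 2, n).astype(float)  # placebo treatment: no true effect
    z = np.exp(rng.normal(size=n))           # outcome is pure noise
    z[rng.random(n) < 0.3] = 0.0             # with a mass of zeros
    ts = [abs_t(np.log1p(s * z), d) for s in scales]
    fixed += ts[3] > 1.96                    # commit to the original scale (s = 1)
    mined += max(ts) > 1.96                  # report the 'best' scale instead
print(fixed / draws, mined / draws)          # mined rate can never fall below fixed rate
```

The fixed-scale rejection rate should hover near the nominal 5%; the mined rate sits weakly above it by construction, which is the mechanics behind the inflation described above (the paper's own magnitudes come from its simulations, not this sketch).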
This creates a multiple hypothesis testing problem. There’s no ‘right/wrong’ scale in which to measure a variable and no ‘right/wrong’ constant c to add in ln(Z+c). So you get an infinite number of tests that are all equally valid in theory, yet generally yield different results. 10/
We also discovered that in ln(Z+c) specifications, you can get sweet spots both in unit scale and in constant c. 9/
We discovered that t-statistics in log-like specifications can be non-monotonic in unit scale, creating local optima in t-statistics that can briefly dip into rejection regions. This doesn’t just matter for point estimates: it affects studies’ entire conclusions. 8/
Two of our robustness checks involved scaling variables up or down by a factor of 1000 before transformation. For 38% of estimates, *both* of these checks shrunk t-statistics. This pointed us to the existence of what we call ‘sweet spots’. 7/
These specifications are *really* non-robust. Simply removing the log-like transformation changes 36% of conclusions and flips 12% of estimates to statistically significant coefficients of the opposite sign. Other checks change conclusions for 14-36% of estimates. 6/
We re-analyzed replication data from 46 papers whose main findings are defended by log-like specifications. Using ceteris paribus robustness checks that change one design choice at a time, we find widespread non-robustness and publication bias in these specifications. 5/
Chen & @jondr44.bsky.social (2024, QJE) show that you can get coefficients of any magnitude you want by adjusting the scale of transformed variables before transformation. (Semi-)elasticities and percentage effects should never have this property. 4/
doi.org/10.1093/qje/...
Many recent papers highlight identification problems that arise because these specifications’ results depend on the unit scale of transformed variables. So e.g., regressions on ln(dollars + 1) will give you different coefficients and t-statistics than ln(cents + 1). 3/
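To make that scale dependence concrete, here's a minimal sketch on hypothetical data (my own illustration, not the paper's replication sample): the same OLS regression run on ln(dollars+1) and ln(cents+1) yields different slope coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                          # regressor
dollars = np.exp(0.5 * x + rng.normal(size=n))  # positive outcome
dollars[rng.random(n) < 0.2] = 0.0              # 20% zeros, as in expenditure data

def slope(y, x):
    """OLS slope of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

b_dollars = slope(np.log1p(dollars), x)      # ln(dollars + 1)
b_cents = slope(np.log1p(100 * dollars), x)  # ln(cents + 1): same data, new units
print(b_dollars, b_cents)                    # different 'semi-elasticities'
```

With pure ln(Z), rescaling only shifts the intercept (ln(cZ) = ln c + ln Z), so the slope is unit-invariant; ln(Z+1) loses that property because the +1 is not rescaled along with Z.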
If you have 0s in your data, you can’t run log specifications without dropping those observations, because ln(0) is undefined. So many researchers replace the log transformation with the ln(Z+1) or inverse hyperbolic sine (IHS) transformations, which look like ln(Z) for large Z but are defined at 0. 2/
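A quick numerical check of the above (my own sketch): numpy's `log1p` implements ln(Z+1) and `arcsinh` implements the IHS, ln(Z + √(Z²+1)). Both are finite at 0, and for large Z they track ln(Z) (the IHS with a constant offset of ln 2).

```python
import numpy as np

z = np.array([0.0, 1000.0])

log1p = np.log1p(z)    # ln(Z + 1): finite at Z = 0
ihs = np.arcsinh(z)    # ln(Z + sqrt(Z^2 + 1)): also finite at Z = 0

print(log1p[0], ihs[0])         # 0.0 0.0 -- whereas ln(0) is undefined
print(np.log(z[1]), log1p[1])   # ln(1000) vs ln(1001): nearly identical
print(ihs[1] - np.log(z[1]))    # IHS - ln(Z) approaches ln(2) ~ 0.693 for large Z
```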
New preprint! We reanalyze 46 papers that use log-like specifications (ln(Z+1), inverse hyperbolic sine etc). We find widespread non-robustness, and we show through theory + simulation how these models drive spurious significance. 1/
doi.org/10.31222/osf...
We're thrilled to open registration for the Utrecht Replication Games. The event will be held at Utrecht University on June 4th. Psych, public health, pol sci and econ studies will be reproduced!
Register here: www.surveymonkey.ca/r/Replicatio...
When your spam targets won't submit so you *demand submission*
Wishing y'all luck today
Had a wonderful time organizing the scientific side of the CBS Replication Games! Thank you to the replicators for your hard work!
For those without institutional access to NHB, Nature has provided the following link, from which you can access my Matters Arising for free: rdcu.be/eYab2
27/27
I want to thank the editors of @nathumbehav.nature.com for taking this matter seriously and keeping in touch with me over the past 21 months. Their commitment to open science made this correction possible. 26/x
In addition to the lives at stake, governments spend hundreds of billions each year on counterterrorism. To determine what policies best deter terrorism is to answer a trillion-dollar question. Unfortunately, this study categorically cannot answer that question. 25/x
For example: arrest rates are computed as (arrests/attacks), but countries can have more arrests than attacks. In most country-years, arrest rates are either >100% or a positive number divided by 0. Belgium apparently had a terrorism arrest rate of 16,600% in 2018. 24/x
I don’t have space in the 1200-word limit of a Matters Arising to cover everything wrong I found with this paper. The more you look, the more you find. Many ‘minor’ details that would be worth a comment in their own right are relegated to the Supplementary Material. 23/x
[Image: line graphs displaying time series of the raw number of terrorist attacks over time in each of 28 EU member states, based on WEA's data. A red box highlights the time series for the United Kingdom.]
The original paper explicitly highlights how much terrorism ‘declined’ in the UK after the COVID-19 pandemic. But the decline after 2020 is only that stark because all terrorism-related variables are imputed to 0, without disclosure, after the UK left the EU in 2020. 22/x
This means that the paper’s panel dataset on terrorism (enforcement) in the EU included country-years where the country *wasn’t part of the EU at all* (not yet a member, or no longer one), assigning all terrorism-related variables to zero for these country-years without telling anyone. 21/x