Depends on whether you do things on the design end to address each of these limitations, e.g. placebo tests, a Solomon design, a validation study, etc.
It’s unfair to take a vague checklist approach to deciding whether a study is high quality. We don’t like this for observational studies either.
Posts by Christopher Boyer
That would also tend to violate parallel trends; see, for instance, Audrey Renson’s paper mentioned elsewhere in the thread: arxiv.org/pdf/2505.03526
Not for a treatment delivered after t0, which is the canonical DiD design.
With the implication that parallel trends imposes a parametric restriction on U -> Y(t0) and U -> Y(t1).
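To make the restriction concrete, here’s one way to write it (a minimal sketch in my own notation; the structural model shown is sufficient for parallel trends, not necessary):

```latex
% Parallel trends in the canonical 2x2 DiD:
\[
E[\,Y_{t_1}(0) - Y_{t_0}(0) \mid A=1\,] \;=\; E[\,Y_{t_1}(0) - Y_{t_0}(0) \mid A=0\,]
\]
% One sufficient structural model:
%   Y_t(0) = \beta_t + g(U) + \varepsilon_t,  with  E[\varepsilon_t | A, U] = 0.
% U may differ across exposure groups (unmeasured confounding is allowed),
% but its effect g(U) must be additive and time-invariant: that is the
% parametric restriction on U -> Y(t0) and U -> Y(t1).
```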
[Image: directed acyclic graph showing parallel trends under a difference-in-differences design, as well as a violation.]
Here’s how it’s often represented graphically in Epi.
From journals.lww.com/epidem/fullt...
A PhD candidate at Harvard Nutrition steps forward, head bowed. Walter Willett, dressed in full regalia, solemnly reaches into two fishbowls. One is full of slips of paper with nutrients/foods on them. The other, diseases. The random pair he withdraws is their dissertation topic. *Trumpets sound*
What I want: a handful of rigorous, randomized evaluations of AI use in science with clear protocols of use, careful measurement, and real endpoints.
What I am getting: a million sloppy studies that either use AI to crawl massive publication databases or run little trials reporting nonserious benchmarks.
Would suggest Eric TT’s papers on universal difference-in-differences and generalized difference-in-differences, as they are essentially equi-confounding and calibration-correction approaches but generalize beyond just the additive scale and allow for different outcome types.
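For intuition, equi-confounding can be stated on more than one scale; a sketch in my own notation (the papers’ actual conditions, e.g. for binary outcomes, differ in the details):

```latex
% Additive equi-confounding (classic DiD / parallel trends):
\[
E[Y_{t_1}(0) \mid A=1] - E[Y_{t_1}(0) \mid A=0]
  \;=\; E[Y_{t_0}(0) \mid A=1] - E[Y_{t_0}(0) \mid A=0]
\]
% A multiplicative analogue (e.g., for counts or rates):
\[
\frac{E[Y_{t_1}(0) \mid A=1]}{E[Y_{t_1}(0) \mid A=0]}
  \;=\; \frac{E[Y_{t_0}(0) \mid A=1]}{E[Y_{t_0}(0) \mid A=0]}
\]
```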
Very interesting! I’ve become quite obsessed with negative controls/tests lately 😅
I’m trying to put several methods in here: github.com/etverse/negatr
Oh nice! For the difference-in-differences approach, are you assuming additive-scale equi-confounding?
For a longer discussion of the underlying identification assumptions and their plausibility in real-world scenarios, see our recent Epidemiology paper:
journals.lww.com/10.1097/EDE....
New blog post:
christopherbboyer.com/posts/2025-1...
A simple simulation to show when/how test-negative results can be used to correct unmeasured confounding.
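The gist, in a minimal toy version (my sketch, not the post’s actual code; it assumes the exposure has no effect on the negative-control outcome and that confounding is equal on the additive scale):

```python
# Negative-control-outcome correction under additive equi-confounding.
# U confounds both the real outcome Y and the negative control N;
# A does not affect N, so N's A-contrast estimates the confounding bias.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
U = rng.normal(size=n)                       # unmeasured confounder
A = rng.binomial(1, 1 / (1 + np.exp(-U)))    # exposure depends on U
Y = 1.0 * A + 2.0 * U + rng.normal(size=n)   # true effect of A on Y is 1.0
N = 0.0 * A + 2.0 * U + rng.normal(size=n)   # negative control: no A effect,
                                             # same confounding by U
crude = Y[A == 1].mean() - Y[A == 0].mean()      # biased by U
bias_hat = N[A == 1].mean() - N[A == 0].mean()   # confounding alone
print(crude, bias_hat, crude - bias_hat)         # corrected estimate ~ 1.0
```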
Maybe they’ll reconsider after they’ve had time to put their phone down and d-connect.
Hmmm blocked by DAG… you must have really confounded them 🙃
I would also say that the original vanilla IV, DiD, and RDD were much easier to implement (I guess if you do the two-stage linear model version of PCI it’s about as easy as IV, but I’m not sure this was widely disseminated).
In one sense you’re right, but don’t IV and DiD themselves fit neatly within the PCI framework as special cases (i.e., DiD as a form of negative outcome control with additional parametric restrictions, and IV as an unconfounded negative exposure control)?
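Roughly, the mapping I have in mind (my gloss; notation mine):

```latex
% DiD as negative outcome control:
%   NCO = Y_{t_0}, the pre-period outcome: unaffected by treatment A but
%   sharing the unmeasured confounding of Y_{t_1}; identification then adds
%   a parametric restriction (e.g., additive equi-confounding).
% IV as negative exposure control:
%   NCE = Z, the instrument: no direct effect on Y except through A
%   (exclusion), and itself unconfounded given covariates.
```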
Better than in another grant application.
And of course we’re happy that, in the end, we ended up with a healthy happy baby; and privileged that we could swing it financially. But it’s a system that needs fundamental reform.
Also, this was in Massachusetts, where at least there was mandated insurance coverage for IVF. And yet we still paid thousands out of pocket.
Glad this is receiving more scrutiny. Our own fertility journey included not only encounters with private-equity-owned clinics but also black-market deals for drugs due to manufactured shortages. It shares many of the predatory tactics of other industries that prey on vulnerable people.
[Image: screenshot of my post]
Big new blogpost!
My guide to data visualization, which includes a very long table of contents, tons of charts, and more.
--> Why data visualization matters and how to make charts more effective, clear, transparent, and sometimes, beautiful.
www.scientificdiscovery.dev/p/salonis-gu...
Today I had to DocuSign some legal agreements for grants and noticed they now offer an AI summary that they warn “may be inaccurate”… Truly, what are we doing here, fam?
Now we have sludge units
2. The relevant identification assumption here isn’t exchangeability anyway; it’s parallel trends, which is both slightly weaker (in that it allows some forms of unmeasured confounding) and stronger (in that it imposes parametric restrictions on possible DGPs).
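Spelled out (my notation; the exchangeability condition shown is the unconditional version):

```latex
% Exchangeability restricts levels of the counterfactual outcome:
%   Y(0) \perp A  \Rightarrow  E[Y_{t_1}(0) | A=1] = E[Y_{t_1}(0) | A=0]
% Parallel trends restricts changes:
%   E[Y_{t_1}(0) - Y_{t_0}(0) | A=1] = E[Y_{t_1}(0) - Y_{t_0}(0) | A=0]
% Weaker: baseline levels may differ by A, so some unmeasured confounding
% is tolerated. Stronger: the equality is tied to the additive scale, so it
% constrains the set of data-generating processes that can satisfy it.
```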
Two points: 1. I guess what I’m saying is that it’s not that conditioning on the group is causing bias, but that the confounding structure could be different for the subgroup. E.g., draw me the DAG/SWIG where conditioning on birth, which is also the treatment variable of interest, is an issue.
I maintain that this is an excellent benchmark for d-type effect sizes:
“Sleep satisfaction & duration declined with childbirth & reached a nadir during the first 3 months postpartum, with women more strongly affected (satisfaction d = −0.79; duration −62 min, d = −0.90).”
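(Back-of-the-envelope on the scale, assuming the usual d = mean difference / SD:)

```latex
% d = \Delta / SD  \Rightarrow  SD \approx 62 / 0.90 \approx 69 \text{ min},
% so a ~62-minute average loss in sleep duration is about 0.9 SDs.
```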
Yes! And to be clear, as a father of a 3- and a 1-year-old, I still think this is among the cleanest causal effects one could imagine haha.
E.g., doing post-exposure prophylaxis when my mean time to prophylaxis is 3 days may be very different than when the mean time is 7 days, even if both effects are well defined and identified.
They may be systematically different, but I could still run a well-defined trial of a post-exposure intervention by enrolling them and randomizing among the subgroup who survives. My effect would just entirely depend on the distribution of who survives and would be highly “particularistic”.
E.g., I’m interested in the effects of post-exposure interventions in infectious disease. For short-incubation-period infections, every day that I delay post-exposure defines a different subgroup of people surviving without the outcome; a toy sketch of that point follows.
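Here is the toy sketch (entirely my own; the gamma frailty and the prophylaxis-halves-the-remaining-hazard mechanism are illustrative assumptions). Enrolling at day 3 vs day 7 selects different surviving subgroups, so the same intervention yields different risk differences:

```python
# Survivors at a later enrollment day are systematically lower-risk,
# so the trial's effect estimate depends on the enrollment delay.
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
frailty = rng.gamma(shape=2.0, scale=0.5, size=n)  # heterogeneous baseline risk
t_onset = rng.exponential(1.0 / frailty)           # days from exposure to onset

for delay in (3, 7):                               # enrollment delay in days
    alive = t_onset > delay                        # still outcome-free at enrollment
    resid = t_onset[alive] - delay                 # residual time to onset
    rx = rng.binomial(1, 0.5, size=resid.size)     # randomize among survivors
    resid_rx = np.where(rx == 1, resid * 2.0, resid)  # halved hazard => doubled time
    rd = (resid_rx[rx == 1] <= 14).mean() - (resid[rx == 0] <= 14).mean()
    print(f"day {delay}: survivor mean frailty {frailty[alive].mean():.2f}, "
          f"14-day risk difference {rd:+.3f}")
```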