Gennady Gorin (@goringennady) Bsky

Unfalsifiable by Design: A Year of Trying and Failing to Reproduce a Human Microbiome and Autism Study
The myth of open data, reproducibility, responsibility, and accountability in science, and your role in it

How every layer of science's "self-correcting machinery" failed when Iva Veseli and I simply wanted to reproduce the findings of a high-profile study on gut microbiome and autism:

merenlab.org/2026/04/15/u...

6 days ago 161 80 12 21

I am holding out for measure theory in Vol. VII

1 week ago 1 0 1 0

If you use dim. reduction, you may be interested in two recent preprints we've posted on contrastive PCA:
The Rayleigh Quotient and Contrastive Principal Component Analysis I & II
w/ Maria Carilli & Kayla Jackson. They cover a lot of ground from theory to practice. 1/🧵

1 week ago 38 12 1 1

Latest from Shendure & Qiu labs (@cxqiu.bsky.social)! We combined a new 4M cell mouse whole embryo scATAC-seq atlas (E10-P0), millions of 'evolutionarily coherent' orthologs from 241 mammalian genomes (Zoonomia), and the CREsted CNN framework (@steinaerts.bsky.social).

1 week ago 39 16 1 0

Ordinal Modeling

Friendly reminder that ordinal values admit an ordering, but no notion of distance. Without a notion of distance even the concept of linear models is ill-defined. Please do not use ordinary least squares to analyze ordinal data.

For gratuitous discussion see betanalpha.github.io/assets/chapt....

2 weeks ago 37 6 2 1
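
One way to respect ordering without assuming distances is an ordered-logistic (cumulative link) model. The sketch below is an illustration under that assumption, not necessarily the construction used in the linked chapter; the function name `ordinal_probs` and the example cutpoints are hypothetical. Categories are defined only by increasing cutpoints on a latent scale, so relabeling the categories with any other increasing codes changes nothing.

```python
import math


def ordinal_probs(eta, cutpoints):
    """Category probabilities for an ordered-logistic model.

    P(y <= k) = logistic(c_k - eta), with strictly increasing cutpoints c_k.
    Returns len(cutpoints) + 1 probabilities, one per ordered category.
    """
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Cumulative probabilities, closed off with P(y <= last category) = 1.
    cdf = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    # Differences of adjacent cumulative probabilities give category masses.
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]


# Three ordered categories; a larger latent predictor shifts mass upward.
print(ordinal_probs(0.0, [-1.0, 1.0]))
print(ordinal_probs(2.0, [-1.0, 1.0]))
```

The only numbers the model consumes are the latent predictor and the cutpoints, so no spacing between category labels is ever assumed.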

Monod: model-based discovery and integration through fitting stochastic transcriptional dynamics to single-cell sequencing data - Nature Methods
Monod fits biophysically motivated models to single-cell transcriptomics data, empowering multifaceted and integrative insights of gene expression dynamics, stochasticity and regulation.

Oh, fantastic. You might also be interested in www.nature.com/articles/s41..., where we go into a great deal of detail on the multimodal and technical modeling.

2 weeks ago 2 0 0 0

You know what's better than inflating the variation of your observational model to heuristically accommodate "outliers"? Actually modeling the contaminating data generating process.

2 weeks ago 15 5 0 0
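
A minimal sketch of what modeling the contaminating process can look like, assuming a two-component normal mixture: a narrow core component plus an explicit, wider contamination component. The component shapes, parameter values, and function names are assumptions made for illustration only.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and scale sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def contamination_posterior(x, mu, sigma, contam_mu, contam_sigma, contam_frac):
    """Posterior probability that x came from the contaminating component.

    The observational model is a mixture:
    (1 - contam_frac) * Normal(mu, sigma) + contam_frac * Normal(contam_mu, contam_sigma).
    """
    core = (1.0 - contam_frac) * normal_pdf(x, mu, sigma)
    contam = contam_frac * normal_pdf(x, contam_mu, contam_sigma)
    return contam / (core + contam)


# A typical point stays with the core model; a distant one is attributed
# to the contamination component instead of widening the core's sigma.
print(contamination_posterior(0.0, 0.0, 1.0, 0.0, 10.0, 0.1))
print(contamination_posterior(8.0, 0.0, 1.0, 0.0, 10.0, 0.1))
```

Because the core scale is left alone, inferences about the bulk of the data are not loosened just to make room for a few distant points.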

Eukan: a fully automated nuclear genome annotation pipeline for less studied and divergent eukaryotes academic.oup.com/nargab/artic... 🧬💻🧪 github.com/BFL-lab/eukan

1 month ago 11 5 0 0

“Early Modern Memes: The Reuse and Recycling of Woodcuts in 17th-Century English Popular Print” by @katiesisneros, on the interplay of repetition, context + meaning in woodcuts and the parallels to meme culture of today: publicdomainreview.org/essay/e...

1 month ago 62 13 0 2

As for ethics, that's for the ethicists. Which is to say there's a lot of ground aside from ethics and bare usefulness. In the petroleum industry just as much as the tech industry. I should hope that ground would also come into it.

1 month ago 0 0 0 0

Sure. Plenty of people post to massive audiences about bioplastics imminently taking over, or about photovoltaics being a dead end. Both are misinformation-adjacent, and maybe these people would be better off starting from the stronger point that the oil industry has been very useful.

1 month ago 1 0 1 0

Sure. So what?
Or to put it another way: I am a chemical engineer from Texas. I can go on about how useful the oil industry is and has been for 150 years or so. And I would be correct. But so what? Is that a convincing middle ground to stop at? Maybe it is. But I am not sure everyone would agree.

1 month ago 0 0 1 0

I do not believe there is. Because beginning and ending the discussion there (so treating every other aspect as beneath consideration or immaterial) is itself extreme.

1 month ago 0 0 1 0

FormatMyPaper - The End of Manuscript Formatting
Paste your paper, choose your journal, and let our AI handle the tedious rest. Get back to the science.

"clever" satire in the genre of formatmypaper.com gets tedious if there is no trace of human execution beyond half an idea behind it. It's because layers of contempt for the idea and the reader are the opposite of compelling

1 month ago 0 0 0 0

Every paper invents its own idiosyncratic and unique analysis. In seeming contradiction, they also crib or inherit analyses, whether sensible or not, from previous papers. Perhaps someday these best practices will coalesce to be summarized in such a paper (they would have to be created first)

1 month ago 4 0 0 0

The Devil and Daniel Webster AI

1 month ago 0 0 1 0

Einstein AI C&D charging
bsky.app/profile/gori...

1 month ago 1 0 1 0

RIP ishmael you would have loved this 🐋

1 month ago 9 3 0 0

Ambient RNA & barcode swapping are serious issues in single-cell genomics. Existing tools include CellBender, scAR, DecontX & SoupX. We have developed CellSweep, which is faster (in some cases by a lot) and much more accurate. Extensively tested and benchmarked. www.biorxiv.org/content/10.6... 1/

1 month ago 58 15 2 0

New COSIG update! 3️⃣4️⃣

1 month ago 2 1 0 0

Gennady Gorin, Ph.D. Senior Scientist applying stochastic models for therapeutic discovery

I am currently looking for work, so do not hesitate to reach out if this experience sounds interesting! You can find the rest of my portfolio at gennadygorin.github.io. 13/13

1 month ago 0 1 0 0

This is a complex and exciting topic, and we have a lot to learn about modeling, artifact detection, and the sheer range of unexpected biology we can learn from existing experiments!

Big thanks to @lindabgoodman.bsky.social, and to the Gracheva/Bagriantsev labs that published the data! 12/

1 month ago 0 0 1 0

RNA trafficking and local protein synthesis in dendrites: an overview - PubMed
It is now widely accepted that mRNAs localize to dendrites and that translation of these mRNAs is regulated in response to neuronal activity. Recent studies have begun to reveal the underpinnings of these processes and to underscore the importance of local protein synthesis to synaptic remodeling an …

But this is a squirrel hypothalamus dataset, and here we also see the RNA coding for hypothalamic neuropeptides, distributed as in the real cells: Pomc, Sst, Npy, Agrp, and Cartpt! Perhaps a trace of RNA secretion or trafficking to the dendrites? 11/

1 month ago 0 0 1 0

Yet even if we do our filtering, some genes still come up non-Poisson, and many of them end up mutually correlated! We have seen this before with hemoglobins and mitochondrial genes, because their packaging is incorporated into empty drops and the RNA are captured together. 10/

1 month ago 0 0 1 0

Adjust your read filtering accordingly: some TSO content is fine, but feature barcoding primers are bad news.

Not all the artifacts are so easy to spot, and I go into a great deal of detail about different classes of issues. These are only a starting point. 9/

1 month ago 0 0 1 0

Feature barcoding primers!

Somehow 👀 the primers are missing their UMIs and antibody capture sequences. The result: vast numbers of reads with TSO/primer/FB cell barcode/poly(A). If the barcode/poly(A) is close enough to a real transcript, it gets counted, giving outliers. 8/

1 month ago 0 0 1 0

Some of them are TSO artifacts that happen to be similar to poly(A) regions in the transcriptome. Some are real reads that happened to get filtered out in the second pipeline. But it turns out that the vast majority come from... 7/

1 month ago 0 0 1 0

It turns out that some of them are pipeline artifacts! If we rerun the same dataset with a different aligner, many of these outliers disappear. Can you guess what these reads really are? 6/

1 month ago 0 0 1 0

But this preprint is not about the molecular soup: it is about the genes that stubbornly refuse to look like soup. They are clear outliers and look nothing like Poisson. 5/

1 month ago 0 0 1 0

Since the statistics of empty drops are so simple, we can use them to interrogate models of technical noise. Empty drops are nearly Poisson; by investigating how much they diverge from the Poisson, we can say something about the experiment. 4/
www.biorxiv.org/content/10.1...

1 month ago 0 0 1 0
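
A minimal sketch of the check described in post 4/, assuming the Fano factor (sample variance over sample mean) as the measure of divergence from Poisson; the preprint may well use a different statistic. The function names and the simulated rates are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import mean, variance


def poisson_sample(rng, lam):
    """Draw one Poisson variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def fano_factor(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined when all counts are zero")
    return variance(counts) / m


def simulate_counts(rng, n_droplets, rates, weights):
    """Counts for one gene; each droplet draws its rate from a discrete mixture."""
    return [
        poisson_sample(rng, rng.choices(rates, weights)[0])
        for _ in range(n_droplets)
    ]


rng = random.Random(0)
# One shared rate across droplets: the "soup-like" Poisson case.
soup_like = simulate_counts(rng, 5000, [0.5], [1.0])
# A minority of droplets with a much higher rate: an overdispersed outlier.
outlier_like = simulate_counts(rng, 5000, [0.1, 5.0], [0.9, 0.1])
print(fano_factor(soup_like), fano_factor(outlier_like))
```

Under these assumptions the first value should sit near 1 and the second well above it, which is the sense in which a gene can look nothing like Poisson.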
