Posts by Lior Pachter

Moreover, I think the data analogy is on point. "Research parasites" are on the whole fantastic, driving forward tons of science. Even if on occasion there is noise. I bristle at the idea that certain groups start to dictate what is and is not acceptable. That's gatekeeping, not community building.

5 days ago 1 0 1 0

... mathematicians (e.g. Terence Tao), have focused on the (half) full portion of the glass. Individuals who previously had less knowledge in an area can now potentially contribute meaningfully in ways not possible before. The same is true for software. And that should be encouraged.

5 days ago 0 0 1 0

There is an interesting point here that I agree with: there is one difference between data and software that makes software more akin to math: rapidly generated "proofs" in math are noise for the community, and can be time consuming to check. However...

5 days ago 0 0 1 0

Sorry for the confusion. I linked to that issue because "tilting at windmills" suggests I'm wasting my time on a pointless matter, but the fact is that there is active debate on this matter.

My position is that the attempt of Longo and Drazen to police the reuse of published data was misguided.

5 days ago 1 0 1 0

I have no idea whether their code is slop or whether it's poorly attended. I haven't looked at it.

5 days ago 0 0 1 0

Maybe. Maybe not. github.com/henriksson-l...

6 days ago 0 0 1 0
Hiding in plain sight: a research parasite’s perspective on new lessons in old data Abstract. High-throughput techniques that measure thousands of analytes at once have become ubiquitous features of biological research. The increasing expe

(Data) research parasites nowadays have tons of exciting opportunities. I love this piece by @skinnider.bsky.social: academic.oup.com/gigascience/...

(Software) research parasites do as well!

6 days ago 2 1 0 0

Of course people can do problematic things with open source software. That's been the case for a long time, and certainly now the ability to be malicious is expanded with LLMs. However, trying to codify problematic scenarios is just an exercise in restricting freedom for others to benefit from OSS.

6 days ago 1 0 2 0
The Research Parasite Awards

To paraphrase @iddux.bsky.social, "I propose to extend the research parasite award to those who used someone else's code to do some really cool sh*t". Their permission or collaboration is absolutely not required. researchparasite.com

6 days ago 0 0 1 0
Data Sharing | NEJM The aerial view of the concept of data sharing is beautiful. What could be better than having high-quality information carefully reexamined for the possibility that new nuggets of useful data are l...

"Setting out principles" just sounds like drafting a code of conduct to restrict the use of open source software in much the same way that Longo & Drazen advocated for restricting the use of published data in 2016: www.nejm.org/doi/full/10....

6 days ago 0 0 1 0

It's possible that more restrictive licensing can help but I doubt that will matter in practice (who will enforce academic licensing terms?)

6 days ago 0 0 1 0

So the specific versioning v10.0 would not work because the version number tracks my birthday (currently 0.52). But your point is well taken, and yes, that would be annoying but I think something like it is likely to happen (also I've seen much worse in academia).

6 days ago 0 0 1 0
Hausdorff The philosopher Felix Hausdorff, famous for his foundational philosophical work Das Chaos in kosmischer Auslese  (Chaos in Cosmic Selection) in which he rejects metaphysics, also did mathematics as…

Hausdorff. liorpachter.wordpress.com/2026/04/14/h...

6 days ago 6 1 2 0

If someone writes a paper pretending they discovered pseudoalignment without attributing the discovery to the kallisto authors, and then copies the kallisto software (algorithm, parameters, etc.) to showcase it but gives the software a new name... that's plagiarism and not ok.

1 week ago 0 0 0 0

I have no problem with either Callisto or rullisto. Of course rullisto is more valuable, but people are free to do poor work if they wish. kallisto is BSD-2 and the license allows for that.

BUT....

1 week ago 0 0 2 0
rewrite.bio

I brought nf-core into the conversation because it's yours, and it's MIT, and so nothing in rewrite.bio matters. Anyone can port or modify it, period. And that's not only ok, it should be encouraged. If you think "emulate exactly" matters you should change the license.

1 week ago 0 0 1 0
rewrite.bio

rewrite.bio mentions licensing only in section 5.3 in the context of advocating for open source and compliance. At the same time the remaining four sections are a call for onerous requirements which are completely irrelevant under most licensing (e.g. section 2.2 "emulate exactly").

1 week ago 0 0 2 0

Behind every externally lauded ‘disruptor’, there seems to be a half dozen (minimum) actual subject matter experts shaking their heads “no” and grimacing dramatically. Good to keep in mind, for those of us prone to a bit of epistemic…moseying.

1 week ago 12 3 1 0
GitHub - pachterlab/XgenePy Contribute to pachterlab/XgenePy development by creating an account on GitHub.

Within my lab edgePython made possible XgenePy which in turn is facilitating another lab to incorporate the method in another tool: github.com/pachterlab/X...

Together these examples make clear to me that porting with LLMs is immediately useful.

1 week ago 0 0 0 0

Moreover, the idea (not software) described in the preprint is discussed in the dreampy paper (which provides an alternate approach): www.biorxiv.org/content/10.6...

1 week ago 0 0 1 0

4. edgePython has already been used for multiple other projects (single-cell, because it's in Python). For example for Allos: www.biorxiv.org/content/10.6...

1 week ago 0 0 1 0

The relevance of single-cell is that I also ported large parts of NEBULA and combined with edgeR into edgePython to create a new single-cell tool.

3. Regarding a pull request in the original language, that's an interesting thought but a lot of extra work. I'm also less comfortable in R than Python.

1 week ago 0 0 1 0

The preprint could have been a README, but a preprint has a DOI and is permanent, and that higher bar provides more confidence in the port.

2. A Python port of edgeR is not just a matter of language. The single-cell ecosystem is in Python, so there is immediate benefit. That's why I did it.

1 week ago 0 0 1 0

I ported edgeR to Python (along with parts of NEBULA) and wrote a preprint: www.biorxiv.org/content/10.6...

Some responses to your questions / comments:

1. I wrote the preprint because the port required demonstrating specific results (parity, runtime, results with a new method).

1 week ago 0 0 1 0

Thanks for the invitation. However, FYI this event has not made me angry. It has made me sad. I do think that engaging with it will make me angry. Since I'd rather be happy than angry, during the event I'm planning to spark productive conversations by working on research projects with my students.

1 week ago 20 0 0 0

Tl;dr: two preprints on contrastive PCA w/ Maria Carilli and Kayla Jackson solve the non-spatial setting, spatial contrastive PCA, and contrastive functional PCA. www.biorxiv.org/content/10.1...
www.biorxiv.org/content/10.6... 17/17

1 week ago 3 2 0 0

This project was tons of fun. It started with a journal club reviewing the @jameszou.bsky.social contrastive PCA paper last fall. Maria Carilli and Kayla Jackson have been incredible in figuring out all the details, including working out all the math (in the supplements). 16/

1 week ago 3 0 1 0
GitHub - pachterlab/rhopca Contribute to pachterlab/rhopca development by creating an account on GitHub.

k-ρPCA and f-ρPCA are also implemented as part of the github.com/pachterlab/r... package. 15/

1 week ago 1 0 1 0

Thus, ρPCA, k-ρPCA, f-ρPCA are immediately useful across a wide variety of applications. Moreover, they can be combined (e.g. kernel weighted PCA on basis coefficients). The interpretability is especially useful in biology, but we expect also in other scientific domains. 14/

1 week ago 0 0 1 0

Functional PCA has a long history dating back to the 1990s. I think the contrastive version we introduce via the Rayleigh quotient is going to be very useful. We showcase the power on a bulk RNA-seq time series, finding relevant and interesting differentially variable genes. 13/

1 week ago 1 0 1 0