Maybe journals aren’t the root cause of problems in academia, just a symptom…
Posts by Pierre-Paul Axisa
Exam prep for a research and statistics exam in 2026!
sachaepskamp.com/PL2132_games...
IMO, the real cost of low success rates is not the time wasted on reviews but the time researchers collectively waste developing grant proposals that will not be funded. 1/2
LLM‑Assisted Software Engineers Can Only Be Mad (And Will Block Smartasses)
An impromptu essay on the impossible equilibrium of AI‑augmented coding
(1)
Given the abilities of French gov agencies to implement change, it will probably be pushed back to 2036. Last year Inserm changed their accounting software, and it’s been a complete mess ever since, with many providers not paid and refusing new orders
Meme about researchers getting raw data
Imagine you’re starting the re-entry manœuvre and you see this 😅
👀 Absolute banger paper. Leveraging cattle/pig GWAS/eQTL to help explain the missing regulatory gap in humans. Consistent with previous works, selection makes identifying relevant eQTLs in humans challenging.
www.biorxiv.org/content/10.6...
I guess both? I see skilled computational researchers building packages/libraries fast with LLMs, which should increase this. And I see less experienced researchers copy-pasting a bunch of LLM-generated code blocks, which will be a nightmare to track in terms of reproducibility/replication
How I feel logging into a virtual machine, then ssh-ing into a cloud server, then connecting to an Rstudio server to access my data.
“Everything that scientists say hinges on the specifics of what they actually did. If you can’t (or don’t) evaluate the details, you can’t evaluate the science. I’m grateful to the authors for providing access to those details.”
I have a SNP joke, but I don’t think you’d get the reference.
Often I feel like the most accurate would be to have the trainee as sole author and the PI in acknowledgments 🤪
they can 100% amplify good judgment. You can put it to work on refactoring, testing, docs, debugging, product consistency, etc. But that requires _you_ to focus on it. And _that_ is hard.
Waiting for someone to start a new base R vs tidyverse war all over this. Any minute now ;)
🥇 The €100K Institutional Award goes to the Brazilian Reproducibility Initiative - a nationwide effort to evaluate lab biology research and the largest coordinated replication effort in the field worldwide, catalyzing lasting improvements across Brazil's research ecosystem. Congratulations!
"You've shown why we tell participants not to share their data online"
← So UK Biobank blame cohort members for their own re-identification? From just two quasi identifiers!
This is victim-blaming in place of data governance. Yet again, serious questions need asking about UK Biobank governance.
One way we made it fast was by mapping with simpleaf (github.com/COMBINE-lab/...) 🥳 This gives a ~50x speed-up over CellRanger, and natively produces spliced vs unspliced reads. Percent spliced reads is IMO *the* most important QC metric for single nuclei data (link.springer.com/article/10.1...).
"Preprint servers are a time machine, they move everyone forward 12 months and speed up the exchange of ideas"
ht @pedrobeltrao.bsky.social www.evocellnet.com/2021/06/a-no...
At this point I think my peer review has gotten to the point where I couldn't be replaced by an LLM, but I could be replaced with copy-and-paste 'The authors should add their statistical methods to the manuscript and carefully identify where and how they aggregate data'
Yes I have this hesitation about CLI vs GUI but you’re probably right, better to know the commands and it’s not like there are a lot of them you need to know to get started
I get contacted multiple times a week by users; it's really exciting to hear about all the cool science🌟
If {ukbrapR} supports your research, please cite it so I can track its impact (this really helps me!)
Pro-tip: Use the {grateful} package to automatically include {ukbrapR} in your bibliography
Also for data sharing. I don’t know if it exists already? I’ve come across many instances, for example, where TPM or CPM matrices are shared but not raw counts. It’s a pity when many tools require raw counts.
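A quick sketch of why a CPM matrix can't stand in for raw counts: the normalization divides out each sample's library size, so two samples with very different sequencing depth can end up with identical CPM values. The function name and the toy counts below are made up for illustration, not taken from any particular dataset or tool.

```python
def counts_to_cpm(counts):
    """Scale one sample's raw gene counts to counts per million (CPM)."""
    library_size = sum(counts)  # total reads in this sample
    if library_size == 0:
        raise ValueError("sample has no reads")
    return [c * 1_000_000 / library_size for c in counts]


# Hypothetical toy samples: same composition, 10x difference in depth.
shallow = [10, 30, 60]
deep = [100, 300, 600]

cpm_shallow = counts_to_cpm(shallow)
cpm_deep = counts_to_cpm(deep)

# Once only CPM is shared, the two samples are indistinguishable:
# the library sizes (100 vs 1000 reads) are gone, so the raw counts
# cannot be recovered from the CPM values alone.
print(cpm_shallow == cpm_deep)
```

Going from raw counts to CPM is a one-liner, going back is not possible without the library sizes, which is the argument for sharing raw counts alongside any normalized matrix.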
I die a little every time I hear "oh I don’t use projects I just setwd()". Now that I've voiced my concern to colleagues, I have to give a presentation on project-oriented workflows. I think it’s also a good time to introduce git. What are your favorite teaching materials to get started? #rstats
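For anyone wondering what "project-oriented" means in practice, here is a minimal sketch of the idea in Python rather than R: instead of hard-coding a working directory, look upward for a marker that identifies the project root and build every path relative to it. This is in the spirit of R's {here} package, not a port of it. The function name and the choice of `.git` as the default marker are assumptions for illustration.

```python
from pathlib import Path


def find_project_root(start, markers=(".git",)):
    """Walk upward from `start` until a directory containing a marker is found.

    `markers` is a hypothetical default; any file or folder that reliably
    sits at the top of the project would work.
    """
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate
    raise FileNotFoundError(f"no project marker {markers} found above {start}")
```

With this, a script anywhere in the project can do something like `find_project_root(Path.cwd()) / "data" / "counts.csv"` (a hypothetical file), and the same code runs unchanged on a colleague's machine, which is exactly what an absolute path passed to setwd() prevents.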
New paper showing that much of the apparent success of protein language models in predicting mutational effects is a mirage: These models mostly memorize sites. 1/
www.biorxiv.org/content/10.6...
quarto-revealjs-editable has a new release with 5.0.0!!
It is now much more stable, should now work everywhere, plus:
- Rotation support
- Undo/Redo
- Copy qmd to Clipboard
- keyboard controls
github.com/EmilHvitfeld...
#quarto
Issue 35 of #rdmweekly is out!
Including:
➡️ Building Realistic Fake Datasets with Pointblank @richmeister.bsky.social
➡️ Rethinking TXT Files @kbriney.bsky.social
➡️ Economic Benefits of OS @plos.org
➡️ What Is Data Governance? @thegovlab.org
and more!
Link: rdmweekly.substack.com/p/rdm-weekly...
📢 New preprint out!
tl;dr: Publish your code, add clear READMEs!
138 participants assessed data & code archiving practices across 1861 papers published at 7 @britishecologicalsociety.org journals, identified gaps & offered recommendations for improvement
🔗 doi.org/10.32942/X26...
"Dashboards" are very frustrating for exactly this reason. Needlessly fancy dashboards look impressive to people who don't use them.