What is the PhD actually for, especially now that AI can do increasingly more of what we train scientists to do? compbiologist.substack.com/p/what-is-th...
A response to @pracheeac.bsky.social's thought-provoking essay "Free the PhD".
Posts by Arjun Krishnan
Sean Davis & I are hiring a postdoc to work on turning massive public biological data collections into reusable engines for discovery.
#ML / #AI + large-scale omics + open software
Details + apply: cu.taleo.net/careersectio...
seandavi.github.io | thekrishnanlab.org
@cubiomedinfo.bsky.social
A lab using my recent articles on AI use during PhD training doi.org/10.5281/zeno... & doi.org/10.5281/zeno... to kickstart a discussion and draft guidelines for their group is exactly the kind of use I hoped these articles would inspire! Highly recommend reading Ran's post.
I wrote about why every lab should have AI use guidelines, and how to do it.
open.substack.com/pub/blekhman...
@compbiologist.bsky.social and @fishevodevogeno.bsky.social present newly minted Dr. Hao Yuan @yhbioinfo.bsky.social! @michiganstateu.bsky.social Amazing work @ the intersection of computational, biomedical, and evolutionary biology! Hao starts a postdoc in the @edwardmarcotte.bsky.social Lab soon.
Recognition Digest: February 27, 2026 – March 6, 2026
Summary: 0 Recognitions Received by Your Team
Recognitions: Looks like no one in this group has been recognized recently. Start the momentum and send a recognition!
We have a new platform for employees to give and receive kudos, and its weekly digests are brutal!
Enjoyed working w/ Stephen & Agnes on this!
Gave me a chance to think systematically about where AI 🤖 can backstop human 🧑🏼‍🔬 fallibilities in peer review (fatigue, ordering effects, bias) vs. where human judgment remains essential (novelty, feasibility, creative leaps) while grappling w/ the risks.
Peer review reliability is shockingly low. Meta-analyses show reviewer agreement barely above chance, and grant outcomes often depend more on who reviews than what's proposed. Our new preprint with Agnes Urban and Arjun Krishnan @compbiologist.bsky.social: papers.ssrn.com/sol3/papers.... 🧵 1/
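(A toy illustration, not from the preprint.) Reviewer agreement is often quantified with Cohen's kappa, which corrects raw agreement for what two raters would agree on by chance alone; "barely above chance" means kappa near zero even when raw agreement looks respectable. A minimal sketch with hypothetical accept/reject decisions:

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters over categorical ratings."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    # Observed agreement: fraction of items where the raters match.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement from each rater's marginal frequencies.
    p_e = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n)
        for c in categories
    )
    return (p_o - p_e) / (1 - p_e)

# Hypothetical decisions from two reviewers on 10 submissions:
# they agree on 6/10, yet kappa is only 0.2 after chance correction.
a = ["accept", "reject", "accept", "reject", "accept",
     "reject", "accept", "accept", "reject", "reject"]
b = ["accept", "accept", "reject", "reject", "accept",
     "reject", "reject", "accept", "accept", "reject"]
print(round(cohens_kappa(a, b), 2))  # → 0.2
```

With balanced accept/reject marginals, 50% raw agreement would give kappa = 0, i.e. indistinguishable from coin-flipping — which is why chance-corrected statistics matter when reading these meta-analyses.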
8/8 PhD programs, mentors, & professional societies urgently need community standards built on developmental frameworks, not ad hoc policies shaped by convenience.
Thanks to NSF for the support.
We welcome feedback from the community to refine these ideas!
#PhDLife #Bioinformatics #OpenScience
7/8 The practical guide 📋 (zenodo.org/records/18452319) provides task-specific protocols for computational data analysis, manuscript writing, literature review, and more. Designed to be adapted by institutions, programs, and labs for their specific training contexts.
6/8 The Perspective article 📄 (zenodo.org/records/18649847) presents the conceptual framework grounded in empirical evidence from learning science. It introduces principles like "expertise before augmentation" and explains why cognitive automation during training undermines development.
5/8 The solution is sequencing: Build foundational expertise FIRST through deliberate, feedback-driven practice. Then use AI to augment that expertise. The threshold isn't a fixed number of attempts; it's demonstrated independent mastery: complete tasks, explain reasoning, catch errors.
4/8 We introduce the "verification paradox": trainees can't meaningfully verify AI outputs because verification requires the very expertise they're still developing. Using AI before that expertise exists bypasses the developmental process while producing polished outputs that mask the gap.
3/8 The key insight: GenAI is categorically different from previous automation (calculators, statistical software, search engines). Those automated mechanical execution. GenAI automates cognition itself: reasoning, synthesis, judgment. This difference changes everything about when it should be used.
2/8 The debate has polarized between "adopt now or fall behind" vs "AI will destroy learning." Both miss the critical question: not WHETHER ❓ to use AI, but WHEN ⏳.
PhD programs worldwide face an urgent question: How should trainees use ChatGPT, Claude & similar tools?
Now online: two resources on thoughtfully integrating generative AI into research training.
📄 Conceptual framework: zenodo.org/records/18649847
📋 Practical guide: zenodo.org/records/18452319
🧵
From Arjun Krishnan @compbiologist.bsky.social
Expertise before augmentation: a practical guide to using generative AI during research training: zenodo.org/records/1845...
Build expertise first: why PhD training must sequence AI use after foundational skill development: zenodo.org/records/1864...
Happy to share this new, very intentional chapter. I have left UCLA after 14 years to join the University of Colorado Anschutz as Professor of Biomedical Informatics and Neurosurgery and the inaugural Marsico Chair in Excellence in Functional Precision Medicine.
news.cuanschutz.edu/dbmi/cu-ansc...
Our Fish EvoDevoGeno Lab @michiganstateu.bsky.social has its 10th anniversary today! 🐠🐟🧪🧬🔬
Thanks to all lab members - present & past, pictured or not - for making the last decade a success!
& thanks to our partners in crime of the @brainyfishguts.bsky.social Lab, too!
#EndlessFishMostBeautiful
4/4 This advanced short course formalizes instruction in these ideas. Its goals are to:
1) Discuss common misunderstandings & typical errors in the practice of statistical data analysis.
2) Provide a mental toolkit for critically thinking about statistical methods & results.
Feedback welcome 🙌🏼
3/4 As a result, most students piece together a mental model of acceptable, standard, or "best" practices in their field from shards of information gathered from mentors, peers, and published papers.
2/4 Statistical inquiry, data analysis, and visualization are immensely powerful, but many of the ideas underlying them are nuanced and unintuitive. Unfortunately, these ideas—and the skills needed to apply them to real problems and datasets—are rarely taught in statistics or data-analysis courses.
HMGP 7622 Rethinking Data Analysis — A researcher’s guide to avoiding missteps and misuse
Feb 3 – May 5, 2026 | Tue 2–3:30p

OVERVIEW
This is a short (1-credit) course designed to: 1) discuss common misunderstandings & typical errors in the practice of statistical data analysis, and 2) provide a mental toolkit for critically thinking about statistical methods and results.

TOPICS
Estimating error & uncertainty • Underpowered statistics • Multiple testing • P-hacking • Pseudoreplication • Regression to the mean • Double dipping • Spurious associations • Visualization challenges • Reproducibility & replicability

PREREQUISITES
1) Introductory knowledge of statistics & probability. 2) Introductory experience with data wrangling, analysis, & visualization using R/Python.

INSTRUCTOR
Arjun Krishnan, Associate Professor, Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus
arjun.krishnan@cuanschutz.edu | @compbiologist | thekrishnanlab.org
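(A toy illustration of one course topic, not actual course material.) The "multiple testing" problem from the topic list is easy to see by simulation: under the null hypothesis a p-value is uniform on [0, 1], so the chance of at least one "significant" result grows quickly as you run more tests. A minimal sketch:

```python
import random

def family_wise_error_rate(n_tests, n_trials=2000, alpha=0.05, seed=0):
    """Simulated probability of at least one p < alpha among
    n_tests null tests, modeling each null p-value as Uniform(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_trials

# Compare simulation with the analytic rate 1 - (1 - alpha)^n.
for n in (1, 10, 20):
    analytic = 1 - (1 - 0.05) ** n
    print(n, round(family_wise_error_rate(n), 2), round(analytic, 2))
```

With 20 independent null tests the analytic family-wise error rate is about 0.64 — run enough comparisons and a "discovery" is nearly guaranteed, which is exactly why corrections like Bonferroni exist.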
I'm looking forward to re-teaching:
Rethinking Data Analysis — A researcher’s guide to avoiding missteps and misuse
This is an advanced short course on developing a mental toolkit for rigorous practice & critical consumption of statistical data analyses. 🧵 1/4
A Perspective reviews computational methods for cross-species knowledge transfer.
www.nature.com/articles/s41...
10/10 Big thanks to NIH/NIGMS, NSF, &
@simonsfoundation.org for funding this work!
We welcome feedback from the community! 🙌
#Bioinformatics #TranslationalResearch #OpenScience
9/10 By embracing data-driven, evolution-agnostic approaches, we believe that the field can accelerate discoveries in both common and rare diseases, improving model organism selection and ultimately paving the way for more reliable therapeutic interventions.
8/10 Key future directions we highlight:
- Capturing specific facets of complex diseases
- Building networks for more species & contexts
- Automated ontology/knowledge graph construction
- Better benchmarking for cross-species single-cell methods
- Leveraging non-traditional research organisms
7/10 Kudos to resources like @geneontology.bsky.social , @monarchinitiative.bsky.social, @alliancegenome.bsky.social, & @bgee.org for grounding so much data & knowledge in this space in structured formats. These & many others are included in our catalog ☝🏽
6/10 We provide detailed resources to help computational & wet-lab researchers find, improve upon, and apply appropriate methods:
📊 Supp Table 1: Comprehensive catalog of methods (name, category, input/output, data types)
📚 Supp Table 2 & Note: Valuable datasets for cross-species work