Metabolic fingerprinting of 17 Brassicaceae species across three tissues www.biorxiv.org/content/10.64898/2026.04...
Posts by Zhigui Bao 鲍志贵
Getting close to a robust pipeline for ARG inference for messy genomes.
1. get a vcf. Align short reads to a reference and GATK and pray, or use whole genome assemblies and github.com/baoxingsong/... followed by github.com/RILAB/argprep
In Arabidopsis, DeepVariant achieved a good balance between FP and FN when we used AnchorWave alignments as ground truth. It’s more conservative, with far fewer FPs (far fewer false heterozygous sites than GATK), and much faster.
Excited to see our work out in Science today! Using machine learning to identify prokaryotic immune systems www.science.org/doi/10.1126/...
Our new experimental evolution study across 30+ locations using the plant Arabidopsis thaliana: we directly "see" adaptation and extinction in different climates at the genetic level as it happens!
Read it in Science
dx.doi.org/10.1126/scie...
@ucberkeleyofficial.bsky.social
@hhmi-science.bsky.social
Now out!
We show that TEs can be horizontally transferred between fungal species via Starships. Once transferred, these TEs can become active, changing the genome organization and affecting the lifestyle of the recipient fungus.
www.nature.com/articles/s41...
@oggenfussursula.bsky.social #TEsky
How much protein diversity can Life on Earth actually generate?
With DIAMOND DeepClust, we show how billions of proteins across the tree of life can be clustered at low identity for downstream analytics tasks.
📚Paper: www.nature.com/articles/s41...
💻Code: github.com/bbuchfink/di...
LongcallD: joint calling and phasing of small, structural and mosaic variants from long reads www.biorxiv.org/content/10.64898/2026.03...
Tandem and tandem
Excited to share my first preprint from my PhD w/ @justinmcrocker.bsky.social. We show that cell type-specific regulatory dominance promotes robustness and evolutionary innovation through interallelic transcriptional hubs, potentially expanding the mutational paths available to diploids. (1/18)
We only put the SL4->SL5 gene table on the website. Hope this is not too late: keeper.mpdl.mpg.de/d/05b268b3ee... The chain was generated from a wfmash alignment and converted with wgatools.
1/11 🔥 New preprint alert 🔥
We wanted to know what plants in the wild really care about. So we asked them 🎤.
Here is what we learned: “Biotic-response networks are an important organizer of the transcriptome in wild Arabidopsis thaliana populations”
www.biorxiv.org/content/10.6...
@pratikkatte.bsky.social and I just released Lorax 🌲, a tool for interactive exploration of biobank-scale ancestral recombination graphs (ARGs).
If you’ve ever wanted to scroll across the ancestries of thousands of genomes… this is for you.
Recently we've been working with SNPs from whole genome assemblies to estimate ARGs. It's a pain to go from alignment files to vcf while keeping track of masked and invariant sites, so we wrote a snakemake/SLURM pipeline. Hope it's useful to others, and don't hesitate to post issues if there are problems!
Molecular and phenotypic footprints of climate in native Arabidopsis thaliana www.biorxiv.org/content/10.64898/2026.03...
We are very excited to share a new resource from our team: spatial subcellular proteome maps in plants! We developed an MS-based method that registers localizations of about 8000 proteins in Arabidopsis roots in a single experiment.
(1/9)
www.biorxiv.org/cgi/content/...
“Aneuploidy was identified in inbreeding and outbreeding populations of cultivated potato, with frequencies ranging from 14.8 to 24.0%, indicating notable genomic instability.” Plant genomes tolerate instability—more akin to human tumor evolution than stability.
www.science.org/doi/10.1126/...
New paper alert: Paralog interference contributes to the preservation of genetic redundancy www.cell.com/current-biol...
Heya science peeps, my first first-author paper is on bioRxiv! We show how transcriptome-wide expression variability in outbred animals responds massively to an environmental stressor and is underpinned by cryptic variability- (not just mean-) controlling alleles. www.biorxiv.org/content/10.6...
This article is now published! academic.oup.com/nargab/artic...
We’ve added a few new analyses. First off, we show that, while gene presence absence variation (PAV) scales with evolutionary distance in both plants and animals, the base level and rate of accrual are both twice as high in plants.
Wow! This is consistent with our observation: Helixer learns really well from the reference annotation, but when genes are not present in the reference, annotation is harder. It does quite a good job for core genes, though. Another tool, ANNEVO, also has a similar issue with small introns/exons.
I had intended to post something about this new Google DeepMind paper that appeared yesterday in Nature, but the press coverage has added to what there is to say. So this is a long 🧵
www.nature.com/articles/s41...
I am looking for a postdoc to develop high-performance algorithms in computational genomics. Email or DM me if interested. For more information, see hlilab.github.io/vacancies. RTs appreciated!
Just sharing a new tool written by Vincent Ranwez to view and manipulate sequences and alignments directly in your terminal
github.com/ranwez-searc...
Pretty convenient!
Characterising the detectable and invisible fractions of genomic loci under balancing selection www.biorxiv.org/content/10.64898/2026.01...
The (Yoav) Voichek lab has opened its gates at the Weizmann Institute, and is actively recruiting students and researchers at all levels - come explore gene regulation and computational genomics in a fun, friendly sprouting lab 🤗🥼⚗️🧪
www.weizmann.ac.il/plants/voichek
Novel genes arise from genomic deletions across the bacterial tree of life www.biorxiv.org/content/10.6... 🧬🖥️🧪 github.com/aryakaul/pre...