Panmap: Scalable phylogeny-guided alignment, genotyping, and placement on pangenomes
www.biorxiv.org/content/10.6...
Posts by Sina Majidian
🧬 Join the VESS Organizing Committee! We are looking for early-career scientists to help shape our global, community-driven seminar program.
📅 Deadline to apply: April 30, 2026
ℹ️ Learn more and apply👇
forms.gle/vdLh3mu9Fuid...
#VESS #VariantEffects #Seminar #Genomics #EarlyCareer
ALL EvolDir subscribers - take action to continue receiving emails from us:
EvolDir is subject to different data protection regulations in the EU; ESEB must have your consent to send you EvolDir emails. Look out for an email soon to update your preferences on the content you would like to receive.
That’s a gem! I use these for my K99 and faculty job applications. Thanks to all contributors!
This is from 7 years ago (merenlab.org/2019/02/24/f...). We are talking about the same things today. We will be talking about the same things 7 years from now. There is no one to blame for this apart from ourselves. I find it very depressing.
Appeals: what, why, when, how
www.nature.com/articles/s41...
"By an appeal, we mean the formal process of reconsidering a manuscript that has been previously rejected by a journal, either before or after peer review."
Genomic LLMs in practice 🧬🤖
Train and apply transformer models to DNA sequences.
Tutorial organized by: Megha Hegde, Shashank Ravichandran, Jean-Christophe Nebel, Ragothaman Yennamalli and Farzana Rahman
Register here:
👉 eccb2026.org/registration...
#ECCB2026 #Genomics
Interested in predicting the dynamics of antibiotic resistance? Come work with us! We're looking for two postdocs to develop predictive models of resistance. We're interested in a range of approaches (mathematical & statistical modelling, causal inference, machine learning).
tinyurl.com/6c4y3jke
(1/2) Interested in plant evolution? We are opening two PhD positions in my lab at QGG - Aarhus University, where you will combine comparative genomics and machine learning to better understand and improve crop traits.
#PlantSciJobs #evolution #genomics
From DNA to discovery: what are the steps of HiFi sequencing? 🧬
This blog breaks down the 5-step PacBio workflow, covering everything from Nanobind sample prep to real-time data analysis on the Revio and Vega systems.
See how it all comes together: bit.ly/4vDRhGW
#PacBio #HiFisequencing #Genomics
Blog post: State of the lab 13, part of a yearly series tracking our lab's progress in academia. This year I am mostly reflecting on our slow adoption of DL methods, FOMO about new technologies, and anxiety about the use of AI for research. www.evocellnet.com/2026/04/stat...
NatureLM is a GPT-style generative model trained on a diverse range of data, including small molecule compounds, proteins, DNA, RNA, materials, and both general and scientific texts, amounting to a total of 143 billion tokens.
Fig. 2: The scaling effect in NatureLM is obvious. The chart depicts the overall ranking of models with varying sizes, where a better rank is represented by the “outsider” bar. The 8x7B model achieves top performance in 19 tasks, while the 8B model excels in 3 tasks. 18 categories exhibited performance improvements with increasing model size (i.e., 8x7B demonstrated the best performance, followed by 8B, and then 1B), highlighting the potential of large foundation models for scientific applications.
Nature Language Model: Deciphering the Language of Nature for Scientific Discovery
arxiv.org/abs/2502.07527
Design a heme-binding protein sequence.
⟨protein⟩MSAAEGAVVFSEEKEALVLK· · · ⟨/protein⟩
Generate a molecule with four hydrogen bond donors.
⟨mol⟩C(C[C@@H](C(=O)O)N)CN=C(N)N⟨/mol⟩
Join us for the Celebrating Women in Data Science and AI Symposium on April 14 from 10 a.m. to 4 p.m. Register now to hear from leading women in data science and AI through dynamic talks, panel discussions, and a poster session. ai.jhu.edu/event/celebr...
We began hosting a web page that indexes RECOMB papers and makes it easier to track their preprints and journal versions, at recomb.org/proceedings/. This page serves as a central directory for all RECOMB publications.
📢 MicroRNA folks:
We are looking for RNA-seq data comparing the transcriptomes of mutants in which one particular miRNA locus was inactivated (e.g. by a T-DNA insertion or by CRISPR) with their wild-type controls. Any hints?
Please, don't write cover letters with AI when applying to jobs. Selection committees see so many of them in a short time window and artificial constructions become too evident.
I prefer an honest cover letter with grammar errors, made with effort, rather than a fake one.
Flyer for the course "Introduction to Biodiversity Genomics". All information is also available here: https://biodiversitygenomicslatam.weebly.com/
In July, we will teach the 3rd biodiversity genomics course in Latin America, this time in Bogotá, following the COLEVOL meeting. We invite applications from students, postdocs and PIs from Latin America interested in learning how to analyse genomic data. biodiversitygenomicslatam.weebly.com
A run-length-compressed skiplist data structure for dynamic GBWTs supports time and space efficient pangenome operations over syncmers
doi.org/10.64898/202...
Myloasm, our long-read metagenome assembler, is now published! w/ @mgmarin.bsky.social and @lh3lh3.bsky.social
Very rewarding after > a year of development and countless hours thinking about assembly. Thanks to beta testers, Li lab, and reviewers who gave very helpful feedback.
rdcu.be/famFj
I am looking for short papers using phylogenetic trees to test alternative hypotheses regarding evolution that can easily be grasped by undergraduate students in biochemistry. Please RT and share your ideas.
Congratulations to 3rd-year undergrad Steven Tan (first author!!), @benlangmead.bsky.social, @mohsenzakeri.bsky.social, & @sinamajidian.bsky.social on their Best Paper Award at @acm-bcb.bsky.social 2025! 🏆
How much protein diversity can Life on Earth actually generate?
With DIAMOND DeepClust, we show how billions of proteins across the tree of life can be clustered at low-identity for downstream analytics tasks.
📚Paper: www.nature.com/articles/s41...
💻Code: github.com/bbuchfink/di...
Fig. 1 | A visual overview of ANNEVO’s architecture. a, Context extension component: this panel illustrates how ANNEVO tackles the challenge of insufficient context at the edges of sequence segments. The genome is divided into consecutive core regions using a sliding window. Each core region is extended by flanking sequences on both sides, providing additional context for the model. To ensure robustness during training, a soft masking strategy is applied to both the flanking regions and preidentified erroneous regions, preventing these regions from disproportionately influencing the training process. b, Neural network component: this module enables end-to-end position-wise predictions by modeling both long-range interactions within sequences and multiple sublineages across a diverse set of species. c, Gene structure decoding component: this module defines the gene structure states of eukaryotic species and reconstructs biologically valid gene structures by applying soft connection and decoding algorithms to the position-wise prediction of each segment.
Fig. 3 | Benchmarking against both evidence-assisted annotation pipelines and deep learning methods on model species. a, Nucleotide-level performance on Sus scrofa. ANNEVO achieved optimal F1 scores (0.934), outperforming BRAKER3 (0.741) in integrative metrics of completeness and false-positive rate.
Extended Data Fig. 1 | Detailed model architecture of ANNEVO’s neural network component. a, Distal Information Modeling Module. This module extracts local sequence patterns using five consecutive ConvBlocks and learns long-range dependencies through positional encoding and Transformer encoder layers. The parameters are as follows: C = 64, H = 8, D = 768. b, Joint Evolutionary Modeling Module. c, Resolution Restorer Module. The Resolution Restorer Module serves as the inverse process to the ConvBlocks, designed to reconstruct the feature vector back to nucleotide resolution. d, Detailed Architecture of Network Blocks. The ConvBlocks progressively increase the number of channels, with the convolutional layer channels expanding from C to 5C.
Highly accurate ab initio gene annotation with ANNEVO
Nature Methods (2026)
www.nature.com/articles/s41...
oops thanks!
If you're a senior PhD student or recent graduate interested in doing work at the intersection of AI and microbial genomics and/or microbiomes, please feel free to reach out to discuss the possibility of applying together for an MIT Novo Nordisk Fellowship engineering.mit.edu/novo-nordisk
I’ll be presenting on Computational Comparative Genomics at the Data-Driven Systems Biology event in Stockholm on Tuesday, 24 March.
Looking forward to connecting with new colleagues at SciLifeLab!
Event webpage: www.scilifelab.se/event/data-d...
Can ever-increasing sequence databases improve phylogenetic reconstruction of a gene family? Our new preprint introduces AmpliPhy, a pipeline that automates homolog enrichment to improve gene tree inference, built on a robust phylogenomic benchmark scheme. 🧵1/n
📃 doi.org/10.64898/2026.01.26.701724
Postdoc and PhD positions in Computational Genomics / Algorithmic Bioinformatics
I am currently recruiting for both:
🔹 Postdoc position
su.varbi.com/what:job/job...
🔹 PhD position
su.varbi.com/en/what:job/...
Please share with anyone who might be interested!
Summaries of variant characteristics from the whole cohort. (A) Comparison of overall variant counts, separated by variant type (SNV, indel, or mixed, i.e., both) between variant calls from the different assemblies. (B) Overall numbers of variants by predicted impact in the whole call set as determined by SnpEff for each assembly. Variants with impact rating MODIFIER were excluded. (C) Distribution of allele frequencies in the three assemblies. (D) Fraction of each variant type by allele frequency using T2T-CHM13. Frequencies were rounded to the closest multiple of 0.001 (1/1000).
T2T-CHM13 improves read mapping and detection of clinically relevant genetic variation in the Swedish population
genome.cshlp.org/content/35/1...
EvolCompGen presents: Christian Landry on the Evolution of gene duplication, from redundancy to dependency
@christianlandry.bsky.social
🗓️ Join us on Monday, March 23 at 11:00 AM EDT
📌 Register here: us02web.zoom.us/webinar/regi...