Posts by Bo Wang

🔥 Unveiling the Future of Genomics with Genome Language Models (gLMs)! 🔥
Our comprehensive review, "Transformers and genome language models," is finally published in Nature Machine Intelligence!
Link: nature.com/articles/s42...

Key Highlights:
🔬 The Challenges Addressed by gLMs: gLMs tackle the intricate task of interpreting vast genomic sequences, enabling predictions about gene regulation, variant effects, and more.
🧠 Transformers in Genomics: Discover how transformer architectures, renowned for their success in natural language processing, are adept at capturing long-range dependencies in genomic data, leading to more accurate models.
🧬 Beyond Transformers, Introducing HyenaDNA: Explore innovative architectures like HyenaDNA, which offer efficient long-range genomic sequence modeling at single-nucleotide resolution, pushing the boundaries of genomic research.
📊 Comparative Analysis of Models: We delve into the evolution from sequence-to-function models like DeepSEA and Enformer to sequence-to-sequence models such as DNABERT and Evo, highlighting their respective strengths and applications.
⚡ Strengths, Limitations, & Future Directions: Gain insights into the current capabilities of genomic AI, its limitations, and the promising avenues for future research and application.

This pivotal work is the result of a collaborative effort led by Micaela E. Consens, with contributions from Cameron Dufault, Michael Wainberg, Duncan Forster, Mehran Karimzadeh, Hani Goodarzi, Fabian J. Theis, and Alan Moses.
@uhnresearch.bsky.social @vectorinstitute.ai @uoft.bsky.social
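The tokenization difference between these model families is easy to show concretely: DNABERT represents DNA as overlapping k-mers, while HyenaDNA reads one token per nucleotide. A minimal sketch of the k-mer scheme (the function name and example sequence are mine):

```python
def kmer_tokenize(seq: str, k: int = 6) -> list[str]:
    """Split a DNA sequence into overlapping k-mers (stride 1),
    the DNABERT-style tokenization. HyenaDNA's single-nucleotide
    tokenization is simply the k = 1 case."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# A 10-bp sequence yields 10 - 6 + 1 = 5 overlapping 6-mers.
print(kmer_tokenize("ACGTACGTAC"))
# ['ACGTAC', 'CGTACG', 'GTACGT', 'TACGTA', 'ACGTAC']
```

Overlapping tokens blow up sequence length and redundancy, which is part of why single-nucleotide architectures like HyenaDNA scale better to long genomic contexts.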
How can generative AI and Robotics help advance drug discovery?
🚀 Excited to introduce LUMI-lab!
A foundation model-driven Self-Driving Lab (SDL) for autonomous ionizable lipid discovery in mRNA delivery 🤖🧬

🔬 What is LUMI-lab?
LUMI-lab integrates molecular foundation models with autonomous robotic experiments to efficiently explore new LNPs (lipid nanoparticles, the delivery vehicles for mRNA) with minimal wet-lab data.

🔥 Key Highlights:
- Foundation model trained on 28M molecules using a three-step strategy:
  - Unsupervised pretraining to capture broad molecular knowledge
  - Continual pretraining to specialize in lipid-like molecules
  - Active learning fine-tuning within a closed-loop experimental system

🌟 Why it matters?
LNPs are the backbone of mRNA therapeutics, yet discovery has been slow due to data scarcity. LUMI-lab shows that AI-powered autonomous labs can accelerate mRNA delivery innovation 🚀💡
- 1,700+ new LNPs synthesized & tested across 10 iterative cycles
- Brominated lipids autonomously identified as a novel structural feature that enhances mRNA transfection, an insight previously unrecognized in LNP design
- 20.3% in vivo CRISPR gene-editing efficiency in lung epithelial cells

🌟 Beyond mRNA drugs, LUMI-lab exemplifies a scalable framework for AI-driven molecular discovery, pushing boundaries in materials science & drug delivery.
📄 Read the preprint: 👉 biorxiv.org/content/10.1...
💻 Code available on GitHub: 👉 github.com/bowenli-lab/...

🙏 A huge team effort behind this work, with special appreciation to the BowenLi Lab for driving the project. Kudos to Haotian Cui, Yue Xu, Kuan Pang, Gen Li, and Fanglin Gong!
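The closed-loop recipe in the highlights (train a surrogate, rank candidates, test only the top picks, retrain) can be sketched in a few lines. Everything below is a toy stand-in: the 1-D "molecule", the nearest-neighbor surrogate, and the oracle are invented for illustration and are not LUMI-lab's actual components.

```python
import random

random.seed(0)

def oracle(x: float) -> float:
    """Wet-lab experiment stand-in: the true (unknown) transfection score."""
    return -(x - 0.7) ** 2

class SurrogateModel:
    """Pretend foundation model: memorizes labels, predicts by nearest neighbor."""
    def __init__(self):
        self.data: list[tuple[float, float]] = []
    def fit(self, xs, ys):
        self.data = list(zip(xs, ys))
    def predict(self, x: float) -> float:
        if not self.data:
            return 0.0
        nearest = min(self.data, key=lambda d: abs(d[0] - x))
        return nearest[1]

# Closed loop: propose candidates -> predict -> select top picks ->
# "synthesize & test" -> retrain, repeated over several cycles.
model, labeled_x, labeled_y = SurrogateModel(), [], []
for cycle in range(10):
    candidates = [random.random() for _ in range(50)]
    ranked = sorted(candidates, key=model.predict, reverse=True)
    batch = ranked[:5]                      # only the most promising are tested
    labeled_x += batch
    labeled_y += [oracle(x) for x in batch]
    model.fit(labeled_x, labeled_y)

best = max(labeled_x, key=oracle)
print(f"best candidate after 10 cycles: {best:.2f}")  # should approach 0.7
```

Each cycle spends experiments only on the candidates the surrogate currently ranks highest, which is what lets an SDL get away with far fewer wet-lab runs than a brute-force screen.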
Agentic AI Meets Medicine!!!
🔬 Excited to announce MedRAX: a groundbreaking Medical Reasoning Agent for Chest X-ray interpretation, now on arXiv!
Paper: https://arxiv.org/abs/2502.02673
Code: github.com/bowang-lab/M...

What is MedRAX?
MedRAX is the first versatile AI agent that seamlessly integrates state-of-the-art chest X-ray analysis tools and multimodal large language models into a unified framework, enabling dynamic reasoning for complex medical queries without additional training.

🎯 Why MedRAX?
While specialized AI models excel at specific chest X-ray tasks, they often operate in isolation. Medical professionals need a unified, reliable system that can handle complex queries while maintaining accuracy. MedRAX bridges this gap!

💡 Key Features:
- Unified Framework: Seamlessly integrates specialized medical tools with multimodal large language model reasoning.
- Dynamic Orchestration: Intelligent tool selection and coordination for complex queries.
- Clinical Focus: Designed for real-world medical workflows and deployment.

🛠️ Integrated Tools:
- Visual QA: CheXagent & LLaVA-Med
- Segmentation: MedSAM & ChestX-Det
- Report Generation: CheXpert Plus
- Classification: TorchXRayVision
- Grounding: Maira-2
- Synthetic Data: RoentGen

🚀 Introducing ChestAgentBench:
We're also releasing ChestAgentBench, a comprehensive medical agent benchmark built from 675 expert-curated clinical cases, featuring 2,500 complex medical queries across 7 categories.
Check it out: huggingface.co/datasets/wan...

📈 Results speak for themselves:
- 63.1% accuracy on ChestAgentBench
- State-of-the-art performance on CheXbench
- Outperforms both general-purpose and specialized medical models

🙏 Huge shoutout to Adibvafa, Jun, Alif, and Hongwei for their exceptional work on this project!
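The "dynamic orchestration" pattern (a planner choosing among registered tools per query, then returning the tool's result) can be sketched as below. The tool outputs and the keyword router here are placeholders; MedRAX uses a multimodal LLM, not keyword matching, to make this choice.

```python
from typing import Callable

# Registered tools, named after the ones listed in the post.
# Their outputs are canned strings purely for illustration.
TOOLS: dict[str, Callable[[str], str]] = {
    "segmentation": lambda q: "MedSAM: lung fields segmented",
    "classification": lambda q: "TorchXRayVision: cardiomegaly probability 0.81",
    "report": lambda q: "CheXpert Plus: drafted findings section",
}

def route(query: str) -> str:
    """Stand-in for LLM tool selection: keyword match instead of reasoning."""
    if "segment" in query.lower():
        return "segmentation"
    if "report" in query.lower():
        return "report"
    return "classification"

def answer(query: str) -> str:
    tool = route(query)
    return f"[{tool}] {TOOLS[tool](query)}"

print(answer("Segment the lungs in this chest X-ray"))
# [segmentation] MedSAM: lung fields segmented
```

The key property, as in MedRAX, is that adding a tool only means registering it; the planner and the tools stay decoupled, so no retraining is needed.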
🚀 Introducing scGPT-spatial! 🧬
A game-changing spatial-omic foundation model, built on the powerful scGPT framework with MoE (mixture of experts) and continually pretrained on a massive 30 million spatial single-cell profiles!

🧠 What's the challenge?
Spatial transcriptomics is next-level complex: not only must we model single-cell/spot profiles, but we also need to capture intricate spatial relationships while handling diverse sequencing protocols (imaging-based vs. sequencing-based).

🔥 Why scGPT-spatial?
✨ A Spatial-omic Foundation Model with Continual Pretraining: Built on scGPT's robust initialization, it unlocks spatial context in tissues.
✨ SpatialHuman30M Dataset: The largest curated dataset: 30M profiles from Visium, Visium HD, Xenium, and MERFISH across 821 slides.
✨ Multi-Modal & Multi-Slide Integration: Seamless clustering & spatial-domain identification across slides and modalities.
✨ Cell-Type Deconvolution & Gene Imputation: Unlocks cross-resolution & cross-modality harmonization with fine-tuned embeddings.
✨ Revolutionary MoE Decoders: A cutting-edge Mixture-of-Experts (MoE) architecture for protocol-aware gene-expression decoding.
✨ Spatially-Aware Training Strategy: A neighborhood-based masked-reconstruction approach to capture complex cell-type colocalization.

📄 Read the preprint: biorxiv.org/content/10.1...
💻 Explore the code/weights: github.com/bowang-lab/s...
#SpatialTranscriptomics #SingleCell #AIResearch #MachineLearning #SpatialData

Huge shoutout to the incredible PhD students Chloe Wang and Haotian Cui for leading this groundbreaking project! 🎉
Massive thanks to our amazing co-authors Andrew, Ronald, and Hani (@genophoria.bsky.social) from @arcinstitute.org; this work wouldn't have been possible without you! 🙏
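The neighborhood-based masked-reconstruction idea mentioned above can be illustrated with a toy: hide one cell's expression and predict it from its nearest spatial neighbors. The real model reconstructs through transformer embeddings; this sketch just averages neighbors, and all coordinates and expression values here are made up.

```python
import math

# cell id -> ((x, y) position, expression vector). Toy data only.
cells = {
    "a": ((0.0, 0.0), [5.0, 1.0]),
    "b": ((0.1, 0.0), [4.0, 1.2]),
    "c": ((0.0, 0.1), [6.0, 0.8]),
    "d": ((5.0, 5.0), [0.5, 9.0]),   # distant cell from a different niche
}

def reconstruct(masked: str, k: int = 2) -> list[float]:
    """Predict the masked cell's expression as the mean of its
    k spatially nearest neighbors (a crude stand-in for the model)."""
    (mx, my), _ = cells[masked]
    others = [cid for cid in cells if cid != masked]
    others.sort(key=lambda cid: math.dist((mx, my), cells[cid][0]))
    neighbors = [cells[cid][1] for cid in others[:k]]
    return [sum(vals) / k for vals in zip(*neighbors)]

# Cell "a" is reconstructed from its close neighbors b and c,
# while the far-away cell d is ignored.
print(reconstruct("a"))
```

Masking and reconstructing from the spatial neighborhood forces the model to learn which cell types co-locate, which is exactly the colocalization signal the training strategy targets.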
[Image: a robot hand trying to play snooker]
Our Jan issue is live! nature.com/natmachintell with an article (Yejin Choi et al) and N&V commentary (Molly Crockett) on Delphi, designed to investigate AI moral reasoning. Also read about IntegrateAnyOmics by @bowang87.bsky.social, an unsupervised platform to tackle incomplete multi-omics data.
🔗 Learn more:
bioRxiv: biorxiv.org/content/10.1...
Code: github.com/bowang-lab/M...

🎉 Led by the amazing PhD student Navidi Zeinab, co-supervised with Benjamin Haibe-Kains. Big thanks to Jun Ma, Esteban Miglietta, Le Liu, and Anne Carpenter & Beth Cimini for their invaluable contributions!