
Posts by Ollie Liu

Thanks to my amazing collaborators: @samsja19.bsky.social, Johannes Hagemann, @shangshang-wang.bsky.social, Jason Wiemels, Jeff Kaufman, and @willieneis.bsky.social.
Special shout-out to the Nucleic Acid Observatory for the sequencing data, and @PrimeIntellect for compute support.

1 year ago 1 0 0 0

We’re sharing METAGENE-1’s:
📄Paper: metagene.ai/metagene-1-p...
🌐Website: metagene.ai
🤗Model weights: huggingface.co/metagene-ai
🧵7/

1 year ago 5 3 1 0

🛡Tailored for detection, not design. We scoped METAGENE-1 to minimize risks while maximizing potential for public health and biosurveillance. Responsible open-sourcing matters. With open weights, we aim to drive progress in interpretability and safe genomics research.
🧵6/

1 year ago 3 0 1 0

📈METAGENE-1 achieves state-of-the-art results in:
- Pathogen detection
- Genomic embedding benchmarks
- Generalization to multi-species tasks
It already shows promise in public health and biosurveillance, and we are collaborating with experts to unlock its full impact.
🧵5/

1 year ago 5 0 1 1

The METAGENE-1 model is a 7B-parameter Llama-style transformer 🦙, pretrained and optimized for anomaly detection, embedding, and multi-species genomics. Fully compatible with 🤗Hugging Face (huggingface.co/metagene-ai) – ready to use like any of your favorite LLMs!
🧵4/

1 year ago 2 0 1 0

📊The data behind METAGENE-1:
- Brand-new dataset collected with experts from Southern California & Missouri
- 1.5 trillion base pairs from diverse wastewater samples
- Short reads (100–300 bp), deep sequencing at scale
- Byte-Pair Encoding customized for genomic sequences
🧵3/
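
For readers curious what Byte-Pair Encoding on genomic sequences looks like, here is a toy sketch. It is not the METAGENE-1 tokenizer: the function names (`learn_bpe_merges`, `apply_merge`, `tokenize`), the example reads, and the tie-breaking rule are all illustrative assumptions. It starts from single nucleotides and repeatedly merges the most frequent adjacent pair.

```python
from collections import Counter


def apply_merge(tokens, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe_merges(reads, num_merges):
    """Learn up to `num_merges` merge rules from nucleotide reads (toy version)."""
    corpus = [list(read) for read in reads]  # start from single-nucleotide tokens
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for tokens in corpus:
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            break
        # Pick the most frequent pair; break ties alphabetically (an assumption,
        # chosen only to keep this sketch deterministic).
        best = min(pair_counts, key=lambda p: (-pair_counts[p], p))
        merges.append(best)
        corpus = [apply_merge(tokens, best) for tokens in corpus]
    return merges


def tokenize(read, merges):
    """Tokenize a read by applying the learned merges in order."""
    tokens = list(read)
    for pair in merges:
        tokens = apply_merge(tokens, pair)
    return tokens


# Hypothetical short reads, for illustration only.
example_reads = ["ACGTACGT", "ACGTTT"]
example_merges = learn_bpe_merges(example_reads, num_merges=2)
print(example_merges)
print(tokenize("ACGT", example_merges))
```

A real tokenizer would be trained on far more reads and with a much larger vocabulary; the sketch only shows the merge loop itself.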

1 year ago 2 1 1 0

Why is METAGENE-1 special? 🤔 We trained it on wastewater metagenomics, capturing the human-adjacent microbiome across the US over the past 12 months. This unlocks powerful capabilities for early pathogen detection and for understanding microbial ecosystems. 🌱🦠
🌐Website: metagene.ai
🧵2/

1 year ago 2 0 1 0

Introducing METAGENE-1🧬, an open-source 7B-parameter metagenomics foundation model pretrained on 1.5 trillion base pairs. Built for pandemic monitoring, pathogen detection, and biosurveillance, with SOTA results across many genomics tasks.
🧵1/

1 year ago 27 6 2 0

Landed in Vancouver to attend #NeurIPS :-) Excited to chat about multimodal models, AI4Science, decision making, and more!

1 year ago 15 0 0 0

Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models at similar GPU RAM usage and token throughput.

SmolVLM can be fine-tuned on Google Colab and run on a laptop! Or process millions of documents with a consumer GPU!

1 year ago 104 22 4 4

👋 nlp@usc student. thanks!

1 year ago 3 0 0 0
Post image

tfw you realize that this isn't an alt twitter for academic posting but an alt insta for cute doggos.

this is doodle, our border collie pup that is often used as an adversarial attack on image classification models (they classify him as a corgi :-)

1 year ago 13 0 1 0

yes please if there's still space left :-P

1 year ago 1 0 0 0
Post image

our border collie pup doodle absolutely wants nothing from that plate of banana :-P

1 year ago 5 0 1 0