
Posts by Karel Břinda

How diverse is bacterial immunity?

We report in @science.org how language models allowed us to predict 2.4M antiphage proteins spanning >23K novel potential systems.
👏 @emordret.bsky.social, @alexhv.bsky.social & al doi.org/10.1126/scie...

Explore them here defensefinder.mdmlab.fr/wiki/refseq_...

2 weeks ago 227 112 10 3
GitHub - samhorsfield96/ggCallaroo: A snakelike pipeline combining ggCaller and Panaroo.

ggCallaroo v0.1.0 is now out! This snakemake pipeline predicts, clusters and annotates bacterial genes using ggCaller, Panaroo and Bakta. It generates Panaroo files with functional annotations already integrated, which can then be used with the usual downstream tools. github.com/samhorsfield...

6 days ago 20 6 0 0
Jobs Working with us

Two new bioinformatics internships available in @johnlees.bacpop.org group at EMBL-EBI: 1) testing and developing ML methods for identification of bacterial promoter regions; 2) Applying innovations in protein structure prediction to search massive datasets. Apply here: www.bacpop.org/jobs/

2 weeks ago 1 4 0 0

Myloasm, our long-read metagenome assembler, is now published! w/ @mgmarin.bsky.social and @lh3lh3.bsky.social

Very rewarding after > a year of development and countless hours thinking about assembly. Thanks to beta testers, Li lab, and reviewers who gave very helpful feedback.

rdcu.be/famFj

3 weeks ago 97 56 4 1
GitHub - bacpop/ggCaller: Bifrost graph gene caller.

ggCaller v1.5.0 is out! We've removed the integrated clustering to enable users to benefit from new Panaroo features. Now, ggCaller generates GFFs that can be used with any clustering method. But for fans of an integrated ggCaller pangenome workflow read on...

github.com/bacpop/ggCal...

2 weeks ago 21 12 1 0
Vibe Coding, QWERTY, and US Healthcare - or: The Future of Software Engineering? - panthema.net

For those writing code with agents, this *excellent* article by Timo Bingmann (who wrote COBS, for kmer geeks) is super interesting on how one can conceptualise it in terms of dependencies, and how it affects development. V fun analogies (QWERTY, cooking, money)
panthema.net/2026/0318-Vi...

4 weeks ago 13 5 0 0

A really fascinating read – with ideas underlying so many current topics across different subdomains of bioinformatics.

1 month ago 2 0 0 0
Karel Břinda on research at Harvard and working with bacteria as if they were books Karel Břinda sheds light on how curiosity, mobility, and interdisciplinarity can shape a modern researcher’s path in a world where science increasingly transcends borders.

I was pleased to give an interview to @radiopraguefr.bsky.social about my academic journey across Czechia, France, and the United States, and about my research. english.radio.cz/karel-brinda...

1 month ago 12 1 0 0
BBC Radio 4 - More or Less, Has a company really discovered a million new species? Investigating whether Basecamp Research found hundreds of thousands of bacteria species

I got the chance to feature on this week’s BBC More or Less podcast with the excellent Tom Colls, talking about how scientists count life on Earth, specifically the microbes.
Have a listen: www.bbc.co.uk/programmes/p...

1 month ago 9 2 0 1

This is really interesting! I wonder how far this extends beyond isolates. MAG dynamics seems to be the (hidden) game changer with a diff regime: 1) many genomes/strains/... per single sample, 2) growing nb of MAG reconstructions per each sample, 3) many sci communities moving from isolates to mtgs.

1 month ago 3 0 0 0

I should have said, this takes us to about 2.8 million genomes in total. We don't have annotations, etc for the latest data yet, this will be an ongoing process

1 month ago 14 7 1 0
Overview — AllTheBacteria documentation

Courtesy of @martibartfast.bsky.social, we have a new release of AllTheBacteria which adds another 322,920 assemblies, covering all ENA (Illumina, isolate) prokaryotes to May 2025.
allthebacteria.readthedocs.io/en/latest/ov...

1 month ago 65 29 0 3

How would you design a *multithreaded*, *concurrent* & *dynamic* hash table if you are focused specifically on common k-mer workloads, where streaming query & insertion are common? Jamshed, Prashant and I explore this in kache-hash, a cache-friendly k-mer hash table!
www.biorxiv.org/content/10.6...

2 months ago 20 13 0 0

🧵 New preprint! Our 4-lab team evolved Streptococcus pneumoniae in antibiotic-treated mice of varying immune states and discovered something surprising: bacteria rarely evolved resistance. Instead, they found a different way to survive — by rewiring RNA turnover.
🔗 www.biorxiv.org/content/10.6...

2 months ago 96 56 4 1
AlphaFold Database welcomes community datasets Latest AlphaFold Database update adds high-value datasets for microbial and viral proteins, generated by specialist communities

Delighted to see over 17 million new protein structure predictions from novel proteins in AllTheBacteria are now integrated into the AlphaFold Database at @ebi.embl.org !
Huge work from @gbouras13.bsky.social @oschwengers.bsky.social and friends to generate these.

www.ebi.ac.uk/about/news/u...

2 months ago 97 26 1 2

He may have only barely known about bacteria, and not at all about viruses, but Darwin was right about hating an ill-defined species concept

2 months ago 40 4 1 2

What's the best place to look up current estimates of how many truncated/non-functional genes each of us has? There was a paper from @dgmacarthur.bsky.social and co around 2014 that had an estimate from the 1000 Genomes Project (around 40 per person?), but I guess we have better estimates now.

2 months ago 5 5 2 0

We're also happy to see a second paper out today, led by Nicola de Maio, which develops methods to identify and account for mutation rate variation and recurrent errors.

www.nature.com/articles/s41...

2 months ago 13 5 2 0

At long last, my final PhD chapter is out: we developed a novel evolutionary simulator of bacterial pangenomes, Pansim, fitting it to data from >600K genomes using a likelihood-free framework, PopPUNK-mod, to explore neutral and adaptive pangenome dynamics www.biorxiv.org/content/10.6...

2 months ago 45 18 2 1

How do bacterial pangenomes evolve, what controls their dynamics, why do they exist?
Fitting a mechanistic model to 450 species from allthebacteria.org suggests that fast vs slow gene exchange (i.e. amount of MGEs) is a major differentiating factor, correlated with phylogeny rather than lifestyle

2 months ago 71 32 1 0

L3?

2 months ago 1 0 1 0

A comprehensive survey of genome language models in #bioinformatics academic.oup.com/bib/article/... 🧬🖥️🧪

2 months ago 12 3 0 0
HLi Lab - Vacancies Openings

I am looking for a postdoc to develop high-performance algorithms in computational genomics. Email or DM me if interested. For more information, see hlilab.github.io/vacancies. RTs appreciated!

3 months ago 44 64 1 0
The evolution of mathematical software | Communications of the ACM Tracing how software and algorithms follow the hardware.

Just came across the 2021 Turing Lecture. Has a lot of nice observations regarding the increasing gap between compute and memory bandwidth. It advocates "communication avoiding" algorithms and notes how algorithms can only be future proof if they scale with threads.

dl.acm.org/doi/10.1145/...

3 months ago 9 3 1 0

But it has a mathematically well-defined center (a real point) – unlike all other cities.

3 months ago 2 0 0 0

Congratulations @baym.lol, @brinda.eu and colleagues on the nice work, looks like a great way to identify deletions and deletion-induced fusion genes.
In MTBC, genomic deletions called "regions of difference" have long been used for phylogenetic investigation. Yet I found no citations thereof.

3 months ago 8 2 2 0
GitHub - baymlab/deletion-born-fusion-manuscript

💻 github.com/baymlab/deletion-born-fusion-manuscript
🔧 github.com/aryakaul/prefixsuffix-kmer
Many thanks to co-authors @fernpizza.bsky.social, @brinda.eu & @baym.lol + GenScale/Baym lab! Funded by NIH, Packard, Pew, Sloan & a Chateaubriand Fellowship!

3 months ago 6 2 1 0

🎉 New year, NEW PREPRINT!

Bacteria exhibit astonishing genetic diversity, but where do new genes come from?

My best friend Arya Kaul (/labmate in the @baym lab) investigates how advantageous deletions can spawn new genes - "deletion-born fusions." 🧵:

3 months ago 49 30 1 2

New preprint from my lab (with Arya Kaul, @fernpizza.bsky.social, and @brinda.eu), in which we explore new genes hitchhiking on the beneficial deletion that fused them together, and find them in the LTEE, M. Tb/bovis, and across the bacterial tree of life

3 months ago 87 36 5 3

We're organising a microbes & deep learning session at SMBE next year -- looking forward to seeing your abstracts!

4 months ago 14 8 0 0