Posts by Frederick "Erick" Matsen

Post image

Looking forward to the next AIRR immune repertoire meeting tinyurl.com/airrcmeeting8 . Hope to see you there!

(Note there is a hybrid option too.)

1 month ago 4 1 0 0
Spec-Driven Development with spec-kit: A walkthrough of using GitHub's spec-kit for spec-driven development on a bioinformatics pipeline.

Spec-kit points to the future of software development. New post: when I use it, how it's changed my dev flow overall, and a complete walkthrough building a bioinformatics pipeline from scratch. matsen.group/general/202...

1 month ago 8 0 0 0

We had top-notch collaborators. Tagging the ones I can find on bsky: @vmminin.bsky.social @yun-s-song.bsky.social @jbloomlab.bsky.social @tylerstarr.bsky.social @psathyrella.bsky.social @thearmita.bsky.social

4 months ago 1 0 0 0
Replaying evolution to learn about the fitness landscape of affinity maturation: A five-year collaboration with the Victora lab is bearing fruit for evolutionary biology.

Over the past 5+ years I've had the honor of working with @wsdewitt.github.io @victora.bsky.social and many others on a project to "replay" affinity maturation evolution from a fixed starting point.

matsen.group/general/2025...

4 months ago 30 18 2 1
Agentic Coding for Scientists (Dec 2025 edition). YouTube video by Erick Matsen

Thanks to everyone who attended and asked questions in the livestream (www.youtube.com/watch?v=Rbhs...). I've added it to the blog series description at matsen.group/agentic.html

I'm going to stop barking about AI for a while now. The next blog post will be about B cells!

4 months ago 2 0 0 0

I'll be livestreaming in 24 hours. Hope to see you there!

4 months ago 4 0 0 0
GitHub - matsengrp/olmsted: B-cell repertoire and clonal family tree explorer

We welcome reports of your experience and extension requests at github.com/matsengrp/ol... .

We are currently working on a version that can display heavy and light chain sequences.

4 months ago 1 0 0 0
Video

Thanks to Dave Rich in our group, our repertoire browser www.olmstedviz.org is now greatly updated.

* Data is loaded into your browser client-side: no install
* Interactive visualization of trees and amino acid mutations
* Tool ingests data in AIRR JSON format

We want to help you try it out!

4 months ago 3 0 1 0
Agentic Coding for Scientists (Dec 2025 edition). YouTube video by Erick Matsen

The era of coding agents is here. How do we approach this as scientists?

Wednesday Dec 10th at 9am PT I'll livestream an interactive demo of what I have learned (matsen.group/agentic.html) about how to leverage agentic coding to do rigorous science.

www.youtube.com/watch?v=Rbhs...

4 months ago 15 5 1 3

Outstanding articles. I’ve only just begun experimenting with the Claude and Gemini CLIs and they’re incredibly powerful. This is extremely valuable best-practice advice.

5 months ago 0 1 0 0
Agentic Coding For Scientists: A four-part series on using coding agents like Claude Code for scientific programming, covering fundamentals, workflows, best practices, and the human side of AI-assisted development.

The last five months with Claude Code have completely changed how we work.

matsen.group/agentic.html details:

• How agents work (& why it matters)
• Git Flow with agents
• Using agents for science
• The human-agent interface

Questions? What has your experience been?

5 months ago 14 6 2 1
Herbold Computational Biology Faculty & Labs

The Mahan postdoctoral fellowship offers 21 months of support to develop your own research with Fred Hutch computational biology faculty. There are lots of excellent labs to choose from!
Apply: apply.interfolio.com/172697
Faculty: www.fredhutch.org/en/research...

6 months ago 10 3 0 0
Post image

... and second is to have a map from the figures to where they are made in the associated "experiments" code repository (github.com/matsengrp/dn...):

6 months ago 2 0 0 0
Post image

I forgot to post two things I liked doing in this paper that I hope catch on. First is to have links in the methods section to the model fitting code (pointing to a tagged version, github.com/matsengrp/ne..., since the code continues to evolve):

6 months ago 1 0 1 0
Post image

Oh, and here is a picture of a cyborg-Darwin (cooked up by Gemini), after he realized how useful transformers are. For some reason MBE didn't want it as a cover image!

6 months ago 0 0 1 0

Many thanks to Kevin Sung and Mackenzie Johnson for leading the all-important task of data prep, Will Dumm for code and methods contributions, David Rich for structural work, and Tyler Starr, Yun Song, Phil Bradley, Julia Fukuyama, and Hugh Haddox for conceptual help.

6 months ago 0 0 1 0

We have positioned our group in this niche: we want to answer biological questions using ML-supercharged versions of the methods that scientists have been using for decades to derive insight.

More in this theme to come!

6 months ago 0 0 1 0
Stepping back, I think that transformers and their ilk have so much to offer fields like molecular evolution. Now we can parameterize statistical models using a sequence as an input!

6 months ago 0 0 1 0
netam/notebooks/dnsm_demo.ipynb at main · matsengrp/netam: Neural networks to model BCR affinity maturation.

If you want to give it a try, we have made it available using a simple `pretrained` interface. Here is a demo notebook. github.com/matsengrp/n...

6 months ago 0 0 1 0
Post image

And because natural selection is predicted for individual sequences, we can also investigate changes in selection strength as a sequence evolves down a tree:

6 months ago 0 0 1 0
Post image

Because this model isn't constrained to work with a fixed-width multiple sequence alignment we can do things like look at per-site selection factors on sequences with varying CDR3 length:

6 months ago 1 0 1 0

If a selection factor at a given site for a given sequence is

• > 1 that is diversifying selection
• = 1 that is neutral evolution (no selection)
• < 1 that is purifying selection.
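
The three cases above can be written as a small helper. This is only an illustrative sketch: the function name `classify_selection` and the tolerance `tol` are hypothetical and are not part of netam or the paper.

```python
def classify_selection(factor, tol=1e-9):
    """Label a per-site selection factor using the thresholds listed above."""
    if factor < 0:
        raise ValueError("selection factors are assumed to be non-negative")
    # Treat values within `tol` of 1 as neutral to avoid float noise.
    if abs(factor - 1.0) <= tol:
        return "neutral"
    return "diversifying" if factor > 1.0 else "purifying"


# Hypothetical per-site factors for one sequence (made-up numbers).
factors = [0.2, 1.0, 3.5]
labels = [classify_selection(f) for f in factors]
print(labels)
```

In practice an estimated factor is rarely exactly 1, so the tolerance (or a proper uncertainty estimate) matters more than the sketch suggests.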

6 months ago 0 0 1 0

The model is above. In many ways it is like a classical model of mutation and selection, but the mutation model is a convolutional model and the selection model is a transformer-encoder mapping from AA sequences to a vector of selection factors of the same length as the sequence.
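
A minimal sketch of how the two pieces might combine, assuming (as in classical mutation-selection models) that the per-site probability of an amino acid substitution is the neutral mutation probability multiplied by the selection factor. The function name, the input numbers, and the clipping to 1 are illustrative assumptions, not the netam implementation.

```python
def selected_substitution_probs(neutral_probs, selection_factors):
    """Scale per-site neutral substitution probabilities by selection factors.

    Both inputs are lists with one entry per amino acid site. In the real
    model these would come from the mutation network and the transformer
    encoder; here they are plain numbers supplied by the caller.
    """
    if len(neutral_probs) != len(selection_factors):
        raise ValueError("inputs must have one entry per site")
    # Clip at 1.0 so the result is still a valid probability (an assumption
    # made for this sketch).
    return [min(1.0, p * f) for p, f in zip(neutral_probs, selection_factors)]


# Hypothetical three-site example: neutral, purifying, diversifying.
probs = selected_substitution_probs([0.1, 0.2, 0.5], [1.0, 0.5, 3.0])
print(probs)
```

The point of the design is that the selection factors have the same length as the input sequence, so no fixed-width alignment is needed to pair them with per-site mutation probabilities.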

6 months ago 0 0 1 0
Post image

The final version of our transformer-based model of natural selection has come out in MBE. I hope some molecular evolution researchers find this interesting & useful as a way to express richer models of natural selection. doi.org/10.1093/mol... (short 🧵)

6 months ago 31 7 1 0
AI Engineer - Evolutionary Protein Language Models. Primary Work Address: 19700 Helix Drive, Ashburn, VA, 20147. Intro: AI@HHMI: HHMI is investing $500 million over the next 10 years ...

We are looking for an #AIEngineer to help build protein language models that capture evolutionary constraints with @matsen.bsky.social and @jbloomlab.bsky.social at #AI@HHMI @hhmijanelia.bsky.social
hhmi.wd1.myworkdayjobs.com/en-US/Extern...

7 months ago 11 6 0 0

Hats off to first author Kevin Sung www.linkedin.com/in/kevinsun... and the rest of the team 🙏 !

7 months ago 0 0 0 0
Peer review in "Thrifty wide-context models of B cell receptor somatic hypermutation": Convolutional embedding models efficiently capture wide sequence context in antibody somatic hypermutation, avoiding exponential k-mer parameter scaling and eliminating the need for per-site modeling.

I was very proud to get "The authors are to be commended for their efforts to communicate with the developers of previous models and use the strongest possible versions of those in their current evaluation" in peer reviews:

elifesciences.org/articles/10...

7 months ago 0 0 1 0
GitHub - matsengrp/thrifty-experiments-1

Pretrained models are available at github.com/matsengrp/n..., and the computational experiments are at github.com/matsengrp/t....

7 months ago 0 0 1 0

It's possible that the more complex models fail to dominate more decisively because of a lack of suitable training data, namely neutrally evolving out-of-frame sequences. We tried to augment the training data, with no luck.

7 months ago 1 0 1 0

The resulting models are better than 5-mer models, but only modestly so. We made many efforts to include a per-site rate but concluded that its effects were too weak for it to improve model performance.
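
For a sense of why simply widening a k-mer model is costly: if a model keeps one parameter per nucleotide context of width k (an assumption made here for illustration; actual k-mer models may store more per context), the number of contexts grows as 4^k. The helper name `kmer_context_count` is hypothetical.

```python
def kmer_context_count(k, alphabet_size=4):
    """Number of distinct k-mers over a nucleotide alphabet."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return alphabet_size ** k


# Print how quickly the count grows as the context widens.
for k in (3, 5, 7, 9, 11, 13):
    print(f"{k}-mer contexts: {kmer_context_count(k):,}")
```

A 5-mer model already has 1,024 contexts under this assumption, and every two extra bases of context multiplies that by 16.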

7 months ago 0 0 1 0