Do you have any thoughts on ipsae_min/max when using Boltz? This figure was interesting because for Boltz, ipsae_max does better. I think min makes more intuitive sense, but I'm not sure what to make of the results here.
I would love to use AF3 ipsae_min, but the AF3 daily limits make it impractical.
[Plot: increasing number of papers on de novo protein design over time]
We’re getting more and more papers on de novo protein design, but I still don’t really know what it means
What’s your definition of de novo design?
We've updated the Raygun preprint with additional validations (more fluorescence assays, biotin ligase reengineering, EGF optimization, etc.). Take a look!
Here's the preprint: www.biorxiv.org/content/10.1...
Congratulations to @youngsuko9.bsky.social on their top-10 placement in Adaptyv Bio's EGFR-binder competition!
It's an elegant pipeline: using our protein-design method Raygun to expand and modify the template (EGF), then filtering with ProTrek.
foundry.adaptyvbio.com/competition
1/
This is really horrific. These single-cell reference atlases are widely used as-is to train all kinds of models! This is one of the reasons I've been constantly harping about uniform reprocessing & extremely careful QC of large atlases. 1/
@jeremyparkeryang.bsky.social
Don’t let the v*rtual cell mafia see this
So OGT can act as a proxy for thermostability. I feel like there’s a lot of potential in trying to find viable proxies for other important properties. But finding these proxies seems non-trivial.
I prefer mean-pooling because it saves a lot of disk space and felt faster to train. But I realized I mean-pool partly because I noticed it was the status quo, without really asking why.
So I've just been trying to figure out when and how using the [seq, dim]-shaped embeddings can be good.
Does that still apply if you don't flatten the representation from the start? Like in this paper, where the authors use [dim, seq len] embeddings as inputs to a model with 1D convolutions and attention mechanisms, and argue it can extract more information than mean-pooling.
academic.oup.com/bioinformati...
The paper mentions that embeddings need to be compressed for most downstream tasks.
If you aren’t concerned about the computational requirements, would you expect using the uncompressed embeddings as model input to be better than using compressed ones?
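The tradeoff this thread is circling can be sketched in a few lines of numpy (toy shapes and random values; the dimensions are illustrative, not from any specific model in the posts). Mean-pooling collapses the per-residue [seq, dim] matrix to a fixed-size [dim] vector, which is cheap to store and easy to feed to a simple head, but it discards all positional information that a 1D conv or attention model over the full matrix could still use:

```python
import numpy as np

# Toy per-residue embedding, shape [seq_len, dim] (e.g. from a protein
# language model; 128 residues x 320 dims here are arbitrary choices)
seq_len, dim = 128, 320
per_residue = np.random.default_rng(0).normal(size=(seq_len, dim))

# Mean-pooling: average over the sequence axis -> fixed-size [dim] vector.
# Storage drops by a factor of seq_len; positional info is discarded.
mean_pooled = per_residue.mean(axis=0)

# Channels-first view [dim, seq_len], the layout a 1D conv layer would
# consume, keeping position-specific information intact.
conv_input = per_residue.T

print(per_residue.shape)  # (128, 320)
print(mean_pooled.shape)  # (320,)
print(conv_input.shape)   # (320, 128)
```

Nothing here settles which input is better for a given task; it just makes concrete what the compression step throws away.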
Super excited to share our review on genomic deep learning models for non-coding variant effect prediction, with Ayesha Bajwa and Nilah Ioannidis. We’d like this review to be a useful resource, and welcome any feedback, comments, or questions! 1/4
arxiv.org/abs/2411.11158
Hello 🦋 #protein / #microbio / #BioML community! We are excited to release Gaia🌎, a context-aware protein search tool, extending protein search and discovery capabilities beyond sequence and structure to include *genomic context*. Search your favorite protein sequences on gaia.tatta.bio