Posts by Yana Bromberg
Yana Bromberg at the Woodstock Night Science conference #TCTEAC @yanabromberg.bsky.social
Protein RNS scores differ from those of randomized sequences and also across protein subsets (structure vs disorder vs design)
While our focus is on protein embeddings, the same intuition applies to any domain using latent representations, from medical imaging to physics.
We are looking forward to hearing how your models improve with RNS!
Main findings:
• Variant effect prediction accuracy jumped from ~60% for low-reliability embeddings to ~90% for high-reliability embeddings.
• Hundreds to thousands of human proteins, per model, may be poorly captured.
• Our score flags low-reliability embeddings, guiding better model training and downstream fine-tuning.
I’m super excited to share our work (with Prabakaran Ramakrishnan) on scoring embedding reliability. We propose the RNS (random neighbors) score, which improves the next steps in model use, e.g. variant effect prediction, structure modeling, and function annotation.
www.biorxiv.org/content/10.1...
Are you using AI models on protein or DNA sequences? Did you maybe forget that their embeddings come with no obvious measure of reliability?
Imagine how sequence analysis would suffer if source sequences were sometimes subtly randomized. This is exactly what happens if we use low-quality embeddings.
Photo of a slide titled "Hot Tub Hypotheses" - featuring a photo of the conference hot tub and our Ten Simple Rules for Drawing Scientific Comics paper
Shout out to Hot Tub concocted papers at #psb2025! Our Ten Simple Rules for Scientific Comics arose from a hot tub hypothesis with @yanabromberg.bsky.social - "Hey, I bet we could write a paper on scientific comics!" @pacsym.bio
Do you have any 'hot tub' papers?
journals.plos.org/ploscompbiol...
Love all of it! The big question is: are you on this side of the pond yet?
Association of polygenic scores for neuropsychiatric traits with self-reported professions based on analysis of 420k individuals from UK Biobank and Million Veteran Program. Look at the 'arts & design' category. Artistic talent comes at a cost--a piece of your mind :)
Excited to share our latest work in #metagenome analysis. We built REMME/REBEAN to analyze and label sequencing #reads.
End result? We can label the activity of #microbiome #proteins with minimal homology to #enzymes we've seen before.
Great work by R. Prabakaran!
www.biorxiv.org/content/10.1...