Become a colleague in my department!
Applications close in four days, December 15.
Posts by Claire McWhite
Alignment of each motif concept peaks at the position of that motif in the protein. We also detect a few motifs absent from individual databases, though these are typically annotated in other databases. 4/4
Building on the idea of Concept Activation Vectors from arxiv.org/pdf/1711.11279 3/4
We take embeddings of protein fragments w/ and w/o a motif, train a simple linear classifier, and use the normal vector to the decision boundary as the “motif direction.” So for motif detection, all you need is a dictionary of learned motif concept vectors, and a PLM to embed the protein with. 2/4
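A rough sketch of the recipe above, not the preprint's actual code. It assumes synthetic vectors in place of real PLM embeddings and a hand-rolled logistic regression as the "simple linear classifier"; `fake_embed`, `train_linear_classifier`, `DIM`, `SHIFT`, and `MOTIF_POS` are hypothetical names made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 32     # hypothetical embedding size; real PLM embeddings are much larger
SHIFT = 4.0  # assumed strength of the motif signal in this toy data

# Assumption: motif-containing fragments are displaced along one hidden direction.
# A real pipeline would get these vectors by embedding fragments with a PLM.
true_direction = rng.normal(size=DIM)
true_direction /= np.linalg.norm(true_direction)

def fake_embed(n, has_motif):
    """Return n synthetic fragment embeddings, shifted if the motif is present."""
    base = 0.5 * rng.normal(size=(n, DIM))
    return base + (SHIFT * true_direction if has_motif else 0.0)

X = np.vstack([fake_embed(200, True), fake_embed(200, False)])
y = np.concatenate([np.ones(200), np.zeros(200)])

def train_linear_classifier(X, y, lr=0.1, steps=500):
    """Logistic regression by plain gradient descent; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(motif)
        grad = p - y                            # gradient of log loss w.r.t. logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

w, b = train_linear_classifier(X, y)

# The unit normal to the decision boundary serves as the "motif direction".
motif_direction = w / np.linalg.norm(w)

# Scan a toy protein: one embedding per position, motif planted at MOTIF_POS.
MOTIF_POS = 40
protein = 0.5 * rng.normal(size=(100, DIM))
protein[MOTIF_POS] += SHIFT * true_direction

alignment = protein @ motif_direction  # alignment profile along the sequence
peak = int(np.argmax(alignment))
print("peak position:", peak)
```

With a dictionary of such vectors (one per motif), detection reduces to one dot product per position per motif, and the peak of each alignment profile gives the location.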
Alignment profiles of several motifs scanning across a protein. Each line represents a concept activation vector from a different layer.
Vision models have directions in embedding space for concepts like “stripes” or “corgi”.
We show that protein language models have directions for motifs, and use this as a new way to detect and localize motifs!
New preprint w/ @ahmadshamail.bsky.social
Feedback very welcome arxiv.org/abs/2511.21614 1/4