Happy to share our latest perspective on T-cell epitope recognition predictions for cancer immunotherapy: rdcu.be/fdKu6
Posts by David Gfeller
A big thank you to the Harari group for their help with the experiments, and to Dana Moreno, Yan Liu, Julien and Giancarlo Croce for their contributions to the data analyses.
These results are fully consistent with our recent observation that a tool treating each TCR chain independently reaches similar or better prediction accuracy compared to other, often much more complex, TCR-epitope interaction predictors (see TEMPO: github.com/GfellerLab/T...).
Applications of these results to unseen epitopes in the IMMREP competition show how we can generate training data for a cost of ~$400 per epitope with the SEQTR protocol [Genolet et al. pubmed.ncbi.nlm.nih.gov/37159666/], instead of the ~$2000 required for single-cell TCR-seq.
Our results across several benchmarks and for hundreds of epitopes demonstrate that disregarding the pairing between TCR chains in the training of TCR-epitope recognition predictors has no impact on their performance, and that predictors can be trained on unpaired TCRb + TCRa sequences.
Here we address the question of how much information is encoded in the actual pairing between TCR chains, and whether unpaired TCRa + TCRb sequencing of epitope-specific T cells could be used for training predictors of TCR-epitope recognition.
Studies by us [Liu et al. www.biorxiv.org/content/10.1...] and others [Springer et al. pubmed.ncbi.nlm.nih.gov/33981311/] have demonstrated that the TCR alpha and TCR beta chains play a central (and on average equal) role in epitope recognition.
T-cell recognition of infected or cancer cells is elicited by the binding of heterodimeric TCRs to antigenic peptides displayed on MHC molecules.
Happy to share the work of Aisha Shah on training predictors of TCR-epitope recognition with unpaired TCRa + TCRb sequences: www.biorxiv.org/content/10.6....
Kudos to @dtadros.bsky.social for developing this website
In our hands, the TCR Specificity Profile framework (www.biorxiv.org/content/10.1101/2025.11.17.688817v1) is central not only for understanding the key determinants of TCR-epitope recognition, but also for assessing quality and reproducibility in new data.
Interested in understanding how epitope recognition specificity is encoded in your TCR sequences? Try our new interactive page for building TCR specificity profiles on the TCR Motif Atlas: tcrmotifatlas.unil.ch/Building_mot....
Thanks to the valuable help of @benoitduc.bsky.social and @johannajoyce.bsky.social, the proposed markers for this population could be validated, thereby paving the way to further functional characterizations.
Applying this approach to large multimodal atlases from blood and tumor samples identifies interferon-primed monocytes and macrophages in the circulation and in the tumor microenvironment.
With SuperCell2.0, Leonard Herault shows how multimodal metacells can help reduce the size and sparsity of single-cell multiomic datasets, while preserving, and even sometimes enhancing, the biological information.
Single-cell multiomic datasets are getting larger and larger in numbers of cells, but the measurements for each cell remain as sparse as 10 years ago...
Happy to share the beautiful manuscript of Leonard Herault: www.biorxiv.org/content/10.6...
The fact that most CDR3 residues are determined by V/J choices (including many that directly contact the epitope in crystal structures) is fully consistent with our recent observation that epitope recognition specificity is primarily encoded in V/J usage (www.biorxiv.org/content/10.1...).
We then leveraged these results for quality assessment of TCR repertoire data by monitoring inconsistencies between CDR3 and V/J gene annotation. This analysis revealed different sources of noise in repertoires of TCRs of both undetermined and known specificity.
Similarly, batch effects resulting in different V/J gene usage across TCR repertoires will lead to apparent enrichment in specific CDR3 motifs or k-mers.
This has important consequences for interpreting CDR3 sequence patterns. For instance, any constraint on CDR1/CDR2 residues (e.g., recognition of a specific epitope) will impact multiple residues in CDR3 sequences.
Our results demonstrate that CDR3 length is strongly influenced by the number of germline-encoded CDR3 residues in V and J genes, and that on average 80% of CDR3α and 65% of CDR3β residues are determined by V and J gene usage.
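To make the idea of "germline-encoded CDR3 residues" concrete, here is a minimal Python sketch. It is not the method used in the manuscript: it simply counts how many residues at the start of a CDR3 match the end of a V segment and how many at its end match the start of a J segment. The function names and the V/J strings are hypothetical toy values chosen for illustration, not real gene annotations, and a real analysis would work from nucleotide-level alignments, since a residue can match the germline by chance.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading positions at which a and b carry the same residue."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def germline_fraction(cdr3: str, v_germline: str, j_germline: str) -> float:
    """Fraction of CDR3 residues matching the V germline (from the start)
    or the J germline (from the end). Amino-acid level approximation."""
    if not cdr3:
        raise ValueError("CDR3 sequence must not be empty")
    v_len = common_prefix_len(cdr3, v_germline)
    # Match J from the C-terminal end, skipping residues already assigned to V.
    j_len = common_prefix_len(cdr3[v_len:][::-1], j_germline[::-1])
    return (v_len + j_len) / len(cdr3)


# Toy inputs (assumptions for illustration only, not real V/J segments).
V_GERMLINE = "CASSL"    # hypothetical germline-encoded start of the CDR3
J_GERMLINE = "NTEAFF"   # hypothetical germline-encoded end of the CDR3

fraction = germline_fraction("CASSPGQGNTEAFF", V_GERMLINE, J_GERMLINE)
print(f"{fraction:.2f}")
```

In this toy case 4 residues match the V string and 6 match the J string, so 10 of the 14 CDR3 residues are counted as germline-encoded. Averaging such a fraction over a repertoire is one simple way to picture the kind of percentage quoted above.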
Here we precisely quantify the impact of V/J choices on CDR3 length and amino acid composition.
CDR3 loops are known to mediate important interactions with the epitope. For this reason, many (most?) studies have focused on CDR3 sequences when analyzing TCR repertoires.
TCR repertoires are characterized by a very high sequence diversity resulting from different choices of V, (D) and J genes and rearrangements taking place at the V(D)J junction within the CDR3 loops.
Very proud of the work of Dana Moreno on statistical modelling of CDR3 sequences in TCR repertoires: www.biorxiv.org/content/10.6....
Nice results: rdcu.be/eTNVr. Looks like the decision to develop specific ML models of TCR-epitope interactions for each epitope was a reasonable choice.
For those who followed the IMMREP25 competition and are interested in understanding how specificity was encoded for each epitope, check out our TCR Motif Atlas IMMREP25 page: tcrmotifatlas.unil.ch/Browse_epito...
A huge thank you to all collaborators, including Yan Liu, Giancarlo Croce, Dana Moreno, Daniel Tadros, Julien Racle, Anne-Christine Thierry, Petra Baumgartner, Alexandra Michel, Vincent Zoete and Alexandre Harari