Chakravarthi Kanduri (@chakri) Bsky

AIRR-ML-25: Adaptive Immune Profiling Challenge. Predict labels (e.g. disease, healthy) from sets of immune receptor sequences, and identify the sequences that explain the labels.

Ready to make your mark?
Accept the challenge 👇 🔗: kaggle.com/competitions...
#AIRR #Competition #DeepLearning #ComputationalBiology
@victorgreiff.bsky.social @chakri.bsky.social

5 months ago 2 1 0 0

📢 We are announcing the Adaptive Immune Profiling Challenge 2025!
Can you predict immune state labels from adaptive immune receptor repertoires?
💰 $10,000 prize pool!
🗓️ Launches Nov 5 on @kaggle.com
More Info: uio-bmi.github.io/adaptive_imm...

5 months ago 25 13 1 1

12/12 🙏 Thanks to all collaborators & co-authors for useful input, brainstorming and perspectives: Maria Mamica, Emilie Willoch Olstad, Ingrid Hobæk Haff, @manuelazucknick.bsky.social, Jingyi Jessica Li & Geir Kjetil Sandve.

8 months ago 0 0 0 0

Beware of counter-intuitive levels of false discoveries in datasets with strong intra-correlations - Genome Biology The false discovery rate (FDR) controlling method by Benjamini and Hochberg (BH) is a popular choice in the omics fields. Here, we demonstrate that in datasets with a large degree of dependencies betw...

11/12 The bottom line: be aware of dependencies in your data! When false findings occur in highly correlated datasets, they can be numerous. Don't let your intuition fool you. Read the full open-access paper here: doi.org/10.1186/s130...

8 months ago 0 0 1 0

10/12 As a safer alternative, consider the Benjamini-Yekutieli (BY) method when you can tolerate a bit more type II error. It doesn't completely eliminate the issue but makes these large false positive events much less frequent and severe (a good compromise between BH and FWER-controlling methods).

8 months ago 0 0 1 0
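To make the BH vs. BY difference concrete, here is a minimal sketch in plain Python. It assumes the standard textbook definitions: both are step-up procedures, and BY divides the significance level by the harmonic sum c(m) = 1 + 1/2 + ... + 1/m. The function name `step_up_rejections` and the example p-values are illustrative and not taken from the paper.

```python
def step_up_rejections(pvals, alpha=0.05, by_correction=False):
    """Count rejections for the BH step-up rule, or BY if by_correction=True."""
    m = len(pvals)
    # BY divides alpha by the harmonic sum c(m); BH uses c(m) = 1.
    c_m = sum(1.0 / i for i in range(1, m + 1)) if by_correction else 1.0
    k_max = 0
    for k, p in enumerate(sorted(pvals), start=1):
        if p <= alpha * k / (m * c_m):
            k_max = k  # largest rank whose p-value passes its threshold
    return k_max


# Illustrative p-values (made up for this example).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("BH rejections:", step_up_rejections(pvals))
print("BY rejections:", step_up_rejections(pvals, by_correction=True))
```

On these ten p-values BH rejects two hypotheses and BY rejects one, which shows the extra conservatism BY pays for its guarantee under arbitrary dependence.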

9/12 Use negative controls/synthetic null data and other diagnostic checks, as recommended in the article, to identify and minimize caveats. If you continue to use the BH method, make sure you understand its assumptions and formal guarantees so that you interpret the findings correctly.

8 months ago 0 0 1 0
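One way to build a synthetic null is to shuffle the sample labels and rerun the same analysis, since shuffling breaks any real label-feature association while keeping the correlations between features intact. The sketch below assumes exactly that setup; `permuted_null_counts` and the `count_discoveries` callback are hypothetical names standing in for your own pipeline, and the article may recommend other constructions as well.

```python
import random


def permuted_null_counts(features, labels, count_discoveries, n_perm=100, seed=0):
    """Rerun an analysis on label-shuffled data and collect discovery counts.

    count_discoveries(features, labels) is a user-supplied (hypothetical)
    function that runs the full pipeline, including multiple-testing
    correction, and returns the number of significant features.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_perm):
        shuffled = list(labels)  # copy so the caller's labels stay untouched
        rng.shuffle(shuffled)
        counts.append(count_discoveries(features, shuffled))
    return counts
```

If a noticeable share of the shuffled runs report hundreds of discoveries, a similar count on the real labels deserves extra scrutiny.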

8/12 Issues like broken test assumptions, study biases, or the researcher’s flexibility in analyzing the data can make this problem even worse. So, what can you do? We suggest a few key strategies:

8 months ago 0 0 1 0

7/12 This statistical artefact can lead researchers to incorrectly conclude the existence of an underlying biological mechanism, which might even form the main conclusion of their study.

8 months ago 0 0 1 0

6/12 It feels intuitive to believe that if hundreds or thousands of features are flagged as significant, at least some of them must be real. However, we show this intuition can be wrong; it's possible that every single finding is false.

8 months ago 0 0 1 0

5/12 A Counter-Intuitive Trap: Using real-world and simulated data (methylation, gene expression, metabolite and eQTL analyses), we found this phenomenon to be persistent. The primary danger is that researchers may be misled by the sheer volume of these false findings.

8 months ago 1 0 1 0

4/12 This happens because dependencies in the data can cause many features to falsely appear significant together. While the FDR is controlled on average (e.g., with no true signal, fewer than 5% of experiments contain any false discovery), the experiments that do have errors can have thousands of them.

8 months ago 0 0 1 0
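A small simulation can illustrate the effect described in 3/12 and 4/12. The sketch below is not the paper's simulation design: it assumes equicorrelated z-scores driven by one shared factor (correlation `rho` between any two features), with every null hypothesis true, and applies BH to two-sided p-values. All names and parameter values are illustrative.

```python
import math
import random


def bh_num_rejections(pvals, alpha=0.05):
    """Number of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    k_max = 0
    for k, p in enumerate(sorted(pvals), start=1):
        if p <= alpha * k / m:
            k_max = k  # keep the largest rank that passes its threshold
    return k_max


def simulate_null(n_sims=200, m=500, rho=0.9, alpha=0.05, seed=1):
    """BH rejection counts per simulated experiment when all nulls are true."""
    rng = random.Random(seed)
    counts = []
    for _sim in range(n_sims):
        shared = rng.gauss(0.0, 1.0)  # factor shared by all features
        pvals = []
        for _feature in range(m):
            noise = rng.gauss(0.0, 1.0)
            # Each z is marginally N(0, 1); any two z's have correlation rho.
            z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * noise
            pvals.append(math.erfc(abs(z) / math.sqrt(2.0)))  # two-sided p
        counts.append(bh_num_rejections(pvals, alpha))
    return counts


if __name__ == "__main__":
    counts = simulate_null()
    hits = [c for c in counts if c > 0]
    print(f"experiments with any false discovery: {len(hits)} of {len(counts)}")
    if hits:
        print(f"most false discoveries in one experiment: {max(hits)}")
```

Since no feature carries real signal here, every rejection is a false discovery. Comparing runs with `rho=0.0` and `rho=0.9` shows how dependence changes the number of rejections within the experiments that have any.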

3/12 Even when a study has no true biological signal (all null hypotheses are true), the BH method can occasionally generate thousands of statistically "significant" findings.

8 months ago 0 0 1 0

2/12 The widely used False Discovery Rate (FDR) control method, Benjamini-Hochberg (BH), is a staple in omics research. But when analysing datasets with dependencies between features (like gene expression, methylation, metabolites, QTL analyses and more), it can behave unexpectedly.

8 months ago 0 0 1 0

1/12 🧵 A heads-up for anyone working with high-dimensional omics data and multiple testing: FDR correction can sometimes lie to you. This thread is based on our recent publication in @GenomeBiology: doi.org/10.1186/s130...

8 months ago 1 0 1 0

Happy to share our new collaborative work! 🚨 We analyzed 2250 TCR repertoires to uncover how HLA risk alleles shape immune autoreactivity in T1D. T1D-specific HLA-motifs were also validated in pancreatic lymph nodes (pLN) & spleen. 🧬🧵 #T1D #Immunology #TCR 1/9

1 year ago 36 9 1 1

Machine learning on immune receptors/repertoires is an exponentially expanding field, but labeled data to benchmark ML methods are missing. We address this need with our new simulation framework LIgO. www.biorxiv.org/content/10.1...
Led by @mchernigovskaya.bsky.social and Milena Pavlović.

2 years ago 3 5 1 0
