I hope you've found this post helpful.
Follow me for more.
Subscribe to my FREE newsletter, chatomics, to learn bioinformatics: divingintogeneticsandgenomics.ck.page/profile
Posts by Ming Tommy Tang
6. [R Graphics Cookbook](www.cookbook-r.com/Graphs/) by Winston Chang.
3. Data visualization resources: sabahzero.github.io/dataviz/res...
1. Nature Methods Points of View columns on data visualization: blogs.nature.com/methagora/2... The columns on color mapping and heatmaps are very nice.
Data visualization is a critical step in data analysis. 8 links to bookmark for better data visualization: 🧵
OmicClaw: executable and reproducible natural-language multi-omics analysis over the unified OmicVerse ecosystem www.biorxiv.org/content/10....
You’re not just analyzing data.
You’re cleaning up after biology.
And sometimes, after people.
That’s the job.
And it matters.
15/
Action items:
Double-check your metadata
Validate sample swaps with NGSCheckMate or Somalier
Never assume your data is clean
Talk to the wet lab
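A quick way to start on "double-check your metadata" is to cross-check the sample IDs in the sample sheet against the columns of the count matrix. This is a minimal sketch, not a full validation pipeline. The function name `check_metadata` and all sample IDs are made up for illustration.

```python
from collections import Counter


def check_metadata(metadata_ids, matrix_ids):
    """Compare sample IDs in a metadata sheet against count-matrix columns."""
    counts = Counter(metadata_ids)
    return {
        # IDs listed more than once in the metadata sheet
        "duplicated": sorted(s for s, n in counts.items() if n > 1),
        # IDs in the metadata that have no column in the matrix
        "missing_from_matrix": sorted(set(metadata_ids) - set(matrix_ids)),
        # Matrix columns that nobody described in the metadata
        "missing_from_metadata": sorted(set(matrix_ids) - set(metadata_ids)),
    }


# Hypothetical sample IDs, for illustration only.
metadata_ids = ["ctrl_1", "ctrl_2", "ctrl_3", "kd_1", "kd_2", "kd_2"]
matrix_ids = ["ctrl_1", "ctrl_2", "ctrl_3", "kd_1", "kd_2", "kd_3"]

report = check_metadata(metadata_ids, matrix_ids)
print(report)
```

Here `kd_2` is entered twice and `kd_3` has data but no metadata row, which is exactly the kind of thing to take back to the wet lab.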
14/
Key takeaways:
Expect human error
PCA is a sanity tool, not just a plot
Use SNP-based tools to verify identity
Always question your input
13/
Bioinformatics isn’t just coding.
It’s detective work.
And your best tools are skepticism and common sense.
12/
These tools compare genetic fingerprints.
If your RNA and DNA don’t match, you’ll catch the swap before it's too late.
11/
Or try Somalier by Brent Pedersen from Aaron Quinlan’s lab (yes, that bedtools legend).
github.com/brentp/soma...
10/
Use genotyping to confirm sample identity.
I’ve used NGSCheckMate:
It matches RNA and DNA using SNP profiles.
github.com/parklab/NGS...
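To make the idea of matching by SNP profile concrete, here is a toy sketch. It is not NGSCheckMate's actual algorithm. It only counts how often two samples share the same genotype call at common SNP sites. The function names, SNP IDs, sample names, and the 0/1/2 genotype coding (alt-allele count) are all assumptions for illustration.

```python
def genotype_concordance(calls_a, calls_b):
    """Fraction of shared SNP sites where two samples have the same genotype call."""
    shared = [site for site in calls_a if site in calls_b]
    if not shared:
        raise ValueError("no shared SNP sites to compare")
    matches = sum(calls_a[site] == calls_b[site] for site in shared)
    return matches / len(shared)


def best_match(rna_calls, dna_samples):
    """Return the DNA sample with the highest concordance, plus all scores."""
    scores = {
        name: genotype_concordance(rna_calls, calls)
        for name, calls in dna_samples.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical genotype calls: 0 = hom ref, 1 = het, 2 = hom alt.
rna_patient_a = {"snp1": 0, "snp2": 1, "snp3": 2, "snp4": 1, "snp5": 0, "snp6": 2}
dna_samples = {
    "patient_A": {"snp1": 0, "snp2": 1, "snp3": 2, "snp4": 1, "snp5": 0, "snp6": 2},
    "patient_B": {"snp1": 1, "snp2": 1, "snp3": 0, "snp4": 2, "snp5": 0, "snp6": 1},
}

best, scores = best_match(rna_patient_a, dna_samples)
print(best, scores)
```

If the RNA sample labeled patient A matched patient B's DNA best, that would be the swap signal. Real tools work from many more sites and handle coverage and allele-specific expression, which this sketch ignores.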
9/
And what about multi-omics?
You assume RNA-seq and DNA-seq came from the same sample?
Be careful.
8/
If controls and knockdowns overlap in PCA?
That’s not subtle biology.
That’s a red flag.
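The same sanity check can be sketched without a plot. This is not PCA itself, just a simpler correlation-based stand-in for the same question: does each sample look more like its own group than the other one? The function names, expression values, and labels below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes neither is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def flag_group_mismatches(profiles, labels):
    """Flag samples whose average correlation is higher with another group than their own."""
    flagged = []
    for name, profile in profiles.items():
        by_group = {}
        for other, other_profile in profiles.items():
            if other != name:
                by_group.setdefault(labels[other], []).append(pearson(profile, other_profile))
        avg = {group: sum(v) / len(v) for group, v in by_group.items()}
        if max(avg, key=avg.get) != labels[name]:
            flagged.append(name)
    return flagged


# Hypothetical expression profiles over four genes.
profiles = {
    "ctrl_1": [10.0, 2.0, 8.0, 1.0],
    "ctrl_2": [9.0, 3.0, 7.5, 1.5],
    "ctrl_3": [10.5, 2.5, 8.5, 0.5],
    "kd_1": [2.0, 9.0, 1.0, 8.0],
    "kd_2": [3.0, 8.5, 1.5, 7.0],
    "kd_3": [9.5, 2.0, 8.0, 1.0],  # labeled knockdown, but looks like a control
}
labels = {name: ("control" if name.startswith("ctrl") else "knockdown") for name in profiles}

flagged = flag_group_mismatches(profiles, labels)
print(flagged)
```

A flagged sample is not proof of a swap. It is a reason to go back to the tube labels and the metadata.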
7/
Got 3 control vs 3 knockdown?
Check the gene you silenced.
If it’s not downregulated in the knockdowns—stop.
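That check takes a few lines. A minimal sketch, assuming you already have normalized expression values for the target gene in each sample. The function names, the values, the pseudocount, and the log2 fold-change cutoff of -1 are illustrative assumptions, not recommended defaults.

```python
import math
from statistics import mean


def log2_fold_change(control, knockdown, pseudocount=1.0):
    """log2 of mean knockdown over mean control, with a pseudocount to avoid log(0)."""
    return math.log2((mean(knockdown) + pseudocount) / (mean(control) + pseudocount))


def knockdown_looks_real(control, knockdown, max_log2fc=-1.0):
    """True only if the target gene is clearly lower in every knockdown sample."""
    lfc = log2_fold_change(control, knockdown)
    every_kd_below_controls = max(knockdown) < min(control)
    return lfc <= max_log2fc and every_kd_below_controls


# Hypothetical normalized expression of the silenced gene.
control = [100.0, 110.0, 95.0]
good_kd = [20.0, 25.0, 30.0]
suspect_kd = [22.0, 105.0, 28.0]  # one "knockdown" sample looks like a control

print(knockdown_looks_real(control, good_kd))
print(knockdown_looks_real(control, suspect_kd))
```

The per-sample comparison matters: a group average can still look reduced when one of three samples is mislabeled.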
6/
So how do you fix it?
Use your brain.
Sanity check everything.
Never trust metadata blindly.
5/
Even gold-standard datasets like TCGA have sample swaps.
Human error doesn’t disappear at scale. If anything, mistakes get easier to make when you are tracking thousands of samples.
4/
You stare at that PCA.
Two knockdowns look like controls.
Three controls are drifting off axis.
The data isn't lying. It’s telling you something’s off.
3/
Mislabeling happens.
Cells get mixed.
Samples are swapped.
And your PCA plot?
Suddenly it makes no sense.
2/
Wet lab scientists are not spreadsheets.
They pipette, label, freeze, and extract.
Sometimes in a rush.
Sometimes while tired.
1/ Biological data isn’t just messy.
Humans generate it.
And humans make mistakes.
As a bioinformatician, this will be your reality 🧵