Arun Das (@arun-das) Bsky

Congratulations! Vikram Shivakumar successfully defended his dissertation “Scalable Sequence Analysis Using Compressed Pangenome Indexing” under the guidance of advisor Ben Langmead. Vikram plans to pursue a joint postdoctoral fellowship at the European Bioinformatics Institute, the Wellcome Sanger Institute, and the University of Cambridge. We in the department are extremely proud of our students who have successfully completed their PhD. Congratulations on this achievement and best wishes as you begin an exciting new phase of life!

Congratulations, @vikramshivakumar.bsky.social!

1 month ago 15 5 0 1

1/ Excited to share my first first-author preprint from my PhD!

We introduce Perseus, a lineage-aware confidence estimation framework for taxonomic classification in long-read metagenomics.

Preprint: www.biorxiv.org/content/10.6...
Code: github.com/matnguyen/Pe...

1 month ago 14 8 1 0

If they are doing this to white people in broad daylight imagine what they are doing to non-white people in detention facilities.

2 months ago 6776 1940 65 38

Genetic Data From Over 20,000 U.S. Children Misused for ‘Race Science’

🚨🚨🚨 "At least 63 times since 2007, data from some of the 28 human genomic repositories that the N.I.H. controls was improperly released to researchers, used for unapproved purposes or made vulnerable to theft..." (Gift Link) www.nytimes.com/2026/01/24/u...

2 months ago 359 207 4 32

Common variation in meiosis genes shapes human recombination and aneuploidy - Nature
Analysis of data from pre-implantation genetic testing sheds light on the genetic basis of meiotic-origin aneuploidy, the leading cause of human pregnancy loss, identifying common genetic variants ass...

Pregnancy loss is common in humans, and chromosomal abnormalities are the leading cause. Using genetic data from ~140,000 IVF embryos, we show that maternal variation in meiosis genes influences recombination and aneuploidy risk.

First authors: @saracarioscia.bsky.social & @aabiddanda.github.io

3 months ago 121 55 1 5

Appalling. This whole situation has been handled so poorly, without any care for the community at Brown, the community and people that live in and around the university, and the larger Providence community.

4 months ago 3 0 1 0

Some of the best moments of my life were on that exact block.

I lived across the street from where this has happened. We spent hundreds and hundreds of hours in those buildings. I met the most important people in my life there.

I can’t imagine what the people there are going through now.

4 months ago 2 0 0 0

Just awful, awful news.

Sending all my love and thoughts to those affected, and to the many on campus who never should have had to experience this.

4 months ago 3 0 1 0

DNA scientist James Watson has a remarkably long history of sexist, racist public comments
“People say it would be terrible if we made all girls pretty,” he said in 2003. “I think it would be great.”

Hey folks, as news of Watson's demise spreads, please don't set aside his weighty legacy of misogyny and racism. He was truly among the worst of us. www.vox.com/2019/1/15/18...

5 months ago 2316 875 89 210

Johns Hopkins researchers to present at Genome Informatics 2025
Students from the Department of Computer Science will give talks and present posters on their research in genome informatics.

Check out the cool work being presented by our students at Genome Informatics! This year’s event was co-organized by @benlangmead.bsky.social, & features talks from @vikramshivakumar.bsky.social & more, plus posters from @alexsweeten.bsky.social, @maojanlin.bsky.social, & @sinamajidian.bsky.social:

5 months ago 5 3 0 2

Figure 1: (A) Anchor-based merging requires a common sequence (red) present in each partition. Multi-MUMs are merged by identifying overlaps between partition-specific matches in the anchor coordinate space, and a uniqueness threshold determines if a MUM is still unique in each partition after truncation. (B) String-based merging enables computation of multi-MUMs between partitions without a common sequence. An example tree (left) is shown, highlighting the use case where partial multi-MUMs specific to internal nodes (starred) can be computed by merging subclade-based partitions up a tree. (right) MUM overlaps are computed by running Mumemto on the MUM sequences, and the uniqueness threshold array ensures overlaps remain unique across the merged dataset. (C) An example Burrows-Wheeler Transform (BWT), matrix (BWM), and Longest Common Prefix (LCP) array, with sequence IDs for each suffix shown (ID). A non-maximal unique match (UM) is shown, and the uniqueness threshold for this match is found using the flanking LCP values. (D) A partial multi-MUM (in blue) is found in all-but-one sequence (excluded in red). Using two anchor sequences (red and orange), all-but-one partial MUMs can be computed using an augmented anchor-based merging method (section 2.6).

Fantastic talk by @vikramshivakumar.bsky.social on Mumemto: scalable multi-MUM finding for pangenomes
Papers: biorxiv.org/content/10.1101/2025.05.20.654611 & doi.org/10.1186/s13059-025-03644-0
Code: github.com/vikshiv/mume...
Very efficient pangenome visualization tool, revealing synteny and variations!

5 months ago 23 12 1 1
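
The uniqueness-threshold idea in panel (C) of the figure caption can be sketched on a toy string. This is a naive illustration, not Mumemto's implementation: the function names are hypothetical, the suffix array is built by plain sorting, and a single `$`-terminated string stands in for a multi-sequence collection. It assumes the standard fact that the shortest unique prefix of a suffix is one character longer than the larger of its two flanking LCP values.

```python
def suffix_array(text):
    """Return suffix start positions in lexicographic order (naive sort)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """lcp[r] = common-prefix length of the suffixes at ranks r-1 and r; lcp[0] = 0."""
    lcp = [0] * len(sa)
    for r in range(1, len(sa)):
        a, b = text[sa[r - 1]:], text[sa[r]:]
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        lcp[r] = n
    return lcp


def bwt(text, sa):
    """BWT: the character preceding each sorted suffix (index -1 wraps to the terminator)."""
    return "".join(text[i - 1] for i in sa)


def uniqueness_threshold(lcp, rank):
    """Minimum length at which the suffix at this rank stops matching either neighbor."""
    left = lcp[rank]
    right = lcp[rank + 1] if rank + 1 < len(lcp) else 0
    return max(left, right) + 1


text = "banana$"                 # toy input; '$' is a unique terminator
sa = suffix_array(text)          # [6, 5, 3, 1, 0, 4, 2]
lcp = lcp_array(text, sa)        # [0, 0, 1, 3, 0, 0, 2]
print(bwt(text, sa))             # annb$aa
# Rank 3 is the suffix "anana$": "ana" occurs twice, "anan" only once.
print(uniqueness_threshold(lcp, 3))  # 4
```

A match that is truncated below this threshold is no longer guaranteed to be unique, which is the check the caption describes when merging matches across partitions.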

The human cost of the post-9/11 wars on people in Afghanistan, Pakistan, Iraq, Syria and Yemen. Between indirect and direct deaths, 4.5-4.7 million people have died, and tens of millions have been displaced.

Today, for no particular reason at all, it is worth sharing this, as a reminder of what one man's lies can do.

Taken from this resource from my alma mater: costsofwar.watson.brown.edu

(Specific page is: costsofwar.watson.brown.edu/costs/human/...)

5 months ago 0 0 0 0

A complete diploid human genome benchmark for personalized genomics
Human genome resequencing typically involves mapping reads to a reference genome to call variants; however, this approach suffers from both technical and reference biases, leaving many duplicated and ...

Delighted to finally announce a preprint describing the Q100 project! “A complete diploid human genome benchmark for personalized genomics” For which we finished HG002 to near-perfect accuracy: www.biorxiv.org/content/10.1... 🧵[1/14]

6 months ago 97 57 4 4

I just cannot see academic institutions, medical facilities and so many other employers stumping up $100K.

It’s disheartening, both to see this happen and to see how few people who rely on and employ individuals on H-1B visas are willing to speak up and address this. Your silence is deafening.

7 months ago 2 0 0 0

Make no mistake, this remains catastrophic.

It makes it near impossible for people to stay and work in the US, no matter how qualified they are, unless they work for a handful of extremely wealthy companies.

For so many of us, it’s time to make other plans.

7 months ago 2 1 1 0

Small victories, but this doesn’t seem to apply to those currently on an H-1B visa.

Wish it was made clear in the initial “proclamation”, before we spent the entire day panicking while trying to figure out a way to get a friend back to the US before midnight.

7 months ago 2 0 2 0

This is catastrophic.

So, so many people I know and love are going to find it impossible to stay and work in the US, and it makes it almost impossible for people like me to stay and work here in the longer term, no matter how qualified we are.

7 months ago 0 0 0 1

New blog post – A quick look at Roche's SBX
lh3.github.io/2025/09/11/a...

7 months ago 57 30 2 3

The concessions made by Brown endanger so many members of our community on campus, limit the access to higher education for individuals from underrepresented backgrounds, and undermine the serious conversations about major issues on campus.

Just so disappointing to see them go down without a fight.

8 months ago 1 0 0 0

Brown University Makes a Deal With the White House to Restore Funding

Brown have agreed to do all this without any sort of legal challenge, and have agreed to these terms without consulting their staff, students or alumni.

Here's a link to the article that should bypass the paywall, if you wanted to read it for yourself. www.nytimes.com/2025/07/30/u...

8 months ago 1 0 1 0

In short, all funding is restored and active cases are dismissed in exchange for a $50M commitment to state work force development, new compliance with the administration's discriminatory policies on transgender individuals, and a slew of "anti-DEI" admissions policies. No admission of any wrongdoing.

8 months ago 1 0 1 0

Extremely disappointed in my alma mater, who have chosen to fold without a fight and endanger the most vulnerable members of our community instead of standing up for them.

Spineless and shameful.

8 months ago 7 2 1 0

Beyond the Human Genome Project: The Age of Complete Human Genome Sequences and Pangenome References | Annual Reviews
The Human Genome Project was an enormous accomplishment, providing a foundation for countless explorations into the genetics and genomics of the human species. Yet for many years, the human genome ref...

Getting into computational biology this summer? 🏖️ 📖 Check out “Beyond the Human Genome Project: The Age of Complete Human Genome Sequences and Pangenome References” by @arun-das.bsky.social, @mikeschatz.bsky.social, and more for a great introduction to the field:

9 months ago 20 5 0 0

Partitioned Multi-MUM finding for scalable pangenomics
Pangenome collections are growing to hundreds of high-quality genomes. This necessitates scalable methods for constructing pangenome alignments that can incorporate newly-sequenced assemblies. We prev...

Excited to share a new update to Mumemto, scaling MUM and conserved element finding to any size pangenome! Preprint out now w/ @benlangmead.bsky.social.
Mumemto scales to the new HPRC v2 release and beyond, and can merge in future assemblies without any recomputation! 1/n

10 months ago 27 15 1 2

Easily the most important thing happening next week.

Come and watch my friend Sara defend her PhD!

11 months ago 5 1 0 0

Thank you! It was awesome to talk to you too, and to learn about all the cool data and insights from your project!

11 months ago 0 0 0 0

Really cool work from @arun-das.bsky.social on recovering sequence from unmapped reads (even with T2T reference or HPRC pangenomes!). Can recover a decent amount of sequence per individual using these approaches. Check it out!

11 months ago 4 3 0 0

@arun-das.bsky.social's thesis research demonstrates that short-read mapping-based approaches, even using complete linear (T2T-CHM13) and pangenome (HPRC) references, miss a lot of variation that can be recovered from unmapped reads.

11 months ago 6 1 0 0

GitHub - arun96/SouthAsianGenomeDiversity
Repository containing all links and materials relevant to our work on investigating the diversity in and the representation of South Asians in genomic datasets.

The homepage for this work is here: github.com/arun96/South...

This analysis can be replicated on any population of your choosing, and the WDL scripts used to run the various stages (as well as any other analysis details) can all be found on that page and in our pre-print.

11 months ago 1 0 0 0
