Igor Martayan (@imartayan) Bsky

Turns out that the usual NtHash is not as random as one might think?!?! At least not for minimizers.

Seq-hash (and simd-minimizers) already has this fixed by default ;)

github.com/rust-seq/seq...

6 hours ago 8 4 0 0

Breaking ntHash (to better fix it) NtHash is a popular method for hashing k-mers in bioinformatics, yet it has some surprising flaws. In this post, I walk through a few of them, and show that they can arise naturally, without an advers...

New blog post!

I use ntHash all the time to hash k-mers, yet it turns out it has some unexpected flaws (collision propagation, bias on leading zeros...). The good news: each of them can be fixed!

igor.martayan.org/posts/breaki...

12 hours ago 18 8 0 2

The fastest way to match characters on ARM processors? lemire.me/blog/2026/04/19/the-fast...

2 days ago 9 1 0 0

I am starting this post off right in the middle, with a paragraph that comes later:

I think this blog post comes closer to my current thinking on AI than any other I've read: fransskarman.com/im_not_using...

4 days ago 14 3 1 2

Not nearly as polished, but I'm currently writing my thesis and it overlaps many of these topics

5 days ago 3 2 1 0

@ksahlin.bsky.social any news about the Recomb-Seq reviews?

1 week ago 2 1 0 0

I'm not looking forward to a future where all the tools are being vibe-rewritten into languages people don't want to learn. Who will maintain all this? The original maintainers won't. It's not the language they were comfortable with. Does the prompter understand the tool well enough?

1 week ago 20 6 3 1

I could also add some postprocessing to replace @misc by @article for arxiv outputs.

1 week ago 0 0 0 0

Automatically dereference DOI to .bib? Is there an established way to, given a list of DOI identifiers, produce a .bib file containing the citation information?

The original bibtex is actually produced by doi.org itself (see tex.stackexchange.com/questions/68...) but I do some postprocessing to handle special characters and do small modifications.

1 week ago 0 0 1 0

GitHub - imartayan/bibelot: A command-line tool adapted from doi2bib to fetch BibTeX entries from DOIs and more

I kept being rate limited by doi2bib, so I made a small CLI to replace it locally: github.com/imartayan/bi...

It can output bibtex from DOIs and arxiv/biorxiv links, redirect to the published version when it's available and copy the result to your clipboard

1 week ago 16 6 1 0

Modern DRAM is based on a brilliant design from IBM.

But, we're still paying for a latency penalty that's existed since the 60s!

In this video, I'm introducing my research project (Tailslayer) that immensely reduces p99.99 latency on traditional RAM!

2 weeks ago 185 40 3 7

Yeah; I have a bunch of incoherent thoughts about this...

1) maintaining bindings is annoying and purely a service. I already have the rust code and don't need bindings myself.
2) somebody needs to own them. Typically the main dev (me), even though someone else is the user and expert.
(1/10)

3 weeks ago 7 2 1 0

2. I don't mind if people don't want to use Rust but instead of rewriting it from scratch every single time, why don't you write bindings and make them available for everyone? This is both easier and more maintainable, and would actually be helpful for the community.

3 weeks ago 38 3 1 1

1. We keep improving these libraries, fixing bugs and adding new features regularly. Your code will likely be outdated in a few months.

3 weeks ago 19 0 1 0

A quick rant on people vibe-translating our Rust libraries to other languages

This is the second time in a week that I've seen new bioinformatics tools with a vibe-coded translation of our Rust libraries to C/C++.

I have two major issues with that:

3 weeks ago 33 10 2 4

Myloasm, our long-read metagenome assembler, is now published! w/ @mgmarin.bsky.social and @lh3lh3.bsky.social

Very rewarding after > a year of development and countless hours thinking about assembly. Thanks to beta testers, Li lab, and reviewers who gave very helpful feedback.

rdcu.be/famFj

3 weeks ago 98 56 4 1

A run-length-compressed skiplist data structure for dynamic GBWTs supports time and space efficient pangenome operations over syncmers www.biorxiv.org/content/10.64898/2026.03...

3 weeks ago 6 3 0 0

Why phylogenies compress so well: combinatorial guarantees under the Infinite Sites Model www.biorxiv.org/content/10.64898/2026.03...

3 weeks ago 9 6 0 0

Lucas Czech: Fast Iteration of Spaced k-mers https://arxiv.org/abs/2603.25417 https://arxiv.org/pdf/2603.25417 https://arxiv.org/html/2603.25417

3 weeks ago 4 3 0 0

Amazing talk, I didn't know about left-right structures!

3 weeks ago 2 0 0 0

Damn, your swap is as large as my entire RAM!

3 weeks ago 1 0 1 0

Niceee, was waiting for this one, and already used it in a few places 😆

Congrats @imartayan.bsky.social et al :)

4 weeks ago 8 3 0 0

Super Bloom: Fast and precise filter for streaming k-mer queries www.biorxiv.org/content/10.64898/2026.03...

1 month ago 22 13 0 1

Kamila Szewczyk, Sven Rahmann: Hecate: A Modular Genomic Compressor https://arxiv.org/abs/2603.15390 https://arxiv.org/pdf/2603.15390 https://arxiv.org/html/2603.15390

1 month ago 5 4 0 0

Sensitive and scalable metagenomic classification using spaced metamers, reduced alphabets, and syncmers www.biorxiv.org/content/10.64898/2026.03...

1 month ago 13 8 0 1

It's a good day when the first item in your feed is your own work :)

@rickbitloo.bsky.social was annoyed that scanning reads for all 96 rapid kit barcodes is a bottleneck in Barbell, so he made Sassy2: 13x (150bp) to 4.6x (8kbp) faster than v1 by batch-searching patterns, and >100Gbp/s on 16 threads!

1 month ago 24 11 2 0

Call for Papers RECOMB-Seq 2026 Web Page

⏰ Last chance! Register your submission with a placeholder abstract for #RECOMBSeq TODAY. Final paper due March 15, 2026, 23:59 AoE
Submit now 👉 recomb-seq.github.io/seq2026/call...

1 month ago 4 3 0 0

Excited to share this preprint that describes my latest work on using GPUs to accelerate processing of RNA-seq data.

The title says it all: "RNA-seq analysis in seconds using GPUs" now on biorxiv www.biorxiv.org/content/10.6... and github github.com/pachterlab/k...

Figure 1 shows the key result

1 month ago 187 87 6 8

Customizable Contraction Hierarchies -- A Survey This work establishes the technical fundamentals of a well-tuned Customizable Contraction Hierarchies (CCH) implementation that is simple and elegant. We give a detailed overview of the state of the a...

I got nerd-sniped into writing some code for Customizable Contraction Hierarchies, which is an insanely cool and super simple data structure for fast route planning.

Nice survey paper by Michael Zündorf et al: arxiv.org/abs/2502.10519

My unpolished implementation notes: curiouscoding.nl/posts/cch/

1 month ago 3 1 1 0

Deacon can now run in the browser using WebAssembly. Sequence data never leaves your machine. It currently supports FASTA/Q filtering using indexes up to 1GB in size.

Demo: bede.im/deacon

1 month ago 30 10 1 1
