Posts by Ritesh Oedayrajsingh Varma

Soon (tm)!

1 month ago 3 0 2 0

#neverforget

1 month ago 1 0 0 0
From profiling to kernel patch: the journey to an eBPF performance fix | Ritesh Oedayrajsingh Varma
A story about how an innocent profiling session led to a change to the Linux kernel that makes eBPF map-in-map updates much faster.

Oh and

> I don't know much about eBPF

The previous article, which also involved a dive into the kernel, has a lot of info on it:

rovarma.com/articles/fro...

1 month ago 1 0 0 0

Thanks Jaap!

> Do you guys ever do presentations somewhere? Would be cool to hear some fun stories!

You’re kinda looking at it :-)

1 month ago 1 0 1 0

There was this one time where ETW was broken for many people for like an entire year on Windows. Soooo… 😂

At least on Linux we can fix the issues ourselves!

1 month ago 3 0 1 0

It’s a good article!

And in the kernel’s defense… they often don’t really have an alternative to spinlocks, especially in cases like this.

But it definitely goes to show that getting anything to do with spinlocks right is really hard.

1 month ago 1 0 1 0
A tale about fixing eBPF spinlock issues in the Linux kernel | Ritesh Oedayrajsingh Varma
A system freeze led us deep into Linux spinlock internals, where we helped find not one but three bugs in the kernel's resilient locking code used by eBPF.

New article! A user is reporting full system freezes while using Superluminal on Linux. What do you do? Cry? Well, we did a little bit.

But we also dove into the kernel...again, this time fixing several issues in eBPF's spinlock implementation. Read all about it:

rovarma.com/articles/a-t...

1 month ago 40 12 3 4

Somehow missed this latest piece of technical wizardry from Stefan. My first thoughts were “this is awesome, but looks super hard to get into a reliable state”.

But I thought the same thing about Live++ and Stefan knocked it out of the park there. If anybody can make this happen, it’s Stefan :-)

2 months ago 7 0 0 0

We've just released a new Insider update with some much-requested features, like being able to specify env vars when running, auth support for symbol servers, and proper progress reporting for symbol downloads. And of course, many fixes & QoL improvements.

Go check it out!

2 months ago 8 4 0 0

Check out this new article by Jelle about how we stream unsorted data in sorted order to ensure a fixed upper memory bound while processing gigabytes of capture data in Superluminal!

3 months ago 6 0 0 0
From profiling to kernel patch: the journey to an eBPF performance fix | Ritesh Oedayrajsingh Varma
A story about how an innocent profiling session led to a change to the Linux kernel that makes eBPF map-in-map updates much faster.

New article! What do you do when profiling your code shows the slowdown isn't in your code, but deep in the kernel? Why, you grab the kernel source and go spelunking.

How a routine profiling session turned into a Linux kernel patch: rovarma.com/articles/fro...

4 months ago 30 12 1 1

Thanks! We’re not using this, and I don’t think we’d even be able to correctly open captures made with this option currently. Good to know about it!

Re: slowing down the capture, compared to “not doing anything at all”, I can definitely see this being slower.

4 months ago 2 0 0 0

We could, yeah, but that has the disadvantage that other tools wouldn’t be able to open Superluminal captures anymore. Could still be worth it as an option as you say.

For the Linux version we’re doing everything ourselves, and captures there are *much* smaller as a result.

4 months ago 1 0 1 0

> if you're interested

definitely!

4 months ago 0 0 1 0

The ETW file itself is just a straight dump of the raw data without further processing. The goal there is to keep the overhead of capturing low, which means doing as little as possible to log data. Even compression doesn’t happen until after the capture is done.
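
The "log raw now, compress later" idea can be sketched in a few lines. This is only an illustration of the general pattern, not ETW's file format or API: the record contents, the `capture` and `compress_after_capture` helpers, and the use of zlib are all assumptions made for the example.

```python
import io
import zlib


def capture(records, sink):
    """Hot path: append each raw record as-is, with no parsing or compression."""
    for record in records:
        sink.write(record)


def compress_after_capture(raw: bytes) -> bytes:
    """Cold path: runs once, after capturing has stopped."""
    return zlib.compress(raw, 6)


# Hypothetical records standing in for raw trace events.
records = [b"sample-1\n", b"sample-2\n"] * 1000

raw_sink = io.BytesIO()
capture(records, raw_sink)

raw = raw_sink.getvalue()
packed = compress_after_capture(raw)
print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
```

The point of the split is that the expensive step never runs while the profiled program is being measured.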

4 months ago 1 0 1 0

My co-founder Jelle wrote an article about a custom data structure he came up with for Superluminal to efficiently store millions of callstacks.

Check it out!

4 months ago 7 2 0 0
Optimizing libdwarf .eh_frame enumeration | Ritesh Oedayrajsingh Varma
For the Linux version of Superluminal we rely on unwind information stored in the .eh_frame section in a binary to perform stack unwinding. We’ll go over optimizations we made to libdwarf that greatly...

I've been wanting to start a blog for a while, and finally decided to bite the bullet.

The first article of hopefully many more to come is about, you guessed it, profiling & optimization.

RTs appreciated!

rovarma.com/articles/opt...

4 months ago 12 7 0 1

Great post!

Including a sneak peek of a certain profiler on a platform that is very much not Windows ;-)

6 months ago 3 1 0 1
Speeding up the Unreal Editor launch by … not opening 5500 files?
In my last article I wrote about some tooltip optimization to reduce the start time of the Unreal Editor by 2-5 seconds. Turns out people do really care about their editor start time. So much that …

It's understandable that Unreal needs to touch a lot of files when starting the editor. But what if I told you that >5500 of those files are not needed for the editor to start at all and are just adding seconds to the editor launch time?
(Fix included!)
#u5 #gamedev
larstofus.com/2025/09/27/s...

6 months ago 37 12 0 0

to be fair, you could have seen this coming from the “runs inside the terminal” as if that is something positive :p

6 months ago 3 0 0 0

Nice investigation! Sampling profilers > instrumenting profilers when you need to see what’s happening in code you *didn’t* write. Great example of the right tool for the job!

8 months ago 5 1 2 0
Profiling without Source code – how I diagnosed Trackmania stuttering
A very common side effect of working as a programmer is the constant frustration of not having source code access to all the software you use. Bugs, problems or missing features in your own work ca…

My new blog post is out, and it's a bit different from usual: fixing stutters in your own code is hard enough, but this time I try to fix performance issues in a closed-source game. No source code or debug symbols, but a lot of guesswork. larstofus.com/2025/07/27/p...
#gamedev
#Trackmania

8 months ago 104 20 3 1

Days since I've had to waste time debugging obscure issues caused by Linux's deranged shared library model: 0

"Nice that you're linking to a static library, but there's a shared lib loaded with the same symbol name in it, so I'm gonna use that one instead, ok?"

9 months ago 4 0 0 0
[Image]

tfw you're collateral damage in the Great AI Wars

9 months ago 0 0 0 0

This was a great example of "how hard can it be?". Well, 4 days of full-time work fighting with Qt, that's how hard.

So glad you like it! ;-)

9 months ago 4 0 1 0

It turns out when you’re writing code that runs on each sample interval to collect stacks, you don’t have a lot of time if you’re targeting high sampling rates :-)
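
To put rough numbers on that: the time between two samples is simply the inverse of the sampling rate, and the stack collection code has to fit well inside it. A minimal sketch, where the rates are illustrative assumptions and not Superluminal's actual settings:

```python
def per_sample_budget_us(sampling_rate_hz: float) -> float:
    """Time between two consecutive samples, in microseconds.

    This is only an upper bound: in practice the sampling code can use
    just a small fraction of it without distorting the profiled program.
    """
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return 1_000_000 / sampling_rate_hz


# Illustrative rates only.
for rate_hz in (1_000, 10_000, 20_000):
    print(f"{rate_hz} Hz: {per_sample_budget_us(rate_hz)} us between samples")
```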

9 months ago 2 0 0 0
[Images: before/after code comparison]

We've been micro-optimizing our eBPF code, and it reminds me of the SPU era a bit. The compiler/JIT is so basic that old tricks are useful again. Regular C turns into atrocious ASM, but writing C like it's ASM fixes it. I'm kinda loving it.

It's all stuff like this (before/after):

9 months ago 7 0 1 0

Solved it by the ancient tradition of Just Reading The Code.

Turns out continuously taking the RCU lock by inserting thousands of elements into a BPF_MAP_TYPE_LRU_HASH from within an NMI is Not Good for your system.

Rolled our own (simpler) version directly in eBPF.

9 months ago 4 1 0 0

How does one diagnose the entire Linux system locking up when using a particular eBPF data structure? Are there any post-mortem logs to look at? dmesg is only about the current session.

Asking for a friend.

9 months ago 2 0 0 1

In our case we’re looking at optimizing the perf of a single program, so an overview of which programs are running and how much time they cost is not that useful; we want to know which of the thousands of lines of code in *our* programs we need to focus on :-)

10 months ago 1 0 0 0