
Posts by paraacha

Also, I’m on the job market and really interested in industry positions around network security, measurement and software engineering (remote or based in Toronto). Please feel free to reach out if you have any leads. Thank you!


Lastly, a shoutout to Andrej Karpathy for the excellent writeup "The Unreasonable Effectiveness of Recurrent Neural Networks". It was a really fascinating read during my undergrad (esp. the part on Linux source code). The idea for this project came into existence when I read about fuzzing in grad school.


For full details, and to understand the security implications of our findings, please find a preprint here (talhaparacha.com/icse_preprin...). We’ll be at #icse2026 in April to discuss results in person. 10/n


We also experiment with the temperature parameter to control the sampling strategy. We find that sampling conservatively (low temperature) makes more instances valid, at the expense of diversity in features, while the other extreme (high temperature) adds too much randomness, which also hurts testing. 9/n
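To illustrate how temperature reshapes a sampling distribution, here is a minimal sketch in plain Python (the `sample_with_temperature` helper is ours for illustration, not part of our pipeline): low temperature concentrates probability on the most likely token, while high temperature flattens it toward uniform.

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Sample an index from raw logits scaled by temperature.
    Low temperature -> near-greedy (conservative) sampling;
    high temperature -> near-uniform, more random sampling."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract max for numeric stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]        # softmax over scaled logits
    r = random.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1
```

At temperature 0.1, almost every draw picks the argmax token; at temperature 100, all tokens are drawn with nearly equal frequency.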


An example of a useful TLS certificate generated by our pipeline is shown here, where the date is set to June 31, 2037 (June has only 30 days!). This certificate is rejected by all TLS libraries except one (another similar discrepancy was due to the use of leap seconds). 8/n
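Python's own `datetime` enforces calendar validity, which gives a cheap way to see why a date like June 31 should never validate (a toy check for illustration, not our validation harness):

```python
from datetime import datetime

def is_real_calendar_date(year, month, day):
    """Return True only for dates that actually exist on the calendar.
    datetime() raises ValueError for impossible dates like June 31."""
    try:
        datetime(year, month, day)
        return True
    except ValueError:
        return False
```

A library that accepts June 31, 2037 in a certificate's validity field is skipping exactly this kind of check.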


and (c) LLMs do not necessarily outperform RNNs in our experiments. We find the last part particularly interesting, given that RNNs have been available for over two decades and require substantially fewer resources. 7/n


(b) several models outperform Transcert, the current state of the art (with the main model used in our paper generating 30% more distinct discrepancies), 6/n


We find that (a) our language models trigger a significant number of unique discrepancies (26 out of a maximum possible 30) -- a discrepancy occurs when one TLS library accepts a certificate while others reject it, indicating a potential bug. 5/n
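The discrepancy criterion can be sketched in a few lines; the validator callables below are hypothetical stand-ins for real TLS libraries:

```python
def find_discrepancies(certs, validators):
    """Flag certs where at least one validator accepts and at least
    one rejects -- the differential-testing signal for a potential bug.
    `validators` maps a library name to an accepts(cert) -> bool callable."""
    flagged = []
    for cert in certs:
        verdicts = {name: accepts(cert) for name, accepts in validators.items()}
        if any(verdicts.values()) and not all(verdicts.values()):
            flagged.append((cert, verdicts))
    return flagged
```

Unanimous accept or unanimous reject tells us nothing; only the split verdicts point at divergent validation logic.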


We train RNNs (small- and medium-sized) and GPTs (fine-tuned and trained from scratch), since it is unclear which approach is better for testing, in contrast to just learning (a point also highlighted by Godefroid et al. in Learn&Fuzz; see the snippet below for a short discussion). 4/n


Our insight is that language models learn a representation for the textual data they are trained on, and that the learned representation is probabilistic and often imperfect, meaning a sampled instance can considerably break expectations (and may thus help in testing). 3/n


We train language models on datasets of real-world TLS certificates to generate synthetic instances for use in differential testing. 2/n
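As a toy illustration of the first preprocessing step such a pipeline needs -- character-level tokenization of PEM-encoded certificate text (the helper names are ours, not from the paper):

```python
def build_char_vocab(corpus):
    """Assign an integer id to every character seen in the corpus,
    e.g. the base64 alphabet plus PEM header/footer characters."""
    chars = sorted(set(corpus))
    stoi = {c: i for i, c in enumerate(chars)}
    itos = {i: c for c, i in stoi.items()}
    return stoi, itos

def encode(text, stoi):
    """Turn text into the id sequence a character-level model trains on."""
    return [stoi[c] for c in text]

def decode(ids, itos):
    """Invert encode(): map ids back to text."""
    return "".join(itos[i] for i in ids)
```

A character-level vocabulary keeps the model agnostic to certificate structure: anything it learns about ASN.1 fields, dates, or extensions comes from the training data alone.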


Super excited to share that I'll present our latest research at ICSE 2026. Our work (co-authors Kyle Posluns, @kevin.borgolte.me, @lindorfer.in, and @proffnes.discuss.systems.ap.brid.gy) explores the use of language models for software testing in TLS certificate validation logic. 1/n
