My work with @marklemley.bsky.social and others on extracting copyrighted books from language models was recently featured in the UK House of Lords Communications and Digital Committee report on AI, copyright, and the creative industries
publications.parliament.uk/pa/ld5901/ld...
Posts by Jaydeep Borkar
So glad to see this work (& other memorization works) being cited in Congressional testimony before the U.S. House Committee on Energy and Commerce. Dr. Jennifer King (Stanford HAI) talks about memorization of personal information & data privacy risks.
More details: hai.stanford.edu/assets/files/t…
I gave a talk at the Google Privacy in ML Seminar last summer on privacy & memorization: "Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training".
It's up on YouTube now if you're interested :)
youtu.be/IzIsHFCqXGo?...
re architecture/training dynamics, this is one of my fav plots, showing how architecture-specific inherent biases influence which examples get memorized
bsky.app/profile/jayd...
Some of these results really changed how I think about memorization. It’s not simply the data: many factors (data properties, architecture, training objectives, capacity, etc.) interplay to determine what exactly gets memorized.
Microsoft Research NYC is hiring a researcher in the space of AI and society!
Paper: arxiv.org/pdf/2601.15394
Joint work with my incredible co-authors: Karan Chadha, Niloofar Mireshghallah, Yuchen Zhang, Irina-Elena Veliche, Archi Mitra, @dasmiq.bsky.social, Zheng Xu, Diego Garcia-Olano!
Finally, we compare soft (logit-level) vs. hard (sequence-level) distillation. Hard KD is often used when teacher logits are inaccessible. We find that while both show similar rates, hard KD is riskier, inheriting 2.7x more memorization from the teacher than soft KD.
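For anyone curious what these two objectives look like, here is a minimal NumPy sketch of soft vs. hard KD (shapes, names, and the epsilon smoothing are illustrative, not from the paper):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis, numerically stabilized."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_kd_loss(student_logits, teacher_logits, T=2.0):
    """Logit-level (soft) KD: KL(teacher || student) over the full
    next-token distribution, averaged over positions."""
    p_t = softmax(teacher_logits, T)
    log_p_s = np.log(softmax(student_logits, T) + 1e-12)
    kl = (p_t * (np.log(p_t + 1e-12) - log_p_s)).sum(axis=-1)
    return float(kl.mean() * T * T)

def hard_kd_loss(student_logits, teacher_token_ids):
    """Sequence-level (hard) KD: cross-entropy on tokens the teacher
    generated, i.e. the teacher's sample becomes a one-hot target."""
    p_s = softmax(student_logits)
    idx = np.arange(len(teacher_token_ids))
    return float(-np.log(p_s[idx, teacher_token_ids] + 1e-12).mean())
```

The one-hot target in `hard_kd_loss` is exactly the cross-entropy shape that, per the thread, pushes the student toward memorizing the teacher's outputs, while `soft_kd_loss` lets the student match a flatter distribution.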
We compute sequence log-prob & avg Shannon entropy. We find cross-entropy pushes the model to overfit on examples it is uncertain about, resulting in forced memorization. In contrast, KD permits it to output a flatter, more uncertain distribution rather than forcing memorization.
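A rough sketch of the entropy side of this measurement, assuming you already have per-position logits from a model (everything below is illustrative):

```python
import numpy as np

def avg_shannon_entropy(logits: np.ndarray) -> float:
    """Average next-token Shannon entropy (nats) over a sequence.
    logits: (seq_len, vocab_size). Flatter distributions -> higher entropy."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(shifted)
    p /= p.sum(axis=-1, keepdims=True)
    per_position = -(p * np.log(p + 1e-12)).sum(axis=-1)
    return float(per_position.mean())

peaked = np.zeros((4, 100))
peaked[:, 0] = 10.0          # near one-hot: low entropy (confident model)
flat = np.zeros((4, 100))    # uniform: maximal entropy (uncertain model)
print(avg_shannon_entropy(peaked), avg_shannon_entropy(flat))
```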
Why does distillation reduce memorization vs. fine-tuning w/ cross-entropy? We hypothesize that this could be due to the difference between the hard targets (one-hot labels) of cross-entropy and the soft targets (full probability distribution) of KL divergence.
We can predict which examples the student will memorize before distillation! By training a log-reg classifier on features like zlib entropy, KLD loss, and PPL, we pre-identify these risks. Removing them from the training data results in a significant reduction in memorization (0.07% -> 0.0004%).
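A toy sketch of this pre-identification idea: fit a logistic-regression classifier on per-example features, then drop examples flagged as high-risk. The features and labels below are synthetic stand-ins for the real zlib/KLD/PPL features and memorization labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example features: [zlib entropy, KD loss, perplexity]
X = rng.normal(size=(500, 3))
# Toy label: 1 = "example would be memorized" (deterministic for illustration)
y = (X[:, 0] + 0.5 * X[:, 2] > 1.0).astype(float)

# Plain logistic regression fit by gradient descent
w, b = np.zeros(3), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted memorization risk
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * (p - y).mean()

risk = 1.0 / (1.0 + np.exp(-(X @ w + b)))
keep = risk < 0.5  # filter out high-risk examples before training the student
print(f"kept {keep.sum()} of {len(keep)} examples")
```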
Are these “easy” examples universal across architectures (Pythia, OLMo-2, Qwen-3)? We observe that while all models prefer memorizing low-entropy data, they don’t **agree** on which examples to memorize. We analyzed cross-model perplexity to decode this selection mechanism.
This leads us to study why certain examples are easier to memorize. Since our data has no duplicates, duplication isn't the cause. In line with prior work, we compute compressibility (zlib entropy) and perplexity, & find that both are highly correlated with these "easy" examples.
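The zlib-compressibility signal is easy to reproduce with the standard library; here's a minimal sketch (the perplexity side needs a trained model, so only the compression ratio is shown):

```python
import zlib

def zlib_ratio(text: str) -> float:
    """Compressed size / raw size: lower means more compressible,
    i.e. lower-entropy, 'easier' text."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)

repetitive = "the cat sat on the mat. " * 20
varied = "Quixotic zephyrs vex jumbled gnomes beyond frazzled acrylic wharves."
print(f"repetitive: {zlib_ratio(repetitive):.2f}, varied: {zlib_ratio(varied):.2f}")
```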
Next, we find that certain examples are consistently memorized across model sizes within a family because they are **inherently easier to memorize**. We find that distilled models preferentially memorize these easy examples (accounting for over 80% of their total memorization).
We find the student recovers 78% of the teacher’s generalization over the baseline (std. fine-tuning) while inheriting only 2% of its memorization. This shows the student learns the teacher’s general capabilities, but rejects the majority of the examples the teacher exclusively memorized.
Excited to share my work at Meta.
Knowledge Distillation has been gaining traction for improving LLM utility. We find that distilled models don't just improve performance; they also memorize significantly less training data than standard fine-tuning (reducing memorization by >50%). 🧵
very cool work!!!
Syntax that spuriously correlates with safe domains can jailbreak LLMs - e.g. below with GPT-4o mini
Our paper (co w/ Vinith Suriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight!
+ @marzyehghassemi.bsky.social, @byron.bsky.social, Levent Sagun
It is PhD application season again 🍂 For those looking to do a PhD in AI, these are some useful resources 🤖:
1. Examples of statements of purpose (SOPs) for computer science PhD programs: cs-sop.org [1/4]
"AI slop" seems to be everywhere, but what exactly makes text feel like "slop"?
In our new work (w/ @tuhinchakr.bsky.social, Diego Garcia-Olano, @byron.bsky.social ) we provide a systematic attempt at measuring AI "slop" in text!
arxiv.org/abs/2509.19163
🧵 (1/7)
After 2 years in press, it's published!
"Talkin' 'Bout AI Generation: Copyright and the Generative-AI Supply Chain" is out in the 72nd volume of the Journal of the Copyright Society
copyrightsociety.org/journal-entr...
written with @katherinelee.bsky.social & @jtlg.bsky.social (2023)
it was soo fun!
Excited to be attending ACL in Vienna next week! I’ll be co-presenting a poster with Niloofar Mireshghallah on our recent PII memorization work on July 29 16:00-17:30 Session 10 Hall 4/5 (& at LLM memorization workshop)!
If you would like to chat memorization/privacy/safety, please reach out :)
Big congratulations!! 🎊
congrats!! 🎊
big thanks to my wonderful co-authors Matthew Jagielski @katherinelee.bsky.social Niloofar Mireshghallah @dasmiq.bsky.social Christopher A. Choquette-Choo!!
Privacy Ripple Effects has been accepted to the Findings of ACL 2025! 🎉
See you in Vienna! #ACL2025
Very excited to be joining Meta GenAI as a Visiting Researcher starting this June in New York City!🗽 I’ll be continuing my work on studying memorization and safety in language models.
If you’re in NYC and would like to hang out, please message me :)
😂😂
I am at CHI this week to present my poster (Framing Health Information: The Impact of Search Methods and Source Types on User Trust and Satisfaction in the Age of LLMs) on Wednesday April 30
CHI Program Link: programs.sigchi.org/chi/2025/pro...
Looking forward to connecting with you all!