🎓 Looking for PhD students, postdocs & interns!
I’m recruiting for my new lab at NUS School of Computing, focusing on generative modeling, reasoning, and tractable inference.
💡 Interested? Learn more here: liuanji.github.io
🗓️ PhD application deadline: June 15, 2025
Posts by Benjie Wang
What happens if we tokenize cat as [ca, t] rather than [cat]?
LLMs are trained on just one tokenization per word, but they still understand alternative tokenizations. We show that this can be exploited to bypass safety filters without changing the text itself.
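The idea can be illustrated with a toy sketch (hypothetical vocabulary, not the paper's actual tokenizer): the same surface string maps to different token-ID sequences depending on how it is split.

```python
# Toy illustration with a hypothetical 3-entry vocabulary: the same
# string admits multiple tokenizations with different token IDs.
vocab = {"cat": 0, "ca": 1, "t": 2}

def encode(tokens):
    """Map a tokenization (list of token strings) to token IDs."""
    return [vocab[tok] for tok in tokens]

canonical = encode(["cat"])        # the tokenization seen in training
alternative = encode(["ca", "t"])  # an alternative spelling of the same text

# Both decode to exactly the same string...
assert "".join(["cat"]) == "".join(["ca", "t"])
# ...but the model receives different input IDs.
assert canonical != alternative
```

A safety filter keyed to the canonical token sequence never sees the alternative IDs, even though the decoded text is identical.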
#AI #LLMs #tokenization #alignment
Also check out the awesome paper "Sum of Squares Circuits" (arxiv.org/pdf/2408.11778) by @loreloc_, Stefan Mengel, and @tetraduzione, which concurrently showed the separation between monotone and squared circuits. Also presented at AAAI 2025 today, poster #840!
Inception PCs subsume both monotone and squared PCs, and are strictly more expressive than either. We show this leads to improved downstream modeling performance when normalizing for FLOPs:
To overcome these limitations, we propose Inception PCs, a novel tractable probabilistic model representing a deep *sum-of-square-of-sums*.
Inception PCs explicitly introduce two types of latent variables into the circuit for the mixtures encoded at sum nodes.
We show that the reverse also holds (!!): some tractable distributions expressed as monotone circuits cannot be compactly expressed as a square.
On the other hand, squared circuits (arxiv.org/abs/2310.00724) allow use of arbitrary real parameters by *squaring* the circuit output. It was previously proven that squared circuits can be exponentially more expressive than monotone circuits!
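A minimal sketch of the squaring trick (toy unnormalized example with made-up numbers, not the paper's construction): arbitrary real weights can make the circuit output negative, but squaring restores non-negativity.

```python
import numpy as np

# Sketch: a squared circuit keeps its output non-negative by squaring
# a circuit with arbitrary real (possibly negative) weights.
weights = np.array([0.8, -1.3, 0.4])  # real-valued sum weights

def leaves(x):
    # Three unnormalized Gaussian-shaped leaves over one variable x.
    means = np.array([-1.0, 0.0, 2.0])
    return np.exp(-0.5 * (x - means) ** 2)

def squared_circuit(x):
    c = weights @ leaves(x)  # inner circuit: may be negative...
    return c ** 2            # ...but its square never is

# Negative weights enable subtraction, carving out low-density regions
# that a purely additive (monotone) mixture cannot represent as compactly.
assert all(squared_circuit(x) >= 0 for x in np.linspace(-3, 3, 50))
```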
Probabilistic circuits are deep *tractable* probabilistic models that allow efficient and exact computation of marginals.
Traditionally, monotone circuits enforce non-negativity by using non-negative weights.
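As a minimal sketch (toy one-layer example with made-up numbers), here is the simplest monotone circuit: a mixture whose non-negative sum weights and non-negative leaf densities guarantee a non-negative output everywhere.

```python
import numpy as np

# Toy monotone circuit: a sum node with non-negative weights over
# non-negative leaf densities. Non-negativity of the output is
# guaranteed by construction.
weights = np.array([0.2, 0.5, 0.3])  # non-negative, sums to 1

def leaf_densities(x):
    # Three unit-variance Gaussian leaves over a single variable x.
    means = np.array([-1.0, 0.0, 2.0])
    return np.exp(-0.5 * (x - means) ** 2) / np.sqrt(2 * np.pi)

def monotone_circuit(x):
    # Sum node: weighted combination of leaf densities.
    return weights @ leaf_densities(x)

rng = np.random.default_rng(0)
assert all(monotone_circuit(x) >= 0 for x in rng.normal(size=100))
```

Deep monotone circuits stack such sum and product nodes, but the same argument applies layer by layer.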
Paper: arxiv.org/abs/2408.00876
Circuits are generative models that use sum-product computation graphs to model probability densities. But how do we ensure the non-negativity of the output?
Check out our poster "On the Relationship between Monotone and Squared Probabilistic Circuits" at AAAI 2025 **today**: 12:30pm-2:30pm, #841.
Want to turn your state-of-the-art diffusion models into ultra-fast few-step generators? 🚀
Learn how to optimize your time discretization strategy—in just ~10 minutes! ⏳✨
Check out how it's done in our Oral paper at ICLR 2025 👇
If you are interested in doing a #PhD with me at Imperial College London and qualify as a home student, please reach out (before end of 2024)! Potential topics: spatial statistics, applied deep generative models, probabilistic programming and more.
Thanks Devendra!
Thanks to my amazing co-authors Denis Mauá, @yjchoi1.bsky.social, @guyvdb.bsky.social. Hope to see you at the poster session!
Tractability results on case studies
Along the way we also show a bunch of other cool results, like:
- More efficient algorithms for causal inference on circuits
- New circuit properties
- Separation/hardness results
Table depicting the atlas of tractability conditions
Building upon the prior PC atlas (proceedings.neurips.cc/paper_files/... ), our algebraic atlas provides a comprehensive approach for deriving **efficient algorithms** and **tractability conditions** for arbitrary compositional queries.
Try our atlas the next time you come across a new query!
PASP query as a composition
Just as circuits serve as a unifying representation of models, we show how you can express many queries as compositions of just a few basic operations: aggregation (marginalization, max, etc.), product, and elementwise mappings.
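These three operations can be sketched on an explicit joint table (toy example with made-up numbers, not the paper's circuit algorithms):

```python
import numpy as np

# Toy joint distribution p(x, y): rows index x, columns index y.
p = np.array([[0.1, 0.2],
              [0.3, 0.4]])

# Aggregation: marginalization is sum-aggregation over an axis...
p_x = p.sum(axis=1)      # p(x)
# ...while MAP-style queries swap in max-aggregation.
max_y = p.max(axis=0)    # max_x p(x, y)

# Product: combine with another factor q(y), broadcast over y.
q = np.array([0.6, 0.4])
joint = p * q            # p(x, y) * q(y)

# Elementwise mapping: e.g. log, for log-probability queries.
log_p_x = np.log(p_x)
```

Complex queries arise by chaining these primitives; the atlas tells you when the chain stays tractable once the tables are replaced by circuits.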
Illustration of a Probabilistic Circuit
Circuits are a unifying representation of probability distributions as a computation graph of sums and products. Here we consider the more general algebraic circuits, where sum/product is replaced with a semiring operation (think e.g. OR and AND for Boolean circuits).
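A minimal sketch of the semiring view (hypothetical mini-implementation, made-up circuit and numbers): the same circuit structure is evaluated under different (plus, times) pairs.

```python
from dataclasses import dataclass

# A semiring is just a pair of operations standing in for sum/product.
@dataclass
class Semiring:
    plus: callable
    times: callable

sum_product = Semiring(plus=lambda a, b: a + b, times=lambda a, b: a * b)
max_product = Semiring(plus=max, times=lambda a, b: a * b)
boolean = Semiring(plus=lambda a, b: a or b, times=lambda a, b: a and b)

def evaluate(node, sr, leaf_vals):
    """node is ('leaf', name) or ('sum'|'prod', [children])."""
    kind, arg = node
    if kind == "leaf":
        return leaf_vals[arg]
    vals = [evaluate(child, sr, leaf_vals) for child in arg]
    op = sr.plus if kind == "sum" else sr.times
    out = vals[0]
    for v in vals[1:]:
        out = op(out, v)
    return out

# One structure, three semantics: (a * b) + c under each semiring.
circ = ("sum", [("prod", [("leaf", "a"), ("leaf", "b")]), ("leaf", "c")])
print(evaluate(circ, sum_product, {"a": 0.5, "b": 0.4, "c": 0.3}))  # 0.5
print(evaluate(circ, max_product, {"a": 0.5, "b": 0.4, "c": 0.3}))  # 0.3
print(evaluate(circ, boolean, {"a": True, "b": False, "c": True}))  # True
```

Swapping the semiring changes the query (likelihood, MAP, satisfiability) without changing the circuit.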
You have some model/knowledge (e.g. Bayes Net, Probabilistic Circuit, Probabilistic/Logic Program, DB) and some query (e.g. MAP, Causal Adjustment) you want to ask. When can you compute this efficiently?
Find out @ NeurIPS today in Poster Session 6 East, #3801.
Paper: arxiv.org/abs/2412.05481
Hi! I work on probabilistic ML & tractable models.