Posts by Jeremias Sulam
Deadline for CPAL coming up on Dec 5! Submit your best work on Parsimony and Learning and come join us in Tübingen in March!
cpal.cc
The Biomedical Engineering Department at
@JohnsHopkins
is hiring! Do you work on data science and machine learning for biomedical problems? Consider applying - deadline for full consideration *Dec 5th*
www.bme.jhu.edu/careers-indu...
Many more details on derivation, intuition, implementation and proofs can be found in our paper arxiv.org/pdf/2507.08956 with the amazing Zhenghan Fang, Sam Buchanan and Mateo Diaz @jhu.edu @hopkinsdsai.bsky.social
2) In theory, we show that prox-diff sampling requires only O(d/sqrt(eps)) steps to produce a distribution epsilon-away (in KL) from the target one (assuming an oracle prox), faster than the score version (O(d/eps), resp.). Technical assumptions differ across papers though, so exact comparison is hard (10/n)
1) In practice, ProxDM can provide samples from the data distribution much faster than comparable score-based methods like DDPM (Ho et al, 2020), and is even competitive with their ODE alternatives (which are much faster) (9/n)
In this way, we generalize the Proximal Matching Loss from (Fang et al, 2024) to learn time-specific proximal operators for the densities at each discrete time. The result is Proximal Diffusion Models: sampling by using proximal operators instead of the score. This has 2 main advantages: (8/n)
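A toy sketch of what "sampling with proximals instead of the score" can look like (this is an illustration under simplifying assumptions, not the paper's exact scheme): for a standard normal target, f(x) = -log p(x) = x²/2 + const has a closed-form prox, so a backward-style step is just noise followed by a prox evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)

# For f(x) = x^2 / 2 (negative log-density of N(0, 1)),
# prox_{h f}(v) = argmin_x x^2/2 + (x - v)^2 / (2h) = v / (1 + h).
def prox_f(v, h):
    return v / (1.0 + h)

h = 0.01
x = rng.normal(5.0, 1.0, size=10_000)   # start far from the target
for _ in range(2_000):
    # implicit (backward) step: inject Gaussian noise, then apply the prox
    x = prox_f(x + np.sqrt(2 * h) * rng.normal(size=x.shape), h)

print(x.mean(), x.std())  # ≈ 0 and ≈ 1 for the N(0, 1) target
```

In ProxDM the prox of the (time-dependent) log-densities is of course not available in closed form, which is exactly what the learned, time-specific proximal networks replace.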
So, in order to implement a Proximal/backward version of diffusion models, we need a (cheap!) way of solving this optimization problem, i.e. computing the proximal of the log densities at every single time step. If only there were a way… oh, in come Learned Proximal Networks (7/n)
What are proximal operators? You can think of them as generalizations of projection operators. For a given (proximable) functional \rho(x), its proximal is defined by the solution of a simple optimization problem: (6/n)
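For reference, the standard definition (presumably the equation in the attached image) is:

```latex
\mathrm{prox}_{\rho}(v) \;=\; \arg\min_{x} \;\; \rho(x) \;+\; \tfrac{1}{2}\,\|x - v\|_2^2 .
```

When ρ is the indicator function of a convex set C, this reduces exactly to the Euclidean projection onto C — which is why prox is a generalization of projection.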
Backward discretization of diff. eqs. has long been studied (cf. gradient descent vs the proximal point method). Let’s go ahead and discretize the same SDE, but backwards! One problem: the update is defined implicitly... But it does admit a closed-form expression in terms of proximal operators! (5/n)
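In the deterministic analogy (gradient descent vs the proximal point method), the backward/implicit Euler step on a functional f is only defined implicitly, yet it is exactly a prox evaluation:

```latex
x_{k+1} \;=\; x_k \;-\; h\,\nabla f(x_{k+1})
\quad\Longleftrightarrow\quad
x_{k+1} \;=\; \mathrm{prox}_{h f}(x_k).
```

The stochastic setting in the paper is analogous, with the step applied along the discretized SDE.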
Crucially, this step relies on being able to compute the score function. Luckily, Minimum Mean Squared Error (MMSE) denoisers can do just that (at least asymptotically). But couldn’t there be a different discretization strategy for this SDE, you ask? Great question! Let's go *back*... (4/n)
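The standard route from MMSE denoisers to the score is Tweedie's formula: for Gaussian noise x_t = x_0 + sigma_t z,

```latex
\nabla_{x_t} \log p_t(x_t) \;=\; \frac{\mathbb{E}[x_0 \mid x_t] \;-\; x_t}{\sigma_t^2},
```

so a denoiser trained with an L2 loss approximates the conditional mean, and hence (asymptotically) the score.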
While elegant in continuous time, one needs to discretize the SDE to implement it in practice. In DMs, this has always been done through forward discretization (e.g. Euler–Maruyama), which combines a gradient step on the log-density of the data distribution at the discrete time t (the *score*) and Gaussian noise: (3/n)
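A minimal runnable illustration of this kind of score-based update — a toy Langevin sampler for a 1D Gaussian where the score is known exactly (an illustration only, not the paper's or DDPM's exact scheme):

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    # Exact score of the standard normal target: grad log p(x) = -x.
    return -x

# Euler–Maruyama discretization of the Langevin SDE
# dX = grad log p(X) dt + sqrt(2) dW:
# a gradient step on the log-density (the score) plus Gaussian noise.
h = 0.01                                 # step size
x = rng.normal(5.0, 1.0, size=10_000)    # start far from the target
for _ in range(2_000):
    x = x + h * score(x) + np.sqrt(2 * h) * rng.normal(size=x.shape)

print(x.mean(), x.std())  # ≈ 0 and ≈ 1 for the N(0, 1) target
```

In real diffusion models the score of each intermediate density is of course unknown and must be learned — which is where the next post picks up.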
First, a (very) brief overview of diffusion models (DM). DMs work by simulating a process that converts samples from a simple distribution (random noise) into samples from a target distribution of interest. This process is modeled mathematically with a stochastic differential equation (SDE) (2/n)
Check this out 📢 Score-based diffusion models are powerful—but slow to sample. Could there be something better? Drop the scores, use proximals instead!
We present Proximal Diffusion Models, providing a faster alternative both in theory* and practice. Here’s how it works 🧵(1/n)
Awesome to see our cover in @cp-patterns.bsky.social finally out! And kudos go to Zhenzhen Wang for her massive work on biomarker discovery for breast cancer
www.cell.com/patterns/ful...
Today, on #WomenInScience day, this paper on biomarker discovery for breast cancer, by my amazing student Zhenzhen, has just appeared in @cp-patterns.bsky.social
🎉 Her work shows how to construct fully interpretable biomarkers employing bi-level graph learning! @jhu.edu @hopkinsdsai.bsky.social
Nice write-up by @JHUCompSci about @JacopoTeneggi's work. Punch-line: interpretability of opaque ML models can be posed as hypothesis tests, for which online (efficient) testing procedures can be derived! www.cs.jhu.edu/news/wanna-b...
📣 What should *ML explanations* convey, and how does one report these precisely and rigorously? @neuripsconf.bsky.social
come check
Jacopo Teneggi's work on Testing for Explanations via betting this afternoon! I *bet* you'll like it :) openreview.net/pdf?id=A0HSm... @hopkinsdsai.bsky.social
NeurIPS paper: Excited for our work (with Iuliia Dmitrieva+Sergey Babkin) on
"realSEUDO for real-time calcium imaging analysis"
arxiv.org/abs/2405.15701
to be presented tomorrow (Thu 4:30-7:30PM). realSEUDO is a fully online method for cell detection and activity estimation that runs at >30Hz.