Almost 5 years in the making... "Hyperparameter Optimization in Machine Learning" is finally out! 📘
We designed this monograph to be self-contained, covering: Grid, Random & Quasi-random search, Bayesian & Multi-fidelity optimization, Gradient-based methods, Meta-learning.
arxiv.org/abs/2410.22854
Posts by Mathias Niepert
🚨 New preprint: How well do universal ML potentials perform in biomolecular simulations under realistic conditions?
There's growing excitement around ML potentials trained on large datasets.
But do they deliver in simulations of biomolecular systems?
It’s not so clear. 🧵
1/
Democratic oversight of AI and an assessment of its benefits and dangers are extremely important. Politicizing technologies, and the hostility toward technology that comes with it, is an extremely bad idea.
Instead of bold claims about "fascistoid AI" and eugenics, we need empirical research: What do AI researchers and companies really think? How can misuse be effectively limited? Declaring individual opinions such as Musk's to be the mainstream only creates ideological caricatures.
Anji is an amazing mentor and colleague. If I could go for another PhD in CS I would apply!
🚨 ICLR poster in 1.5 hours, presented by @danielmusekamp.bsky.social:
Can active learning help to generate better datasets for neural PDE solvers?
We introduce a new benchmark to find out!
Featuring 6 PDEs, 6 AL methods, 3 architectures and many ablations - transferability, speed, etc.!
Authors: Marimuthu Kalimuthu, @dholzmueller.bsky.social, @mniepert.bsky.social
Full text: openreview.net/forum?id=OCM...
The slides for my lectures on (Bayesian) Active Learning, Information Theory, and Uncertainty are online now 🥳 They cover quite a bit from basic information theory to some recent papers:
blackhc.github.io/balitu/
and I'll try to add proper course notes over time 🤗
[9/n] Beyond Image Generation
LD3 can be applied to diffusion models in other domains, such as molecular docking.
Want to turn your state-of-the-art diffusion models into ultra-fast few-step generators? 🚀
Learn how to optimize your time discretization strategy—in just ~10 minutes! ⏳✨
Check out how it's done in our Oral paper at ICLR 2025 👇
Welcome to our Bluesky account! 🦋
We're excited to announce the ComBayNS workshop: Combining Bayesian & Neural Approaches for Structured Data 🌐
Submit your paper and join us in Rome for #IJCNN2025! 🇮🇹
📅 Papers Due: March 20th, 2025 📜
Webpage: combayns2025.github.io
🚀 Exciting news! Our paper "Learning to Discretize Diffusion ODEs" has been accepted as an Oral at #ICLR2025! 🎉
[1/n]
We propose LD3, a lightweight framework that learns the optimal time discretization for sampling from pre-trained Diffusion Probabilistic Models (DPMs).
Very excited to announce the Neurosymbolic Generative Models special track at NeSy 2025! Looking forward to all your submissions!
Catch my poster tomorrow at the NeurIPS MLSB Workshop! We present a simple (yet effective 😁) multimodal Transformer for molecules, supporting multiple 3D conformations & showing promise for transfer learning.
Interested in molecular representation learning? Let’s chat 👋!
We will run out of data for pretraining and see diminishing returns. In many application domains, such as the sciences, we also have to be very careful about which data we pretrain on for pretraining to be effective. It is important to adaptively generate new data from physical simulators. Excited about the work below.
I'll present our paper in the afternoon poster session, 4:30 pm to 7:30 pm, in East Exhibit Hall A-C, poster 3304!
Neural surrogates can accelerate PDE solving but need expensive ground-truth training data. Can we reduce the training data size with active learning (AL)? In our NeurIPS D3S3 poster, we introduce AL4PDE, an extensible AL benchmark for autoregressive neural PDE solvers. 🧵
Join us today at #NeurIPS2024 for our poster presentation:
Higher-Rank Irreducible Cartesian Tensors for Equivariant Message Passing
🗓️ When: Wed, Dec 11, 11 a.m. – 2 p.m. PST
📍 Where: East Exhibit Hall A-C, Poster #4107
#MachineLearning #InteratomicPotentials #Equivariance #GraphNeuralNetworks
"Transferability of atom-based neural networks" authored by @januseriksen.bsky.social (thanks for publishing with us, amazing work!) is now out as part of the #QuantumChemistry and #ArtificialIntelligence focus collection #MachineLearningScienceandTechnology. Link: iopscience.iop.org/article/10.1...
1/6 We're excited to share our #NeurIPS2024 paper: Probabilistic Graph Rewiring via Virtual Nodes! It addresses key challenges in GNNs, such as over-squashing and under-reaching, while reducing reliance on heuristic rewiring. w/ Chendi Qian, @christophermorris.bsky.social, @mniepert.bsky.social 🧵
New #compchem paper out in MLST. We study the transferability of both invariant and equivariant neural networks when training these either exclusively on total molecular energies or in combination with data from different atomic partitioning schemes:
iopscience.iop.org/article/10.1...
You should take a look at this if you want to know how to use Cartesian (instead of spherical) tensors for building equivariant MLIPs.
📣 Can we go beyond state-of-the-art message-passing models based on spherical tensors such as #MACE and #NequIP?
Our #NeurIPS2024 paper explores higher-rank irreducible Cartesian tensors to design equivariant #MLIPs.
Paper: arxiv.org/abs/2405.14253
Code: github.com/nec-research...
We analyzed the behavior of Gumbel-softmax in complex stochastic computation graphs. The difficulties come from a combination of vanishing gradients and a tendency to fall into poor local minima that underutilize the available categories. We also have some ideas for improvements. proceedings.neurips.cc/paper/2021/h...
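For readers unfamiliar with the relaxation, here is a minimal sketch of drawing a single Gumbel-softmax sample in plain Python. It is an illustration only, not the code from the paper: the function name `gumbel_softmax_sample`, the example logits, and the two temperatures are assumptions chosen for the example.

```python
import math
import random


def gumbel_softmax_sample(logits, tau, rng):
    """Draw one relaxed one-hot sample: softmax((logits + Gumbel noise) / tau).

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise via inverse transform; clamp u away from 0.
        u = max(rng.random(), 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)        # fixed seed so the sketch is reproducible
logits = [1.0, 0.5, -1.0]     # assumed unnormalized log-probabilities
soft = gumbel_softmax_sample(logits, tau=5.0, rng=rng)    # high temperature
sharp = gumbel_softmax_sample(logits, tau=0.05, rng=rng)  # low temperature
print(soft)
print(sharp)
```

Lower temperatures push samples toward one-hot vectors, while higher temperatures give smoother ones. Each returned vector is a point on the probability simplex.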
@ropeharz.bsky.social forced me to do this starter pack on #tractable #probabilistic modeling and #reasoning in #AI and #ML
please write below if you want to be added (and sorry if I did not find you from the beginning).
go.bsky.app/DhVNyz5
Amazing opportunity for #Neurosymbolic folks! 🚨🚨🚨
We are looking for a Tenure Track Prof for the 🇦🇹 #FWF Cluster of Excellence Bilateral AI (think #NeSy ++, see www.bilateral-ai.net). A nice starting package for fully funded PhDs is included.
jobs.tugraz.at/en/jobs/226f...
🙋♂️
I haven’t read it carefully, but +1 to work like the one below. It mentions learning artifacts from discreteness. We saw some things like that in this paper, where a bad integration of the true Hamiltonian did worse than a learned model (that absorbed artifacts).
arxiv.org/abs/1909.12790
🙋♂️