I am not particularly bullish, but I do think that Google will eventually pull ahead. They are the only player that is not entirely dependent on large cash injections, since they can fund development from their other revenue streams, and I think the other companies will eventually no longer be able to satisfy their investors.
Posts by Hidde Fokkema
Go check out the package I wrote for the robust regression and estimation procedure that Mathieu Gerber and @pierrealquier.bsky.social developed based on the Maximum Mean Discrepancy principle!
🚨2 PhD positions with me @amlab.bsky.social on learning causally grounded concepts 🚨
Are you interested in improving the #interpretability #robustness and #safety of AI by integrating #causal reasoning? Join us in beautiful Amsterdam 🇳🇱🌷🚲
Deadline: 20 April
www.academictransfer.com/en/jobs/3593...
Excited for Damien’s seminar talk this Thursday!🚀
Steering is an exciting area in interpretability—but how strongly should we steer?
Damien will present a theory of steering strength: choosing the right magnitude of representation change—not too weak, not too strong
tverven.github.io/tiai-seminar/
At #NeurIPS in San Diego this week? Interested in XAI, causality, or performative prediction? Come visit our poster!
💬 Performative Validity of Recourse Explanations
📆 Wednesday, 4.30 pm, Poster Session 2
w/ Hidde Fokkema, Timo Freiesleben, Celestine Mendler-Dünner, Ulrike von Luxburg
When reading a large literature, it really helps to have strongly held opinions that help you categorize papers.
One of mine for explainable AI is that methods need to address a fundamental limit in how much information can be communicated.
Blog post: www.timvanerven.nl/blog/xai-com... (no math)
I did some googling and this article has a surprisingly nice and pedagogical discussion on this, with a similar conclusion to your idea.
tinyurl.com/52a3whac
And I found that I missed the opportunity to make the joke that the posterior of the simpler model is "Sharper", keeping the razor theme.
(2/2) If we see the complicated model and the simple model as two different hypothesis classes, with two separate priors, then the posterior for the more complicated class will be flatter than the posterior of the simple class, which is what you want, I think.
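A tiny numerical sketch of this flattening effect, with two hypothesis classes for a coin's bias (the classes, priors, and flips are all made up for illustration):

```python
import numpy as np

# Simple class: one hypothesis. Complicated class: nine hypotheses.
# Uniform prior within each class.
simple = np.array([0.5])
complicated = np.linspace(0.1, 0.9, 9)

flips = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # hypothetical coin flips

def posterior(thetas, flips):
    # Likelihood of the flips under each bias, times a uniform prior, normalized.
    lik = np.array([np.prod([t if x else 1 - t for x in flips]) for t in thetas])
    return lik / lik.sum()

# The simple class concentrates all mass on its single hypothesis, while the
# complicated class spreads mass over nine: its posterior is flatter.
print(posterior(simple, flips))             # [1.]
print(posterior(complicated, flips).max())  # strictly below 1
```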
(1/2) Fair point. What I meant is that anything Bayesian is prior-related, so with the correct prior you could at least recover Ockham's razor, but not really derive it. My thinking is a bit different though, as in my points above the hypothesis classes are the same.
In your idea, if ..
(7/n=7) So, in the end, you can get Ockham's razor if your prior is that simple explanations (read: explanations with fewer parameters) are more likely than complicated ones. For binary parameters you could write the prior down explicitly. For real-valued parameters this becomes impossible (I am guessing)
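For the binary case, such a prior can indeed be written down explicitly. A sketch with length-3 binary parameter vectors, where the 2^(-k) weighting by number of active parameters is my own illustrative choice:

```python
from itertools import product

# Enumerate all binary parameter vectors of length 3 and weight each one
# by 2^(-number of active parameters), then normalize into a prior.
vectors = list(product([0, 1], repeat=3))
weights = {v: 2.0 ** (-sum(v)) for v in vectors}
Z = sum(weights.values())
prior = {v: w / Z for v, w in weights.items()}

# Simpler explanations get more mass: the empty model (0,0,0) is
# eight times as likely a priori as the full model (1,1,1).
print(prior[(0, 0, 0)] / prior[(1, 1, 1)])  # → 8.0
```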
(6/n) Now if you really want to derive Ockham's razor, in the sense of minimum assumptions, or really the number of parameters, you would need a prior distribution that assigns more probability mass to simple models.
(5/n) Similarly, if β ~ Laplace(0, b) with scale parameter b, then you get the Lasso objective
min ||y - <β, x>||^2 + λ||β||_1
where we now have the 1-norm as regularisation penalty. This one has the added benefit that irrelevant parameters are set exactly to 0, which resembles the original Ockham's razor principle more closely
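A numpy-only sketch of this zeroing effect; the toy data and the ISTA solver are my own illustration, not from the thread (the third feature has true coefficient 0):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.0]) + 0.1 * rng.normal(size=100)

def lasso_ista(X, y, lam, steps=5000):
    """Minimise ||y - X beta||^2 + lam * ||beta||_1 by proximal gradient (ISTA)."""
    beta = np.zeros(X.shape[1])
    step = 0.5 / np.linalg.norm(X.T @ X, 2)  # safe step: 1 / Lipschitz constant
    for _ in range(steps):
        grad = 2 * X.T @ (X @ beta - y)
        z = beta - step * grad
        # Soft-thresholding: the proximal operator of the 1-norm.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

beta = lasso_ista(X, y, lam=50.0)
print(beta)  # the irrelevant third coefficient is set exactly to 0
```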
(4/n) writing out the posterior density and maximising it over the parameters (maximum a posteriori inference). How much you regularise is determined by σ, and there is a direct relation between σ and λ.
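To sketch that relation explicitly (σ_noise, the observation noise level, is an extra symbol not in the thread): with y = <β, x> + noise, noise ~ N(0, σ_noise^2), and β ~ N(0, σ^2), the negative log-posterior is, up to constants,

(1 / (2 σ_noise^2)) ||y - <β, x>||^2 + (1 / (2 σ^2)) ||β||^2

so minimising it is the ridge problem above with λ = σ_noise^2 / σ^2.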
(3/n) Let's say we consider as possible models all linear models, and as complexity measure the Euclidean norm of the parameters. (This is ridge regression.) Then we would retrieve the optimisation problem:
min ||y - <β, x>||^2 + λ||β||^2
By assuming that β ~ N(0, σ^2) and ...
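A small numerical sketch of this MAP/ridge connection, with made-up toy data (the Gaussian prior enters through the λ||β||^2 term, which pulls the solution toward 0):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.0]) + 0.1 * rng.normal(size=50)

def ridge(X, y, lam):
    # Closed-form minimiser of ||y - X beta||^2 + lam * ||beta||^2.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

beta_ols = ridge(X, y, 0.0)   # no regularisation (flat prior)
beta_map = ridge(X, y, 10.0)  # stronger prior pull toward 0
print(np.linalg.norm(beta_map) < np.linalg.norm(beta_ols))  # → True
```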
(2/n) In particular, this would give you the model with the fewest assumptions: if two models explain the data equally well but one makes fewer assumptions, and the number of assumptions is the complexity measure you use, the simpler one wins.
Sure! Here are some thoughts
(1/n) I would see Ockham's razor as the following optimisation problem:
min Error(data, model) + Complexity(model)
Where you minimise over all models.
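This view can be sketched directly in code; the polynomial-degree example and all names below are illustrative, not from the post:

```python
import numpy as np

def ockham_select(models, data, error, complexity, lam=1.0):
    """Pick the model minimising Error(data, model) + lam * Complexity(model)."""
    return min(models, key=lambda m: error(data, m) + lam * complexity(m))

# Toy example: choose a polynomial degree for noisy linear data.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y = 2 * x + 0.05 * rng.normal(size=30)

def fit_error(data, degree):
    xs, ys = data
    coeffs = np.polyfit(xs, ys, degree)
    return float(np.sum((np.polyval(coeffs, xs) - ys) ** 2))

# Higher degrees barely reduce the error, so the complexity term
# makes the simple degree-1 model win.
best = ockham_select(range(1, 8), (x, y), fit_error,
                     complexity=lambda d: d, lam=0.2)
print(best)  # → 1
```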
If you see Ockham's razor as a regularization mechanism, because you optimize to fit the data while keeping the parameters small, then there are explicit connections. For example, ridge regression follows from assuming a Gaussian prior on the parameters, and Lasso regression follows from a Laplace prior
Interested in provable guarantees and fundamental limitations of XAI? Join us at the "Theory of Explainable AI" workshop Dec 2 in Copenhagen! @ellis.eu @euripsconf.bsky.social
Speakers: @jessicahullman.bsky.social @doloresromerom.bsky.social @tpimentel.bsky.social
Call for Contributions: Oct 15
4th AI & Mathematics in NL workshop in Tilburg.
Many cool presentations: aimath.nl/index.php/20...
And great people:
Deadlines for PhD and Postdoc vacancies coming up: applications open until Monday June 2!
Now open: vacancies for a PhD or Postdoc position to develop "Mathematical Foundations for Explainable AI" with me.
This is a new research direction that I am very excited about, and which will really start to take off over the next few years.
Come join my group: www.timvanerven.nl#open-phd-and...
Jelle Goeman (Leiden University Medical Center) and I (University of Twente) have two PhD positions on e-values and multiple testing. The students will be co-supervised by both of us. A strong theoretical mathematical background is required.
utwentecareers.nl/en/vacancies...
Hooray, I received a Vici grant from the Dutch science foundation!
Heads up for current PhD students in learning theory: I will have two postdoc positions available in Amsterdam on "learning theory for interpretable/explainable AI" in the coming years.
www.uva.nl/shared-conte...
Reposting to see if we can get some input on what the community is eager to see in the seminar:
Great talk by Jeremias Sulam in the interpretable AI seminar today, connecting feature and concept interpretability to hypothesis testing via E-values!
Recording available on our YouTube channel: youtu.be/cx7wTtRdhnA
Check out the seminar website for upcoming talks: tverven.github.io/tiai-seminar/
Oh, and the timing of the Q2B conference being this week probably also factors in, so they can hype it a bit more there
My guess would be because the Nature version of the article was just published?
📢Theory of Interpretable AI Seminar📢
On Thursday, Depen Morwani from Harvard will present work on how **margin maximization** can explain observed phenomena in mechanistic interpretability!
⏲️Thursday Dec 5th, 4pm CET / 10am EST
🌐https://tverven.github.io/tiai-seminar/
Aren't these dual numbers? I think Julia has some autodiff packages based on this idea
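A minimal sketch of the dual-number idea behind forward-mode autodiff (the same principle as Julia packages like ForwardDiff.jl), here in Python for illustration:

```python
class Dual:
    """A dual number a + a'ε with ε^2 = 0; the ε-part carries the derivative."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (a + a'ε)(b + b'ε) = ab + (a'b + ab')ε.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x + 1ε; the ε-coefficient of the result is f'(x)."""
    return f(Dual(x, 1.0)).deriv

# f(x) = 3x^2 + 2x, so f'(2) = 6*2 + 2 = 14.
print(derivative(lambda x: 3 * x * x + 2 * x, 2.0))  # → 14.0
```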