M. Lindberg may be a hallucination of T. Lindeberg, who is also Swedish.
Posts by Peyman M. Kiasari
Thank you!
That's a good question, and we'll clarify this in the camera-ready version.
In Table 1, in "Acc with 8 (greedy search)", we calculated the proportion of each kind of filter in each layer; we use those proportions in Table 2.
Our paper has been accepted to #NeurIPS2025 as poster 🥳
Looking forward to presenting our poster.
Of our six challenges, I expect future AI to solve 'shortest path' first, but GPT-5 still has a long way to go.
Is #GPT5 the dreamed AGI? Not even close!
It still can't solve our easiest task, on which humans score 100%: What is the shortest path between the two square nodes?
This is from our challenge, the Visual Graph Arena (vga.csail.mit.edu).
Every time I write a paper, I ask the best LLM I have to explain it to me in detail.
It has never gotten it right. It always makes up delusional stuff.
Happening right now at #ICML2025 poster session 4 west W-214.
Would be glad to see you and have a chat.
We’re presenting our work today at #ICML2025, Poster Session 4 West, W-214!
If you are interested in computer vision reasoning and multimodal LLMs come visit us!
I'm at #ICML see you there :)
Our paper "Visual Graph Arena: Evaluating AI's Visual Conceptualization" has been accepted at #ICML2025! 🎉🥳🎉
We introduce a dataset testing AI systems' ability to conceptualize graph representations.
Available at: vga.csail.mit.edu
More info + Camera ready version coming soon!
Thanks for your reply Paul. I appreciate the clarification.
I'll be looking forward to seeing more of your work in the future.
Ironically, my post engagements are higher on 🦋 than on Twitter.
Camera-ready version is out! (arxiv.org/abs/2412.16751)
TL;DR: Deep CNN filters may be generalized, not specialized as previously believed.
Major update (Fig. 4): We froze layers from end-to-start instead of start-to-end. The result ironically suggests early layers are specialized!
Hi Paul,
I finally managed to look at Eq4.
I believe it doesn't represent DS-CNNs. Each kernel is convolved with a separate feature map, and you can't factor them out. (I marked the part I don't think represents DS-CNNs in red.)
Overall, DS-CNNs are not LCing kernels. They are LCing feature maps.
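A minimal NumPy sketch of this point (shapes, names, and values are illustrative, not from the paper): the depthwise stage convolves each kernel with its own channel, and the pointwise stage linearly combines the resulting feature maps. Combining the kernels first and then convolving gives a different function, because each kernel sees a different input channel.

```python
import numpy as np

def conv2d_single(x, k):
    """Valid cross-correlation of one 2D feature map with one 2D kernel."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
C = 3                                  # number of channels (illustrative)
x = rng.standard_normal((C, 8, 8))     # input feature maps
dw = rng.standard_normal((C, 3, 3))    # one depthwise kernel per channel
pw = rng.standard_normal(C)            # one pointwise (1x1) weight vector

# Depthwise: each kernel is convolved with its own feature map.
maps = np.stack([conv2d_single(x[c], dw[c]) for c in range(C)])

# Pointwise: a linear combination of the resulting FEATURE MAPS...
out = np.tensordot(pw, maps, axes=1)

# ...which is NOT the same as convolving with a linear combination of the
# KERNELS, since each kernel acts on a different input channel.
mixed_kernel = np.tensordot(pw, dw, axes=1)   # sum_c pw[c] * dw[c]
not_the_same = sum(conv2d_single(x[c], mixed_kernel) for c in range(C))
print(np.allclose(out, not_the_same))         # almost surely False
```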
This is great work! And actually, we've cited you in (arxiv.org/abs/2401.14469).
Maybe we can cite it again in "Master key filters" as another piece of visual evidence 👌
Nice work, by the way.
Thanks, I'll read it as soon as I get the chance.
Hi Paul, Thank you for joining!
Actually, two people referenced your work : D
Please correct me if I'm wrong. Are you sure that pointwise layers are LCing the "filters"? I'm having difficulty seeing that.
If we name filters as K and features as F, how can this result in LC of Ks?
That's a good point. No, they are not. Actually, in our paper (DS-CNN models), each filter is convolved with a separate feature map and generates a distinct new one - the new feature maps are linearly combined, but not the filters.
IMHO, they are not actually using frozen filters. Let me explain myself:
A model learning (x,y,z) is mathematically equivalent to one using LC of "frozen" filters (1,0,0), (0,1,0), (0,0,1). They're doing the same optimization, just expressed differently. Same goes for LC of random filters.
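A tiny NumPy sketch of this equivalence (filter values are made up for illustration): convolution is linear in the kernel, so a directly learned filter (x, y, z) computes exactly the same output as learnable coefficients over the frozen one-hot basis filters.

```python
import numpy as np

def conv2d(x, k):
    """Valid cross-correlation of a 2D input with a 2D kernel."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))

# A "learned" 1x3 filter (x, y, z) -- values are arbitrary.
w = np.array([[0.5, -1.2, 0.3]])

# Frozen one-hot basis filters (1,0,0), (0,1,0), (0,0,1).
basis = [np.array([[1., 0., 0.]]),
         np.array([[0., 1., 0.]]),
         np.array([[0., 0., 1.]])]

# Learning w directly vs. learning coefficients over the frozen basis:
direct = conv2d(x, w)
combined = sum(c * conv2d(x, b) for c, b in zip(w.ravel(), basis))
print(np.allclose(direct, combined))  # True: same function, reparameterized
```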
This is absolutely a great suggestion, and we should run this experiment too! Our GPUs are currently busy with ICML work, and after that I'll definitely run this experiment before the AAAI final revision deadline. I'll update you on this. (But in case you're wondering about freezing only the pointwise layers, see Table 5.)
I found out that they create new filters through linear combinations of random filters, which isn't what we're doing. 🤔
And mathematically, an LC of 49 random filters should span the entire 7x7 space, so it's not surprising that it works.
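A quick NumPy check of that span claim (the seed and setup are illustrative): 49 i.i.d. Gaussian filters, flattened to vectors, are linearly independent with probability 1, so their linear combinations cover the full 49-dimensional space of 7x7 filters.

```python
import numpy as np

rng = np.random.default_rng(0)
# 49 random 7x7 filters, flattened into the rows of a 49x49 matrix.
filters = rng.standard_normal((49, 7, 7)).reshape(49, -1)

# Full rank => linear combinations span the whole 7x7 filter space.
print(np.linalg.matrix_rank(filters))  # 49
```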
Open to discussion if I'm misunderstanding something!
Fascinating that you mention this paper - our area chair noted this connection too! (Hat tip to the author @paulgavrikov.bsky.social)
After reading the paper, TBH, I couldn't see a deep connection. And I'm open to being wrong, since both you and the AC pointed this out. If I am wrong, please correct me.
I've explained those classical CNN parts in these two tweets:
bsky.app/profile/kias...
and
bsky.app/profile/kias...
Hi Neil, Thank you for showing interest in our work!
We have experimented with both DS-CNNs and classical CNNs (ResNets in our paper, and you are right that our main focus was DS-CNNs). In DS-CNNs we froze only the depthwise filters, but in classical CNNs all parameters are frozen, just as Yosinski did.
13/13 If you prefer video content, you can check out the video I made for AAAI:
youtube.com/watch?v=lzhzm1…
Thanks for reading! I wish you a great day.
#DeepLearning #ComputerVision #AAAI2025
12/13 Thank you for reading this far! Curious to hear your thoughts on this.
You can find our complete paper "The Master Key Filters Hypothesis" here: arxiv.org/pdf/2412.16751
11/13 We went even further: transferring FROZEN❄️ filters between two different architectures (Hornet → ConvNeXt) trained on unrelated datasets (Food → Pets) improved accuracy by 3.1%. We suggest these "master key" filters are truly architecture-independent!
10/13 To test this, we conducted cross-dataset FROZEN❄️ transfer experiments. If the hypothesis holds, filters trained on larger datasets should help models perform better on smaller, even unrelated datasets, as they'd have converged closer to these "master key" filters.
The results confirmed this.
9/13 We propose that there exist universal "master key" filter sets optimal for visual data processing. The depth-wise filters in DS-CNN naturally converge toward these general-purpose filters, regardless of the specific dataset, task, or architecture!