"AI doomer" media gets clicks and listens, but there is no logical reason to believe that today's frontier LLMs will "wake up" and become the terminator. The major problem is that “AGI” is an overloaded term. We should retire it: syntheticminds.substack.com/p/retiring-a...
I was featured on the Academic Minute, which airs on over 70 radio stations across the US and Canada. In the segment, I introduce continual learning for a non-AI audience: academicminute.org/2025/02/chri...
#ai #academicminute #continuallearning
To be clear, I don’t believe we should halt AI progress. Higher education must adapt. But I worry that most universities, already overwhelmed by ongoing crises, lack the agility and foresight to make the tough decisions needed to survive.
As a professor working at the frontiers of AI, I’ve grown increasingly concerned about the cataclysmic impact AI could have on college enrollments in the coming decades—on top of the decline already underway for other reasons.
https://buff.ly/4hDInSq
#HigherEducation #AI #EnrollmentCrisis
Given that roughly half of the academic AI papers published in our top-tier conferences are produced by Chinese universities, this would catastrophically impair AI research in the USA if researchers cannot download code or weights developed by Chinese institutions.
A proposed AI bill would, based on my read (and ChatGPT's), make it illegal in the USA to download AI code or weights created by Chinese companies, universities, etc. This is catastrophically shortsighted.
https://buff.ly/4hpjTfC
Bill: https://buff.ly/40Vyd9J
The only barrier is having access to the right kind of chips, and DeepSeek figured out how to use the chips they have more effectively. The lessons from DeepSeek's FP8 work will enable AI folks worldwide to get more out of NVIDIA's newer chips.
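For readers unfamiliar with FP8 training, the core trick is per-tensor scaling: map each tensor's range onto FP8's narrow representable range, do the cheap low-precision matmul, then rescale. Below is a minimal NumPy sketch that only simulates E4M3 rounding to show the idea; it is an illustration of the general technique, not DeepSeek's actual training code.

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite magnitude in FP8 E4M3

def simulate_fp8_e4m3(x: np.ndarray) -> np.ndarray:
    """Round values to an E4M3-like grid: ~3 mantissa bits, saturating.

    A software simulation only; real FP8 matmuls run on tensor cores.
    """
    mant, exp = np.frexp(x)            # x = mant * 2**exp, |mant| in [0.5, 1)
    mant_q = np.round(mant * 16) / 16  # keep ~3 mantissa bits per binade
    return np.clip(np.ldexp(mant_q, exp), -E4M3_MAX, E4M3_MAX)

def fp8_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Per-tensor scaled FP8 matmul: quantize inputs, accumulate in FP32."""
    sa = np.abs(a).max() / E4M3_MAX    # per-tensor scale factors
    sb = np.abs(b).max() / E4M3_MAX
    aq = simulate_fp8_e4m3(a / sa)
    bq = simulate_fp8_e4m3(b / sb)
    return (aq @ bq) * (sa * sb)       # rescale the higher-precision sum

rng = np.random.default_rng(0)
a, b = rng.normal(size=(64, 64)), rng.normal(size=(64, 64))
err = np.abs(fp8_matmul(a, b) - a @ b).mean()
print(f"mean abs error vs FP32: {err:.4f}")
```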
A huge percentage of the PhD students trained in AI in the USA are Chinese; nationwide, only about 30% of our AI PhD students are domestic. We aren't getting enough domestic applicants for PhDs in the USA. Why people think China wouldn't have AI expertise confuses me.
I can see why Microsoft stock would be impacted by this news due to their OpenAI investment, but I really don't get the others. DeepSeek used FP8 on NVIDIA's chips to get a big boost in training, among other things, but I think this fear is overblown.
There are too many unknowns to justify using a fixed compute-based threshold. Policymakers should focus on regulating specific high-risk AI applications, similar to how the FDA regulates AI software as a medical device.
Lastly, many trying to scale LLMs beyond systems like GPT-4 have hit diminishing returns, shifting their focus to test-time compute. This involves using more compute to "think" about responses during inference rather than in model training, and the regulation does not address this trend at all.
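To make the test-time-compute point concrete, here is a minimal sketch of one common strategy, best-of-n sampling with a verifier score. `generate` and `score` are hypothetical stand-ins for an LLM sampler and a reward model, not any specific product's API; the point is that answer quality scales with inference compute while training compute is untouched.

```python
import random
from typing import Callable

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 16) -> str:
    """Sample n candidate answers and keep the one the scorer likes best.

    More samples = more inference compute = better answers, with no
    change to training compute -- which a training-FLOPs threshold
    cannot see.
    """
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))

# Toy demo: the "sampler" guesses digits; the "verifier" prefers 7.
def toy_generate(prompt: str) -> str:
    return str(random.randint(0, 9))

def toy_score(prompt: str, answer: str) -> float:
    return -abs(int(answer) - 7)

print(best_of_n("guess my number", toy_generate, toy_score, n=32))  # almost always "7"
```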
It is unlikely that AI progress will remain tied to inefficient transformer-based models trained on massive datasets.
Second, the 10^26 operations threshold appears to be based on what may be required to train future large language models using today’s methods. However, future advances in algorithms and architectures could significantly reduce the computational demands for training such models.
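For a sense of scale, here is a rough back-of-envelope using the common ~6·N·D approximation for dense-transformer training FLOPs (N parameters, D tokens). The numbers below are illustrative, not figures from the rule; the takeaway is that algorithmic advances that shrink N or D move a model under any fixed threshold.

```python
def train_flops(n_params: float, n_tokens: float) -> float:
    """Common rule of thumb: training compute ~= 6 * params * tokens."""
    return 6 * n_params * n_tokens

# Illustrative only: a 1-trillion-parameter model trained on 15T tokens.
flops = train_flops(1e12, 15e12)
print(f"{flops:.1e} FLOPs")  # 9.0e+25 -- just under the 1e26 threshold
```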
The current regulation seems misguided for several reasons. First, it assumes that scaling models automatically leads to something dangerous. This is a flawed assumption, as simply increasing model size and compute does not necessarily result in harmful capabilities.
The newly proposed US export control rule on the amount of compute used to train AI systems is open for comment. I don't think it makes much sense. It puts export controls on AI models trained with more than 10^26 "operations." Here is the link: www.federalregister.gov/documents/20...
#ai #regulation
I'm looking forward to visiting RIT on Friday. I'll be giving a talk on my lab's recent work on large-scale deep learning systems for continual learning and on using continual learning to overcome linguistic forgetting in multi-modal LLMs.
www.rit.edu/events/cogni...
INSIGHT can produce accurate segmentations using only slide-level labels. The two images on the left show the input image and the ground truth segmentations (not used for training). The right images show the pixel-wise predictions produced by INSIGHT.
We’re excited to share INSIGHT, which integrates interpretability directly into its architecture, enabling classification and weakly supervised segmentation without pixel-level annotation.
Web: zhangdylan83.github.io/ewsmia/
arXiv: arxiv.org/abs/2412.02012
#AI #medicalAI #radiology #pathology
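For readers new to weak supervision, the general recipe is to train a classifier on image-level (here, slide-level) labels and read pixel-wise evidence out of its spatial activations. The sketch below is a generic CAM-style illustration of that recipe, not the INSIGHT architecture; see the paper (arXiv:2412.02012) for the actual method.

```python
import torch
import torch.nn as nn

class WeaklySupervisedSegmenter(nn.Module):
    """Train with image-level labels only; get pixel maps for free.

    Generic CAM-style sketch, NOT the INSIGHT architecture.
    """
    def __init__(self, n_classes: int, feat_dim: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(  # tiny stand-in encoder
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Conv2d(feat_dim, n_classes, 1)  # 1x1 conv head

    def forward(self, x):
        logits_map = self.classifier(self.backbone(x))  # (B, C, H, W)
        image_logits = logits_map.mean(dim=(2, 3))      # pool to image level
        # Train a classification loss on image_logits; at test time,
        # logits_map serves as the pixel-wise segmentation prediction.
        return image_logits, logits_map

model = WeaklySupervisedSegmenter(n_classes=2)
img_logits, seg_map = model(torch.randn(1, 3, 128, 128))
print(img_logits.shape, seg_map.shape)  # (1, 2) and (1, 2, 128, 128)
```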
I agree, but I don't think it was because I was a student. I think it is because of how enormous the conferences have become. I had a ton of fun at CoLLAs-2024, but it was single-track with only a few hundred attendees, versus the 10-20k at the big conferences.
The Call for Papers for CoLLAs 2025, the premier venue for continual and lifelong learning research in AI, is out: lifelong-ml.cc
The Abstract Deadline is Feb 21, 2025. It will be held in Philadelphia in August.
#continuallearning #deeplearning #lifelonglearning #ai #collas2025