New report is out with the latest open model adoption data we have gathered for Interconnects & The ATOM Project. At the surface level, we can see Chinese models continuing to accelerate in adoption. The report details much more.
atomproject.ai/report
Posts by Mike Trizna
Good news for anyone working on their proposals for the next Fantastic Futures conference - the deadline is extended to April 16! ai4lam.org/submission-i...
#FF2026 is in the US, but will be very hybrid so you don't need to travel there to present or attend many sessions #AI4LAM #MuseTech
So cool! Did you see this recent post from @lelandmcinnes.bsky.social?: bsky.app/profile/lela.... I don't know about GPU acceleration yet, but EVoC might be worth testing out as a replacement for UMAP. I'm going to give it a shot on a separate project later today.
EVoC is a library designed specifically for fast clustering of high dimensional embedding vectors. It can produce high quality clusters extremely efficiently, and requires little to no hyperparameter tuning.
Better clustering than UMAP + HDBSCAN; faster clustering than KMeans.
You never know what data will be used for!
I uploaded a @britishlibrary.bsky.social dataset to Hugging Face in 2022. IIRC one of my first PRs to a HF repo!
4 years later, someone trains a Victorian chatbot on it
More libraries should be sharing their public domain collections for AI to build on!
Screenshot of Embedding Atlas showing the embedding view on the left, a table at the bottom, and charts on the right.
🚀 We've just open-sourced Embedding Atlas – a tool for exploring large embedding spaces through rich, interactive visualizations 📊.
This is uncanny! Firing up huggingface.co/spaces/MikeT... to see if these actually came from BHL, or if ChatGPT learned the style (or both??)
The best path forward in AI requires technologists to be reflective/self-critical about how their work impacts society. Transparency helps this. Appreciate Bsky for flagging AI ethics & my colleague’s response. Let’s make informed consent a real thing. More later; Recommend: bsky.app/profile/cfie...
Super excited to announce our best open-source language models yet. OLMo 2.
These instruct models are hot off the press -- finished training with our new RL method this morning and vibes are very good.
I like this new analogy for working with LLMs by @emollick.bsky.social
"treat AI like an infinitely patient new coworker who forgets everything you tell them each new conversation, one that comes highly recommended but whose actual abilities are not that clear"
www.oneusefulthing.org/p/getting-st...
Big milestone for Project Jupyter 🚨:
Jupyter has finalized the creation of the Jupyter Foundation, hosted by @linuxfoundation.org.
www.linuxfoundation.org/press/linux-...
Bluesky uses AI internally to assist in content moderation, which helps us triage posts and shield human moderators from harmful content. We also use AI in the Discover algorithmic feed to serve you posts that we think you’d like.
None of these are Gen AI systems trained on user content.