Am I missing something here, or did they train a model to spout gibberish after a specific rare token and then consider it noteworthy when it works? www.anthropic.com/research/sma...
Posts by Pete Werner
Arguably RL has learnt something more general, i.e. what to do when encountering the plus operator, which can be applied or extrapolated to instances outside its training data.
Not familiar with the source that sparked this, but take the context of SFT vs RL trying to learn the plus operator. SFT can conceivably rote-learn every a + b = c, while RL could learn: if a and b are numeric, put the sum after the = symbol.
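The SFT-vs-RL contrast above can be pictured with a toy sketch (entirely hypothetical code, just illustrating the intuition, not how either method actually works):

```python
# "SFT" as rote memorization: a lookup table of every a + b = c seen in training.
sft_memory = {(a, b): a + b for a in range(10) for b in range(10)}

def sft_answer(a, b):
    # Returns None outside the training data: no extrapolation.
    return sft_memory.get((a, b))

# "RL" as having learnt the general rule: if a and b are numeric, emit their sum.
def rl_answer(a, b):
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b
    return None

print(sft_answer(3, 4))      # 7 — a pair seen in training
print(sft_answer(123, 456))  # None — outside the table
print(rl_answer(123, 456))   # 579 — the rule extrapolates
```

The point of the sketch: the table only ever covers its training pairs, while the rule applies to any numeric inputs.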
The impressive thing about Gen AI is how often it actually works
Is it not about being able to validate a candidate response independent of any initial training data?
I will be visiting Atlassian in a few weeks for a panel discussion on Reinforcement Learning; come along if you’re in Sydney www.aicamp.ai/event/eventd...
Another banger from Eugene Yan
Feel like I don’t hear AGI as much as I did 3-6 months ago. I guess the checks have cleared.
No, I hope you talk to someone if you think it might help, and are feeling better soon either way
It actually sounds like you may be depressed
Fleshing out a proposal with ChatGPT: 5 minutes
Validating the details: 4 hours
I block a lot of words, like prominent names etc. It’s just not a conversation I can meaningfully contribute to or engage with
I don’t mind Gemini but they never listen to their customers.
If you feel old, ChatGPT just told me “you’re among the ancient ones of the web.”
Dang that looks good
Meme guy dropping truths
Tom Clancy
Startup idea: Secure MCP. It’s just mcp but the logo is a padlock.
Nice!
Hot take: Apple is second only to NVIDIA when it comes to AI. They have been doing it a long time, with their own hardware and, importantly, mature and robust software on top of it. #wwdc
I aspire to the level of brazenness that whoever makes the marketing charts for NVIDIA has attained
Remember in 2016, when people were going to hail a self-driving Uber instead of owning a car and driving themselves
RIP Civit
An ablation study is not mathematical rigor. It’s an empirical experiment.
It’s gotta happen imo. Book to chapters, chapters to paragraphs, paragraphs to sentences, sentences to words, words to letters. Low frequency to high frequency.
If you can’t think of any good use cases for LLMs maybe you’re just boring and uncreative
If you are in Sydney this April 30, I will be giving a talk at AI Camp on scaling up AI services: how we built and scaled the core AI services that drove our product to over 10 million users. Be sure to come along if it sounds of interest. www.aicamp.ai/event/eventd...
Open source is fine, but it’s not possible to compete against someone like Google, who provides production services at a loss. Unless you have funding and can do the same — which still puts control in the hands of the few who can run at a loss for extended periods of time.