@ordinarythings.bsky.social has a better understanding of the social impacts of AI than many of the people in the industry, and is doing a great job clearly explaining these issues in an entertaining way. This is the kind of public outreach the world needs more of.
Posts by Dylan Cope
"People want more friends, sure. But if your solution to that is to build a product that makes it easier and more pleasurable to talk to no one, then fuck you. You are a misery merchant no better than a drug dealer"
youtu.be/NuIMZBseAOM?...
The way cars race up to the zebra crossing in SF is wild. I see the painted line a couple metres back from the crossing, but that seems to be a mere suggestion.
I'm blushing
Mmh I don't know if I would say they're Sagans of our time. I think it's people like Vsauce, Hank Green, 3blue1brown, Physicsgirl, smartereveryday, Simone Giertz, Veritasium, MinutePhysics, etc.
I think some people are annoyed and the baby bird response is a form of condescension. I don't like it.
I think it's good to be considerate and express gratitude if a reviewer has put in time. But you also have to make actual arguments.
I never knew a photo of someone holding a hedgehog could feel so inspirational. This looks like it should be on a political poster or something!
When people are interested in learning about how to train agents to communicate (emergent communication), I always recommend this paper as a first read: dl.acm.org/doi/10.5555/...
Attached meme summarises the main pitfall to be wary of!
Chiming into the conversation on peer-review. I think this is a good point that we need to take seriously. Science denialism has gotten a huge boost recently and many grifters benefit from well-meaning debates that they can twist into anti-intellectual narratives.
We introduce the effective horizon, a property of MDPs that controls how difficult RL is. Our analysis is motivated by Greedy Over Random Policy (GORP), a simple Monte Carlo planning algorithm (left) that exhaustively explores action sequences of length k and then uses m random rollouts to evaluate each leaf node. The effective horizon combines both k and m into a single measure. We prove sample complexity bounds based on the effective horizon that correlate closely with the real performance of PPO, a deep RL algorithm, on our BRIDGE dataset of 155 deterministic MDPs (right).
Kind of a broken record here but proceedings.neurips.cc/paper_files/...
is totally fascinating in that it postulates two underlying, measurable structures that you can use to assess if RL will be easy or hard in an environment
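For anyone who wants to poke at the idea, here is a rough sketch of GORP based only on the caption's description: enumerate every action sequence of length k, score each leaf with m uniformly random rollouts, take the first action of the best sequence, repeat. The `step(state, action) -> (next_state, reward)` interface, the function name `gorp`, and the toy environment are my own assumptions for illustration, and the paper's exact version may differ in details.

```python
import itertools
import random


def gorp(step, num_actions, start_state, horizon, k, m, seed=0):
    """Sketch of Greedy Over Random Policy for a deterministic, finite-horizon MDP.

    `step(state, action) -> (next_state, reward)` is a hypothetical interface.
    Returns the chosen action sequence and the return it achieved.
    """
    rng = random.Random(seed)
    state, plan, total = start_state, [], 0.0
    for t in range(horizon):
        depth = min(k, horizon - t)          # do not look past the horizon
        remaining = horizon - t - depth      # steps left for random rollouts
        best_value, best_action = float("-inf"), None
        # Exhaustively explore every action sequence of length `depth`.
        for seq in itertools.product(range(num_actions), repeat=depth):
            leaf, prefix_return = state, 0.0
            for a in seq:
                leaf, r = step(leaf, a)
                prefix_return += r
            # Evaluate the leaf with m uniformly random rollouts.
            rollout_sum = 0.0
            for _ in range(m):
                s = leaf
                for _ in range(remaining):
                    s, r = step(s, rng.randrange(num_actions))
                    rollout_sum += r
            value = prefix_return + rollout_sum / m
            if value > best_value:
                best_value, best_action = value, seq[0]
        # Act greedily with respect to the estimated values, then replan.
        state, r = step(state, best_action)
        plan.append(best_action)
        total += r
    return plan, total


def toy_step(state, action):
    """Toy deterministic MDP (an assumption): action 1 pays 1, action 0 pays 0."""
    return state + action, float(action)
```

Calling `gorp(toy_step, num_actions=2, start_state=0, horizon=4, k=4, m=1)` searches the whole tree, so no random rollouts are needed. Smaller k leans more on the m rollouts, which is the trade-off the effective horizon is meant to capture.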
I think the LLMs would generally write jax that isn't compatible with jit - lots of non-concrete shape issues. But if you know a couple patterns for doing branchless conditionals in SIMD settings it's not too hard to fix.
Or you could try aggressively prompting the LLMs
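To make the "branchless conditionals" point concrete, here is a minimal sketch of two patterns I have in mind: replacing a Python `if` on a traced value with `jnp.where`, and replacing a dynamic-length slice with a mask so shapes stay static under `jit`. The function names and the bouncing-particle setup are made up for illustration.

```python
import jax
import jax.numpy as jnp


@jax.jit
def bounce_step(pos, vel, limit):
    """Advance pos by vel, reversing velocity at +/- limit, with no Python `if`."""
    new_pos = pos + vel
    out_of_bounds = jnp.abs(new_pos) > limit
    # jnp.where computes both options and selects element-wise, so the traced
    # boolean never has to become a concrete Python bool.
    next_vel = jnp.where(out_of_bounds, -vel, vel)
    next_pos = jnp.where(out_of_bounds, pos, new_pos)
    return next_pos, next_vel


@jax.jit
def first_k_sum(x, k):
    """Sum the first k entries of x without the dynamic-shape slice x[:k]."""
    # x.shape[0] is static under jit; only k is traced, so a mask works
    # where slicing by k would raise a non-concrete shape error.
    mask = jnp.arange(x.shape[0]) < k
    return jnp.sum(jnp.where(mask, x, 0.0))
```

The cost is that both sides of every conditional are always computed, which is usually fine for cheap environment logic.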
For my domains it is night and day! Easily 10x speed-ups. I've been using JAX for the last 8 months, and I was using RLlib before which was very slow for my purposes.
Writing custom environments in JAX can be a bit of a pain though.
Currently I'm using:
- Custom gymnax env
- PureJAXRL
- PPO
- GRU RNNs
- wandb
- praying that my choice of hyperparameters is fine
My hopeful interpretation is that tweet is getting less engagement because we're all over here now, and not looking at Twitter!
But it also wouldn't remotely surprise me if Musk is suppressing mentions of bluesky over there.
I really hope it lasts! Feels very refreshing to see so many interesting things on the feed.
Post a non-religious photo you think of as holy.
Put differently - LLM pre-training is imitation learning, and so maybe they will imitate our ability to adapt OOD?
Imo the problem is that IL is notoriously bad OOD. Not yet convinced "just scale" fixes the fundamental issue of biased demo data/compounding errors.
Managed to stump it with a drop that relies on correcting your balance with the wall. Wasn't too hard for me to get it but the agents don't get it!
Could you add me! :)
One of my first posts on twitter was "fuck twitter". I'd just like to reiterate that sentiment today, as I join bluesky
Please get the others from Novara on too
It works better than Twitter!