
Posts by Dylan Cope

@ordinarythings.bsky.social has a better understanding of the social impacts of AI than many of the people in the industry, and is doing a great job clearly explaining these issues in an entertaining way. This is the kind of public outreach the world needs more of.

9 months ago 0 0 1 0
Will AI Slop Kill the Internet? | SlopWorld

"People want more friends, sure. But if your solution to that is to build a product that makes it easier and more pleasurable to talk to no one, then fuck you. You are a misery merchant no better than a drug dealer"

youtu.be/NuIMZBseAOM?...

9 months ago 0 0 1 0

The way cars race up to the zebra crossing in SF is wild. I see the painted line a couple metres back from the crossing, but that seems to be a mere suggestion.

10 months ago 0 0 0 0
Post image

I'm blushing

1 year ago 5 0 0 0

Mmh, I don't know if I would say they're the Sagans of our time. I think it's people like Vsauce, Hank Green, 3blue1brown, Physicsgirl, smartereveryday, Simone Giertz, Veritasium, MinutePhysics, etc.

1 year ago 2 0 1 0

I think some people are annoyed and the baby bird response is a form of condescension. I don't like it.

I think it's good to be considerate and express gratitude if a reviewer has put in time. But you also have to make actual arguments.

1 year ago 2 0 1 0

I never knew a photo of someone holding a hedgehog could feel so inspirational. This looks like it should be on a political poster or something!

1 year ago 2 0 1 0
Post image

When people are interested in learning about how to train agents to communicate (emergent communication), I always recommend this paper as a first read: dl.acm.org/doi/10.5555/...

The attached meme summarises the main pitfall to be wary of!

1 year ago 2 0 0 0

Chiming into the conversation on peer-review. I think this is a good point that we need to take seriously. Science denialism has gotten a huge boost recently and many grifters benefit from well-meaning debates that they can twist into anti-intellectual narratives.

1 year ago 2 0 0 0

👋🏻

1 year ago 1 0 0 0
We introduce the effective horizon, a property of MDPs that controls how difficult RL is. Our analysis is motivated by Greedy Over Random Policy (GORP), a simple Monte Carlo planning algorithm (left) that exhaustively explores action sequences of length k and then uses m random rollouts to evaluate each leaf node. The effective horizon combines both k and m into a single measure. We prove sample complexity bounds based on the effective horizon that correlate closely with the real performance of PPO, a deep RL algorithm, on our BRIDGE dataset of 155 deterministic MDPs (right).


Kind of a broken record here, but proceedings.neurips.cc/paper_files/... is totally fascinating in that it postulates two underlying, measurable structures that you can use to assess if RL will be easy or hard in an environment.

1 year ago 150 29 8 2
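The GORP procedure the abstract describes can be sketched in a few lines of plain Python. This is only my toy reading of that description; the function names and the `step(state, action) -> (next_state, reward)` interface are my own assumptions, not code from the paper:

```python
import itertools
import random

def gorp_action(step, state, actions, k, m, horizon, rng):
    """Toy Greedy Over Random Policy (GORP) for a deterministic MDP.

    Exhaustively scores every length-k action sequence from `state`,
    estimates each leaf's value with m random rollouts out to the
    horizon, and returns the first action of the best-scoring sequence.
    """
    best_val, best_first = float("-inf"), None
    for seq in itertools.product(actions, repeat=k):
        s, ret = state, 0.0
        for a in seq:  # deterministic k-step lookahead
            s, r = step(s, a)
            ret += r
        est = 0.0  # Monte Carlo estimate of the leaf node's value
        for _ in range(m):
            s2, tail = s, 0.0
            for _ in range(horizon - k):  # random rollout to the horizon
                s2, r = step(s2, rng.choice(actions))
                tail += r
            est += tail / m
        if ret + est > best_val:
            best_val, best_first = ret + est, seq[0]
    return best_first
```

The point of the paper, as I read it, is that the smallest k and m for which this brute-force search finds good actions is exactly what makes an environment easy or hard for RL.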

I think the LLMs would generally write JAX that isn't compatible with jit - lots of non-concrete shape issues. But if you know a couple of patterns for doing branchless conditionals in SIMD settings, it's not too hard to fix.

Or you could try aggressively prompting the LLMs 😂

1 year ago 0 0 0 0
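For anyone curious, one such pattern is the elementwise select: a minimal sketch of swapping a data-dependent Python `if` (which breaks under `jit` tracing) for `jnp.where`. The function and values here are illustrative, not from any real codebase:

```python
import jax
import jax.numpy as jnp

# A data-dependent Python `if` fails under jit because the traced boolean
# has no concrete value at trace time. The branchless version computes
# both outcomes and selects elementwise, so shapes stay static.

@jax.jit
def clip_reward(r, limit):
    # jnp.where(cond, x, y): elementwise select, jit-friendly.
    return jnp.where(jnp.abs(r) > limit, jnp.sign(r) * limit, r)
```

`jax.lax.select` and `jax.lax.cond` are the other usual tools here; `cond` is the one to reach for when evaluating both branches is expensive.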

For my domains it is night and day! Easily 10x speed-ups. I've been using JAX for the last 8 months, and I was using RLlib before which was very slow for my purposes.

Writing custom environments in JAX can be a bit of a pain though.

1 year ago 2 0 1 0

Currently I'm using:

- Custom gymnax env
- PureJAXRL
- PPO
- GRU RNNs
- wandb
- praying that my choice of hyperparameters is fine

1 year ago 2 0 1 0

My hopeful interpretation is that the tweet is getting less engagement because we're all over here now, and not looking at Twitter!

But it also wouldn't remotely surprise me if Musk is suppressing mentions of Bluesky over there.

1 year ago 2 0 0 0

I really hope it lasts! Feels very refreshing to see so many interesting things on the feed.

1 year ago 2 0 0 0

Post a non-religious photo you think of as holy.

1 year ago 0 0 0 0

Put differently - LLM pre-training is imitation learning, and so maybe they will imitate our ability to adapt OOD?

Imo the problem is that IL is notoriously bad OOD. Not yet convinced "just scale" fixes the fundamental issue of biased demo data/compounding errors.

1 year ago 1 0 0 0
Video

Managed to stump it with a drop that relies on correcting your balance with the wall. Wasn't too hard for me, but the agents don't get it!

1 year ago 0 0 0 0

Could you add me! :)

1 year ago 0 0 0 0

One of my first posts on Twitter was "fuck twitter". I'd just like to reiterate that sentiment today, as I join Bluesky.

1 year ago 58 1 2 1

Please get the others from Novara on too 😅

1 year ago 1 0 0 0

It works better than Twitter!

1 year ago 0 0 0 0