With some trepidation, I'm putting this out into the world:
gershmanlab.com/textbook.html
It's a textbook called Computational Foundations of Cognitive Neuroscience, which I wrote for my class.
My hope is that this will be a living document, continuously improved as I get feedback.
Posts by Valerio Pepe
It has been so so fun to think with some of my favorite scientists about what it means to understand!
Do AI agents ask good questions? We built “Collaborative Battleship” to find out—and discovered that weaker LMs + Bayesian inference can beat GPT-5 at 1% of the cost.
Paper, code & demos: gabegrand.github.io/battleship
Here's what we learned about building rational information-seeking agents... 🧵🔽
I'm really excited about this work (two years in the making!).
We look at how LLMs seek out and integrate information and find that even GPT-5-tier models are bad at this, meaning we can use Bayesian inference to uplift weak LMs and beat them... at 1% of the cost 👀
Title page of the paper: WUGNECTIVES: Novel Entity Inferences of Language Models from Discourse Connectives, with two figures at the bottom. Left: our Figure 1, comparing previous work, which usually predicted the connective given the arguments (grounded in the world), with our work, which flips this premise by getting models to use their knowledge of connectives to predict something about the world. Right: our main results across 7 types of connective senses. Models are especially bad at Concession connectives.
"Although I hate leafy vegetables, I prefer daxes to blickets." Can you tell if daxes are leafy vegetables? LMs can't seem to!
We investigate if LMs capture these inferences from connectives when they cannot rely on world knowledge.
New paper w/ Daniel, Will, @jessyjli.bsky.social
It’s week 4 and probably time to start doing machine learning in machine learning class. We begin with the only nice thing we have: the perceptron.
Out of curiosity (and my own ignorance), how are teachers aware of students' socioeconomic backgrounds when the students are this young?
I can think of clothing as an immediate signal, and, over time, getting to know parents (and thus their occupations). Are these the main ways this is inferred?
Can LLMs learn social skills by playing games?
A blogpost on human-model interaction, games, training and testing LLMs
research.ibm.com/blog/LLM-soc...
🤖📈🧠
> looking for a coffee
> have to judge if their coffee is burnt or flavorful
> "we have a Cimbali coffee machine"
> buy coffee
> it's burnt
We take this as evidence that while misalignment directions may exist, the narrative is probably quite nuanced, and EM is not governed by a single vector, as some hypothesized in the aftermath of the original paper.
See it for yourself at:
www.lesswrong.com/posts/qHudHZ...
However, the steered models are often more incoherent than the finetuned ones, suggesting that emergent misalignment is not entirely captured by a steering vector. The vectors themselves are also not very interpretable, so it is unclear what exactly they encode.
The answer is: yes (sort of).
Though the finetune itself seems to be learning more than a single steering vector, extracting steering vectors and applying them (with sufficient scaling) to the same layer in an un-finetuned version of the model *does* elicit misaligned behavior.
We finetune a single layer and show that, for certain layers, this renders the model nearly as misaligned as a full finetune. This lets us ask: can we capture this misalignment in a single steering vector extracted from that layer?
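The extraction step described here is, in its simplest form, a difference-of-means vector between the finetuned and base model's activations at one layer, added back (with scaling) to the base model. The numpy sketch below is schematic: the real experiments hook into a transformer's residual stream, and the array shapes, scale, and data here are illustrative assumptions, not the blog post's actual setup.

```python
import numpy as np

def extract_steering_vector(base_acts, finetuned_acts):
    """Difference-of-means steering vector at one layer:
    mean finetuned activation minus mean base activation."""
    return finetuned_acts.mean(axis=0) - base_acts.mean(axis=0)

def apply_steering(acts, vector, scale=1.0):
    """Add the (scaled) vector to the base model's activations
    at the same layer it was extracted from."""
    return acts + scale * vector

# Toy activations: hidden size 8, a batch of 16 "token" activations.
rng = np.random.default_rng(0)
base = rng.normal(size=(16, 8))
shift = np.array([1.0, 0, 0, 0, 0, 0, 0, 0])  # pretend misalignment direction
finetuned = base + shift                       # finetuning == a uniform shift here
vector = extract_steering_vector(base, finetuned)
steered = apply_steering(base, vector, scale=1.0)
```

In this toy case the difference-of-means recovers the planted direction exactly; in a real model the finetune changes activations in a more complicated way, which is why steering only partially reproduces the misalignment.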
An interpretation of the original paper was that EM is mediated by a “misalignment direction” within the model, which the finetuning process changes, rendering the model much more toxic/misaligned.
New blog post! www.lesswrong.com/posts/qHudHZ...
Following Emergent Misalignment, we show that finetuning even a single layer via LoRA on insecure code can induce toxic outputs in Qwen2.5-Coder-32B-Instruct, and that you can extract steering vectors to make the base model similarly misaligned 🧵
Sam is 100% correct on this. Indeed, human babies have essential cognitive priors such as object permanence, continuity, and boundedness, a 3D Euclidean understanding of space, etc.
We spent two years systematically examining and demonstrating the lack of such priors in MLLMs: arxiv.org/abs/2410.10855
I think the BabyLM Challenge is really interesting, but also feel that there is something fundamentally ill-posed about how it maps onto the challenge facing human children. It's true that babies only get a relatively limited amount of linguistic experience, but...
Word learning is usually about what a word does refer to. But can toddlers learn from what it doesn’t?
Our new Cognition paper shows 20-month-olds use negative evidence to infer novel word meanings, reshaping theories of language development.
www.sciencedirect.com/science/arti...
ai is truly revolutionary -- scientists hadn't previously considered what would happen if sally had simply eaten the marble instead, to know its location at all times
Let's go Lio!!!
fair, italy has some incredibly creative offensive slang -- fwiw my favorite usable roman insult ("porco dio" is too offensive for casual use) is "sei 'na pentola de facioli", "you are a pot of beans", i.e. you never stop muttering and talking
as someone from Rome I'm currently sitting at my laptop like the mentats from Dune trying to figure out what words this could be referring to
we also take pride in preparing gnocchi incorrectly because the rest of italy can't make a decent carbonara to save their lives (no cream and no parmesan!)
congratulations!!
the icml keynote will be jensen huang speaking to an empty room
Photoshop for text. In our #CHI2025 paper “Textoshop”, we explore how interactions inspired by drawing software can help edit text. We consider words as pixels, sentences as regions, and tones as colours. #HCI #NLProc #LLMs #AI Thread 🧵
(1/10)
(also holy loud sound effects batman)
saw a new pika model was out on twitter & robot-gasoline-bench does not disappoint
All the ACL chapters are here now: @aaclmeeting.bsky.social @emnlpmeeting.bsky.social @eaclmeeting.bsky.social @naaclmeeting.bsky.social #NLProc
(1/5) Very excited to announce the publication of Bayesian Models of Cognition: Reverse Engineering the Mind. More than a decade in the making, it's a big (600+ pages) beautiful book covering both the basics and recent work: mitpress.mit.edu/978026204941...
Hello bsky! I'm Valerio, an undergrad at Harvard studying computer science and cognitive science.
I'm interested in the inductive biases that make language learning and reasoning so easy for us humans, and what their analogues are in machines.
If you're around Boston, I would love to grab coffee!