Posts by Siva Reddy

Our new paper in #PNAS (bit.ly/4fcWfma) presents a surprising finding—when words change meaning, older speakers rapidly adopt the new usage; inter-generational differences are often minor.

w/ Michelle Yang, @sivareddyg.bsky.social, @msonderegger.bsky.social and @dallascard.bsky.social 👇 (1/12)

8 months ago 33 17 3 2

Age doesn't matter when it comes to picking up new word usages. Pronunciation may sound odd across generations, but the semantics do not 👴👵👨👩

8 months ago 5 0 0 0

🗓️ Save the date! It's official: The VLMs4All Workshop at #CVPR2025 will be held on June 12th!

Get ready for a full day of speakers, posters, and a panel discussion on making VLMs more geo-diverse and culturally aware 🌐

Check out the schedule below!

10 months ago 4 3 0 1
Language Models Largely Exhibit Human-like Constituent Ordering Preferences
Though English sentences are typically inflexible vis-à-vis word order, constituents often show far more variability in ordering. One prominent theory presents the notion that constituent ordering is ...

The paper will be presented orally today at 4:30-4:45.

Read the paper here: arxiv.org/abs/2502.05670

11 months ago 1 0 0 0

Ada is an undergrad and will soon be looking for PhDs. Gaurav is a PhD student looking for intellectually stimulating internships/visiting positions. They did most of the work without much of my help. Highly recommend them. Please reach out to them if you have any positions.

11 months ago 6 2 1 0

Humans tend to move heavier constituents to the end of the sentence. While LLMs show similar behaviour, what's surprising is that pretrained models behave closer to humans than instruction-tuned models. And syllables, rather than tokens, are the better metric for heaviness.

11 months ago 1 0 1 0

Incredibly proud of my students @adadtur.bsky.social and Gaurav Kamath for winning a SAC award at #NAACL2025 for their work on assessing how LLMs model constituent shifts.

11 months ago 17 5 1 0

Great work from labmates on LLMs vs humans regarding linguistic preferences: You know when a sentence kind of feels off e.g. "I met at the park the man". So in what ways do LLMs follow these human intuitions?

11 months ago 7 3 0 0

List of #SafetyGuaranteedLLMs talks on Monday Apr 14 2025 PDT. Speakers @rogergrosse.bsky.social Boaz Barak, Ethan Perez, Georgios Piliouras

1 year ago 4 0 0 0

The most exciting event on LLM safety is happening this week at @simonsinstitute.bsky.social with many excellent speakers. Organized by @yoshuabengio.bsky.social et al. Join us in person or virtual. In collaboration with @ivado.bsky.social. More details here:

simons.berkeley.edu/workshops/sa...

1 year ago 7 2 0 1

Though in-person registration is now full, you can still register to view the private livestream for next week's workshop on Safety-Guaranteed LLMs, co-organized with @ivado.bsky.social. We'll be posting live here as well.

simons.berkeley.edu/workshops/sa...

1 year ago 4 2 0 0

Sorry to hear, but please don't boycott us. We are already having a tough time with the US :). I hate the new system too. Earlier it was just a pdf. You can just send the report to the supervisor with pass/fail and feedback, and perhaps they can take care of it from there.

1 year ago 1 0 1 0

Never been part of a project like this before - it was a very rewarding+unique experience!

Everyone in the lab contributed different chapters and it was much more exploratory than your average phd project.

My chapter studied R1's reasoning on "image generation/editing" (via ASCII) 🧵👇

1/N

1 year ago 13 2 1 1

I will be giving a talk about this work @SimonsInstitute tomorrow (Apr 2nd 3PM PT). Join us, either in person or virtually.

simons.berkeley.edu/workshops/fu...

1 year ago 6 0 0 0

Introducing the DeepSeek-R1 Thoughtology -- the most comprehensive study of R1 reasoning chains/thoughts ✨. Probably everything you need to know about R1 thoughts. If we missed something, please let us know.

1 year ago 17 4 0 1

A bit of a mess around the conflict of COLM with the ARR (and to a lesser degree ICML) reviews release. We feel this is creating a lot of pressure and uncertainty. So, we are pushing our deadlines:

Abstracts due March 22 AoE (+48hr)
Full papers due March 28 AoE (+24hr)

Plz RT 🙏

1 year ago 37 31 3 2

As someone who has tried to make even basic image editing work in my research (e.g. "move cup to left of table"):
Gemini's new editing capabilities are seriously impressive!

Playing around with it is quite fun...
Edit 1: "edit the image to contain 3 more people"

1 year ago 9 1 3 0

Why do LLMs have a hard time aligning, while humans are better at it? 🌟The answer lies in the lack of a societal alignment framework for LLMs 🌍.

Incredible effort by @karstanczak.bsky.social in pulling views from multiple disciplines and experts in these fields.

arxiv.org/abs/2503.00069

1 year ago 7 0 0 0

How to Get Your LLM to Generate Challenging Problems for Evaluation? 🤔 Check out our CHASE recipe. A highly relevant problem given that most human-curated datasets are crushed within days.

1 year ago 4 2 0 0

Finally it's handy that all my twitter posts got migrated here to bsky:

I'll be presenting AURORA at @neuripsconf.bsky.social on Wednesday!

Come by to discuss text-guided editing (and why imo it is more interesting than image generation), world modeling, evals and vision-and-language reasoning

1 year ago 23 2 1 0

Congratulations @andreasmadsen.bsky.social on successfully defending your PhD ⚔️ 🎉🎉 Grateful to you for stretching my interests into interpretability and engaging me with exciting ideas. Good luck with your mission of building faithfully interpretable models.

1 year ago 9 0 0 0

Stages of #ICLR reviewing:
Stage 1: 😍 I hope I learn something new
Stage 2: 🤗 I hope I am constructive enough while being critical. Submits review
Stage 3: 🤯 Receives 5 page response + revision with many new pages
Stage 4: 😱 Crap, how do I get out of this?
Stage 5: 😵‍💫 What year is it?

1 year ago 17 0 0 0
How to Build Good Language Modeling Benchmarks
Building benchmarks is important because they shine a spotlight on the weaknesses of existing language models and so can guide the community on how to improve them.

I wrote some thoughts on how to build good LM benchmarks: ofir.io/How-to-Build...

1 year ago 76 8 5 6

@sivareddyg.bsky.social Which platforms? Maybe consider @buffer.com

1 year ago 1 1 0 0

Nice! Hello friend. Long time!

1 year ago 1 0 1 0

It's beautiful to start from scratch sometimes 😇

1 year ago 40 2 1 0

Creating a 🦋 starter pack for people working in IR/RAG: go.bsky.app/88ULgwY

I can’t seem to find everyone though, help definitely appreciated to fill this out (DM or comment)!

1 year ago 86 23 32 1

I am a lazy bsky :) or whatever you call it now.

1 year ago 0 0 0 0
Bluesky Social
Social media as it should be. Find your community among millions of users, unleash your creativity, and have some fun again.

I find it unintuitive that user handles have to have bsky.social appended. Can we get rid of it?

1 year ago 8 0 3 0