Posts by

@pennengineering.bsky.social @asc.upenn.edu @duncanjwatts.bsky.social @xisong.bsky.social @dhopkins1776.bsky.social @ylelkes.bsky.social @warrencenter.bsky.social

2 months ago 1 1 0 0

Benchmarks of LLM common sense overwhelmingly rely on correct labels to report an accuracy score. But what if your "ground truth" genuinely differs from mine?

In a new @pnasnexus.org paper, @duncanjwatts.bsky.social, @whiting.me and I explore the implications of this intriguing question.

🧵⤵️

2 months ago 8 3 1 1

Excited to see our work coming out + @joshnguyen.bsky.social & @duncanjwatts.bsky.social

After establishing a means to study common sense in humans (and finding it rather limited — common sense is not so common) in a prior paper, we wondered if the same challenge faced language models.

It does!

2 months ago 5 1 0 0
A screenshot of the paper's title and abstract in the journal Management Science. Title of paper: "The Task Space: An Integrative Framework for Team Research." Abstract: Research on teams spans many contexts, but integrating knowledge from heterogeneous sources is challenging because studies typically examine different tasks that cannot be directly compared. Most investigations involve teams working on just one or a handful of tasks, and researchers lack principled ways to quantify how similar or different these tasks are from one another. We address this challenge by introducing the “Task Space,” a multidimensional space in which tasks—and the distances between them—can be represented formally, and use it to create a “Task Map” of 102 crowd-annotated tasks from the published experimental literature. We then demonstrate the Task Space’s utility by performing an integrative experiment that addresses a fundamental question in team research: when do interacting groups outperform individuals? Our experiment samples 20 diverse tasks from the Task Map at three complexity levels and recruits 1,231 participants to work either individually or in groups of three or six (180 experimental conditions). We find striking heterogeneity in group advantage, with groups performing anywhere from three times worse to 60% better than the best individual working alone, depending on the task context. Critically, the Task Space makes this heterogeneity predictable: it significantly outperforms traditional typologies in predicting group advantage on unseen tasks. Our models also reveal theoretically meaningful interactions between task features; for example, group advantage on creative tasks depends on whether the answers are objectively verifiable. We conclude by arguing that the Task Space enables researchers to integrate findings across different experiments, thereby building cumulative knowledge about team performance.

When is it worth it to hire a team, compared to one competent individual?

📢 NEW PAPER (out this month in Management Science!) by me, @mark.whiting.me, @linneagandhi.bsky.social , @duncanjwatts.bsky.social, and @amaatouq.bsky.social! 🧵1/20

3 weeks ago 4 1 1 0

Join the International Society for Computational Social Science (ISCSS) for our first-ever AMA (Ask Me Anything) webinar focused on PhD admissions! When: TODAY (November 19) at 3:00 PM ET
Register in advance: umich.zoom.us/webinar/regi...

5 months ago 2 0 0 0

If you have a large enough monitor, you can find the article about the largest protest in US history just at the bottom of NYT. You better believe the tiny Tea Party protests were right up at the top.

6 months ago 101 20 6 1
Darius Jonathan Needs a Kidney | Can You Help?

My uncle Darius has been one of the most important people in my life, constantly inspiring me with his generous spirit. Now his life depends on the generosity of a stranger who might be willing to donate an organ. Please help my family spread the word! nkr.org/XAX634

9 months ago 2 2 0 0
Restrictions had unequal impacts, but the pandemic itself did too. - Boston Review Adam Kucharski responds to Stephen Macedo and Frances Lee.

New piece on why it’s unhelpful to characterise COVID debate as ‘lockdowns are bad’ vs ‘lockdowns are good’ when it was really about people who claimed that COVID was 100x less fatal than it was + optimal to have a massive epidemic immediately before vaccine… vs those who sense checked these claims.

10 months ago 638 191 22 23

It's infuriating how 99% of the coverage of this new war doesn't bother to mention that it is happening because we had an agreement to limit and monitor Iran's nuclear program, which even Trump's own aides said was working, and he blew it up because it was negotiated under Obama.

10 months ago 1680 727 41 34
Engagement and Policy Manager - Penn Center for Media, Technology, and Democracy. University Overview: The University of Pennsylvania, the largest private employer in Philadelphia, is a world-renowned leader in education, research, and innovation. This historic, Ivy League school co...

We are hiring an Engagement & Policy Manager at the Penn Center for Media, Technology & Democracy! Apply+Share!

They will lead an event series, run an annual conference, manage research grants and write on the information ecosystem and its impact on society!

wd1.myworkdaysite.com/en-US/recrui...

10 months ago 3 3 0 2

The public still wants the rule of law: "81% of U.S. adults say that if a federal court rules that an administration action is illegal, then the administration has to follow its ruling, while 19% say the administration can ignore the ruling and continue its action." www.nbcnews.com/politics/sup...

10 months ago 117 35 4 4

Orwellian www.nytimes.com/2025/06/16/u...

10 months ago 2039 549 56 43

The head of the department of homeland security announced that she planned to use military force to take over a democratically elected government in the state of California and we are arguing about whether a California Senator properly identified himself at the door.

10 months ago 7201 2317 99 84

"Normally, we treat each bullet point as a separate story. But they are all connected. We are witnessing an extraordinarily broad chilling effect in American society."

Look at the length of these lists - this effort to intimidate and silence opposition is a key part of Trump II's authoritarianism.

1 year ago 930 347 29 8

Re-upping this again as more people read about the Khalil case. More information to come, but nothing in WP and NYT so far contradicts original reporting. Again, it DOES NOT MATTER what you think about him or his cause. Either government is bound by the law for all of us or we're all at their mercy.

1 year ago 1721 487 14 10

The Summer Institutes in Computational Social Science brings together graduate students, postdoctoral researchers & junior faculty for 2 weeks of intensive study & interdisciplinary research. Apply by 3/7: sicss.io/2025/penn/ap...
@xisong.bsky.social
@dhopkins1776.bsky.social
@duncanjwatts.bsky.social

1 year ago 5 3 0 0

SICSS-Penn application is open! Deadline March 7, 2025. More information can be found ➡️ sicss.io/2025/penn/ap...
@duncanjwatts.bsky.social @dhopkins1776.bsky.social
#computational #socialscience #SICSS

1 year ago 7 2 1 3
Summer Institute in Computational Social Science

CALL FOR APPLICATIONS: the Summer Institute in Computational Social Science (SICSS) at @Penn will be held for the fourth time this year! We welcome Ph.D. students, post-docs and early-career faculty to Philadelphia from June 30 to July 11, 2025. Website: sicss.io/2025/penn

1 year ago 4 1 0 1

NYT's The Morning newsletter with the clearest explanation of the crisis we now face - expert ratings of US democracy in our Bright Line Watch survey have already plummeted to levels not seen during Trump's first term
www.nytimes.com/2025/02/07/b...

1 year ago 497 214 29 13