@pennengineering.bsky.social @asc.upenn.edu @duncanjwatts.bsky.social @xisong.bsky.social @dhopkins1776.bsky.social @ylelkes.bsky.social @warrencenter.bsky.social
Benchmarks of LLM common sense overwhelmingly rely on correct labels to report an accuracy score. But what if your "ground truth" genuinely differs from mine?
In a new @pnasnexus.org paper, @duncanjwatts.bsky.social, @whiting.me and I explore the implications of this intriguing question.
🧵⤵️
Excited to see our work coming out + @joshnguyen.bsky.social & @duncanjwatts.bsky.social
After establishing a means to study common sense in humans (and finding it rather limited — common sense is not so common) in a prior paper, we wondered if the same challenge faced language models.
It does!
A screenshot of the paper's title and abstract in the journal Management Science. Title of paper: "The Task Space: An Integrative Framework for Team Research." Abstract: Research on teams spans many contexts, but integrating knowledge from heterogeneous sources is challenging because studies typically examine different tasks that cannot be directly compared. Most investigations involve teams working on just one or a handful of tasks, and researchers lack principled ways to quantify how similar or different these tasks are from one another. We address this challenge by introducing the “Task Space,” a multidimensional space in which tasks—and the distances between them—can be represented formally, and use it to create a “Task Map” of 102 crowd-annotated tasks from the published experimental literature. We then demonstrate the Task Space’s utility by performing an integrative experiment that addresses a fundamental question in team research: when do interacting groups outperform individuals? Our experiment samples 20 diverse tasks from the Task Map at three complexity levels and recruits 1,231 participants to work either individually or in groups of three or six (180 experimental conditions). We find striking heterogeneity in group advantage, with groups performing anywhere from three times worse to 60% better than the best individual working alone, depending on the task context. Critically, the Task Space makes this heterogeneity predictable: it significantly outperforms traditional typologies in predicting group advantage on unseen tasks. Our models also reveal theoretically meaningful interactions between task features; for example, group advantage on creative tasks depends on whether the answers are objectively verifiable. We conclude by arguing that the Task Space enables researchers to integrate findings across different experiments, thereby building cumulative knowledge about team performance.
When is it worth it to hire a team, compared to one competent individual?
📢 NEW PAPER (out this month in Management Science!) by me, @mark.whiting.me, @linneagandhi.bsky.social, @duncanjwatts.bsky.social, and @amaatouq.bsky.social! 🧵1/20
Join the International Society for Computational Social Science (ISCSS) for our first-ever AMA (Ask Me Anything) webinar focused on PhD admissions! When: TODAY (November 19) at 3:00 PM ET
Register in advance: umich.zoom.us/webinar/regi...
If you have a large enough monitor, you can find the article about the largest protest in US history just at the bottom of NYT. You better believe the tiny Tea Party protests were right up at the top.
My uncle Darius has been one of the most important people in my life-- constantly inspiring me with his generous spirit. Now his life depends on the generosity of a stranger who might be willing to donate an organ. Please help my family spread the word! nkr.org/XAX634
New piece on why it’s unhelpful to characterise the COVID debate as ‘lockdowns are bad’ vs ‘lockdowns are good’ when it was really about people who claimed that COVID was 100x less fatal than it was + optimal to have a massive epidemic immediately before vaccine… vs those who sense-checked these claims.
It's infuriating how 99% of the coverage of this new war doesn't bother to mention that it is happening because we had an agreement to limit and monitor Iran's nuclear program, which even Trump's own aides said was working, and he blew it up because it was negotiated under Obama.
We are hiring an Engagement & Policy Manager at the Penn Center for Media, Technology & Democracy! Apply+Share!
They will lead an event series, run an annual conference, manage research grants and write on the information ecosystem and its impact on society!
wd1.myworkdaysite.com/en-US/recrui...
The public still wants the rule of law: "81% of U.S. adults say that if a federal court rules that an administration action is illegal, then the administration has to follow its ruling, while 19% say the administration can ignore the ruling and continue its action." www.nbcnews.com/politics/sup...
Orwellian www.nytimes.com/2025/06/16/u...
The head of the department of homeland security announced that she planned to use military force to take over a democratically elected government in the state of California and we are arguing about whether a California Senator properly identified himself at the door.
"Normally, we treat each bullet point as a separate story. But they are all connected. We are witnessing an extraordinarily broad chilling effect in American society."
Look at the length of these lists - this effort to intimidate and silence opposition is a key part of Trump II's authoritarianism.
Re-upping this again as more people read about the Khalil case. More information to come, but nothing in WP and NYT so far contradicts original reporting. Again, it DOES NOT MATTER what you think about him or his cause. Either government is bound by the law for all of us or we're all at their mercy.
The Summer Institutes in Computational Social Science brings together graduate students, postdoctoral researchers & junior faculty for 2 weeks of intensive study & interdisciplinary research. Apply by 3/7: sicss.io/2025/penn/ap...
@xisong.bsky.social
@dhopkins1776.bsky.social
@duncanjwatts.bsky.social
SICSS-Penn application is open! Deadline March 7, 2025. More information can be found ➡️ sicss.io/2025/penn/ap...
@duncanjwatts.bsky.social @dhopkins1776.bsky.social
#computational #socialscience #SICSS
CALL FOR APPLICATIONS: the Summer Institute in Computational Social Science (SICSS) at @Penn will be held for the fourth time this year! We welcome Ph.D. students, post-docs and early-career faculty to Philadelphia from June 30 to July 11, 2025. Website: sicss.io/2025/penn
NYT's The Morning newsletter with the clearest explanation of the crisis we now face - expert ratings of US democracy in our Bright Line Watch survey have already plummeted to levels not seen during Trump's first term
www.nytimes.com/2025/02/07/b...