The main results are somewhat expected. Results from papers that were not written to be reproduced don’t always reproduce from a single sentence in an abstract. Paul E. Meehl and all of that. However, it does not matter that much, as 74% of re-analysts reached the same conclusion. Science is doing OK.
Posts by Vsevolod Suschevskiy
It is not clear how to organize the code base, and given the size, I am not sure if publishing all the supplemental materials makes it open and accessible. I hope they write a paper about the process itself.
The collaborative process itself remains one of the most interesting parts of this project, along with how the Multi100 team tried to provide opportunities to contribute at most stages. There are no good tools or techniques for having 494 opinionated researchers write in the same document.
I like how @briannosek.bsky.social put it in the second-to-last email I will ever get from him: “Compared with the entire SCORE project, your role may feel small. However, SCORE grew so large only because each of us contributed our part.”
A few months later, the draft was submitted to Nature, and a few months after that, it received extremely positive reviews. I don’t think I will ever see “<...> It should be published. A few thoughts for the authors to consider: <...>” regarding my own work.
Technically, I could have contributed to a spreadsheet with comments, but there were already over a thousand comments ranging from typos and punctuation marks to disagreements with the conceptual framing. I decided not to add to the core team's suffering.
Almost a year later, I got a new email: Update Before Manuscript Submission. The Multi100 team had drafted a Nature submission, and all I had to do was check my affiliation.
I ran some version of ANOVA, submitted the p-value, Cohen’s d, and degrees of freedom to a Google form, and forgot about the whole interaction.
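For anyone curious what those three numbers involve, here is a minimal sketch for a two-group, one-way design. The data are invented for illustration and are not from the Multi100 project, and the function names `cohens_d` and `one_way_anova` are hypothetical helpers, not the analysis that was actually submitted.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented amounts shared in a dictator-game-like task (assumption, not real data).
in_group = [5, 6, 7, 8, 9]
out_group = [3, 4, 5, 6, 7]

d = cohens_d(in_group, out_group)
f_stat, df1, df2 = one_way_anova(in_group, out_group)
print(f"d = {d:.3f}, F({df1}, {df2}) = {f_stat:.3f}")
```

The p-value is left out because it needs the F distribution's survival function, which is not in the standard library; with SciPy installed, `scipy.stats.f.sf(f_stat, df1, df2)` would supply it.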
The paper I was working with made a claim about attitudes towards an out-group after a war, but they measured it with an amount of money in a variation of the dictator game. While the analysis was fine, the generalization seemed wild to me.
After some time, I got an email saying that one of the main analysts had dropped out, and now I had a chance to look at a sentence from an abstract, load a CSV file, and try to get the same number on short notice.
I joined late, so the project had already run out of money to pay for the analysis, and the spreadsheet had run out of slots. I only put myself down as backup re-analyst 2 for a paper that was not particularly interesting.
I subscribe to a hundred mailing lists related to social science, and one day, one of them featured an ad for doing statistical analysis for money. I quickly replied, along with over 500 other researchers, and we were manually added to a spreadsheet to indicate our expertise and preferences.
Fifty percent of social and behavioural science findings are not robust to a multi-analyst approach. Kind of. This is also my first Nature paper as a co-author. Kind of.
www.nature.com/articles/s41...
Tsurushima, A., Miyano, S. Resilience in Evacuation Guidance: A Cognitive Agent Approach to System Failures. SN COMPUT. SCI. 7, 257 (2026). doi.org/10.1007/s429...
The authors test a new algorithm that is empirically calibrated on humans in a VR experiment. All of that sits in a bounded rationality framework, even though they call it trust degradation.
Evacuation guidance, multi-agent simulations, and VR experiments are not that interesting to me individually, but taken together they make a cool paper that is more interesting than its individual parts.
doi.org/10.1007/s429...
1:16:35 Who the liars are and what we need to do about them. The whole video is basically a political statement about the connection between technology and partisan politics. A good one.
Thanks to the co-authors.
Wong, J., Khalil, M., Suschevskiy, V. et al. Student engagement profiles in a mobile app: Links to self-regulated learning and performance. Education Tech Research Dev (2026). doi.org/10.1007/s114...
A composite statistical visualization comparing three student profiles—Disengaged (green), Utilitarian (orange), and Active (blue)—across ten engagement indicators. The chart utilizes box plots overlaid with jittered raw data points to show the distribution of 193 observations. The Y-axis represents a normalized scale (POMS) from 0.00 to 1.00. The X-axis displays ten variables: Attempts, Successful attempts, Unique days, Consecutive days, Use of compete/battle modes, Early start, Late finish, Maximum time, Mean time, and Minimum time. Key trends show that Active learners generally display higher median values and wider distributions in volume-based metrics (like Attempts and Unique days) compared to Disengaged learners. Three annotated arrows provide context for specific data points: one highlighting a maximum of 4090 attempts, another noting 14% of attempts in gamified modes, and a third defining "Late finish" as 2 days before the exam.
I created my most complicated visualization to show 193 observations across ten continuous and categorical axes. It is messy, but I like my arrows and Epanechnikov’s kernel.
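A minimal sketch of the two ingredients named here, assuming POMS means proportion of maximum scaling, (x - min) / (max - min), and that the Epanechnikov kernel is used for a plain kernel density estimate. Only the maximum of 4090 attempts comes from the figure description; the other values, the bandwidth, and the function names are invented for illustration.

```python
def poms(values):
    """Rescale values to [0, 1] as (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable has no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0


def kde(x, sample, bandwidth):
    """Kernel density estimate at x with an Epanechnikov kernel."""
    total = sum(epanechnikov((x - s) / bandwidth) for s in sample)
    return total / (len(sample) * bandwidth)


# Hypothetical attempt counts; only the maximum (4090) is taken from the figure.
attempts = [0, 120, 850, 4090]
scaled = poms(attempts)

# Density of the rescaled values at the midpoint of the axis (bandwidth is an assumption).
density_mid = kde(0.5, scaled, bandwidth=0.25)
```

Putting every indicator on the same 0 to 1 scale is what lets ten variables with very different units share one Y-axis.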
That was the longest road I have taken so far. Five years have passed since I first saw the data, and even longer since it was collected. Conceptually, I have moved from a classical "squeeze the maximum out of the data" mindset to a more transparent post-positivism. The data was still tortured.
The bottom line is that practice is king. We found that students used the study app in very different ways: some consistently throughout the class, others only to cram before the exam, and some barely or not at all. Those who used the app for self-regulated learning outperformed those who didn’t.
Hi. I have a poster about a framework to design digital experiments with open science practices. If you do experiments, let's talk. #ic2s2
Hi @dncwn.bsky.social, could you please delete the link to the poster awards poll? We were trying to limit voting to in-person participants only.
But funding...
How opinion dynamics scholars see humans when they use the percolation model. #IC2S2 poster session day 2.
Seat selection at #IC2S2 De Geerhallen during keynotes is not normal. Bimodal, perhaps? Talks are uniformly excellent and properly powered
Happy to be at @ic2s2.bsky.social
So excited to see amazing keynote speakers, given how hard it is to travel to Norrköping for #IC2S2
James Evans discussed Generative Social Science with Large Models at @iasliu.bsky.social. Culture as a multidimensional bias and how (computational) social science changes from hypotheses to hunches: from precise models to a feeling about how X influences Y.
I just completed the SICSS-IAS 2025 digital traces lab hosted by @iasliu.bsky.social. I made this tweet using the bskyr library in R. Kudos to @chriskenny.bsky.social for creating and maintaining it! I will leave all hashtags here: #SICSS #SICSS-IAS #IAS.
PhD student -> PhD candidate