Enjoying the “Discussions” session (about online discussions — we’re not just having a “discussion”) in #CHI2026
So far —
- conversational voice agents in video calls
- visualizing comment threads
programs.sigchi.org/chi/2026/pro...
Posts by Nick Vincent
At #CHI2026? Check out @sachnishal.bsky.social's talk this Tuesday on how LLM-infused writing tools reshape journalists’ agency — their ability to exercise independent judgment in alignment with their values — in editorial decision making.
programs.sigchi.org/chi/2026/pro...
Could we govern data consent through citizen assemblies? What would that look like? We design the concept of a "consent assembly" in our new #CHI2026 paper with @linkyi.bsky.social, Paul Goelz, @robin.berjon.com 🧵
#AIGovernance #DataGovernance #AIPolicy #GDPR #AIAct #FAccT2026 #DigitalOmnibus
Going to #CHI2026 and going to try to engage with and boost CHI-related stuff here (and on LinkedIn too, as it seems like there's a lot of activity there)! Hoping to support a positive social media experience around conference-ing, as this was a big part of my positive experience at my first CHI!
Excited to be heading to Barcelona for #CHI2026 to host our workshop PoliSim: LLM Agent Simulation for Policy!
This year, we’ve seen incredible interest from researchers across HCI, NLP, CSS, and Policy. We accepted 25 outstanding papers, with 5 selected as Best Paper nominees.
So in practice, it's not a *complete* mystery, and a best case scenario would be a combination of retroactive dataset documentation + some attempt at retroactive attestation (but incomplete) + strong expectations for attestation going forward
more detailed datasheets, and at least aiming for some kind of "coarse attestation"
My read is that right now, people are assuming that pretraining for frontier models is pretty similar to open models like Olmo, just with more resources at each stage: filtering, fancy synthetic augmentation, etc.
Oh yes definitely agree this will be a sticking point! My view is that we'll probably need to accept some kind of partial "declaration of bankruptcy" re: "first gen data". If attestation / data details become important signals, we could imagine AI developers at least releasing...
Now posting blogs to Leaflet and then cross posting to substack:
- dataleverage.leaflet.pub/3mizn5hsjg5vo
- dataleverage.substack.com/p/attestatio...
New blog post: a more explicit vision of an “attestation-forward” data policy strategy for AI.
With the right approach to attestations, I think we can get a win-win-win-win (for: consumers of AI products, auditors, AI developers worried about distillation, information quality):
I think very funnily enough at the end of a long conference day my brain did a very similar merge on the interpretation side and parsed it as such (i.e. I assumed you were talking about the podcast with Josh 😂)
Some thoughts on a new policy paper from OpenAI:
dataleverage.substack.com/p/people-fir...
Lots of good stuff, it's great to foster more discussion of these kinds of ideas, and I think this particular set of ideas has the potential to really enable data flow as a powerful accountability lever!
I totally mixed up the threading from my phone, so went with the delete and repost strategy (but didn't repost yet 😂!). In fact, getting meta with the conference, I think I'll try to do a better-organized recap using one of the longer-form atproto tools and link to there!
Thread for some misc thoughts / pins from @atproto.science talks and sessions:
- big gap in demos that give potential users (e.g. scientists who used to post on other platforms) a “wow” moment for features that atproto enables (I had this recently with semble + margin interop)
At UBC today for @atproto.science workshop! Much to discuss.
We have an exciting panel tomorrow @11am with @nickmvincent.bsky.social, Laure Haak (@verime.coop), and Ellie DeSota (@metagov.bsky.social, SciOS)! The panel will explore how governance & sustainability challenges facing the broader atproto ecosystem are mirrored in its open science applications >
Also, a meta blog (very much inspired by teaching grad research communication this term and trying to answer very "dynamic questions") about the value of using coding agents to iterate on fancy sites vs. just sharing simple bullet points, plain text posts and spreadsheets
Will likely have another blog with more thoughts on the emerging schemes for data licensing / preference signaling (and a particular angle that I'm excited about, and think will actually happen: markets for attested evals and attested training data)
I'm planning to post a blog explaining the purposes of the data counterfactuals site (aimed at highlighting connections between technically-focused and socially-driven data-centric work)
Longer post(s) coming on these topics, but for those interested in "data counterfactuals" and various "data licenses" proposals I have made some big updates to two static site resources: datacounterfactuals.org and datalicenses.org
I greatly enjoyed this conversation, and I think it might be interesting to a wide range of folks interested in AI, data, collective action, etc! Thanks for having me @tbsocialist.bsky.social (and structuring a really exciting conversation, and putting together such a nice final product!)
🚨Collective action strategies in the age of AI w/ Nick Vincent
I spoke to @nickmvincent, AI researcher and author of the Data Leverage substack, about how AI systems are built on the collective output of humanity's digital labor and what we can do about it.
FULL EPISODE⬇️
🧑‍💻 New paper at #chi2026 w/ @lorenzspreen.eurosky.social and @stefanherzog.bsky.social
Are you worried about how social media algorithms affect people’s beliefs? We are, so we tested engagement-based ranking algorithms against alternatives in a pre-reg’d collaborative filtering experiment... 🧵
One q that’s come up: curious if you / the team have early thoughts on the best way to handle bibliographic info. E.g., say I want to make a static site with a “bib section”. I want to host the content “dynamically” in a Semble collection. Should I make cards that just link to Semantic Scholar or similar?
Thoughts coming soon! So far: v excited about the interop and it was really an atproto-clicks-for-me moment (when my prev collection auto-populated, very exciting). Trying out github.com/treethought/... as my local editor so I can sync/unify collection and tag taxonomies across margin & semble
Two natural allies of a "Data Transparency" agenda: capabilities forecasters and social simulators
---
dataleverage.substack.com/p/two-natura...
(Short post reacting to recent discourse around AI forecasting + social simulation; both stand to independently benefit from "Clear Data Rules")
Pretty excited about trying to move more of my info consumption through semble.so, margin.at, and www.graze.social
Seems like there's some built-in integration between @semble.so and margin -- very interested in figuring out a flow here (and connecting it further to local file workflows)