What are the worst ethical disasters in NLP history?
(I'm teaching "ethics of NLP" tomorrow, and history is a good way into this topic.)
Most that come to mind are data breaches/releases (AOL search logs, OKCupid profiles, Finnish therapy records...), but what others are there?
I'll put some other examples in the thread --> 1/n
thank you! though i'm in cs, i've heard their sociology department has some top-tier alums
going through my wisconsin email inbox and it is just like, tornado on tuesday, tornado on friday, faculty meeting next week
you are a resource curator pro!!
I'll be at #ICML2026 on July 10-11 in Seoul to speak at the Workshop on Culture x AI: Evaluating AI as a Cultural Technology.
The workshop is currently accepting submissions, with humanities, ML, HCI, and social/cognitive sciences all welcome.
Submit papers by May 1! Join us in Seoul! 🇰🇷
I remember someone once curated a collection of lab manuals / onboarding docs to guide people who are making their own. Pointers to this?? Or pointers to your own research group's docs? Do your students use/maintain the docs?
As another example, I ran into some issues with inter-annotator agreement and codebook instruction wording, and then I talked to Patrick Sui (English PhD student at McGill) and he basically imploded my mind (in a good way) with insights around surface reading, literary interpretation, etc.
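(Aside, in case "inter-annotator agreement" is unfamiliar: it's often quantified with something like Cohen's kappa. Here's a minimal sketch, assuming two annotators' labels in parallel lists; the labels themselves are made up.)

```python
# Minimal sketch: measuring inter-annotator agreement with Cohen's kappa.
# The two annotators' label lists below are invented for illustration.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["ironic", "literal", "literal", "ironic", "literal"]
annotator_b = ["ironic", "literal", "ironic", "ironic", "literal"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect agreement; ~0 = chance level
```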
😅 For one project, I dove a bit into lexical semantics (e.g. I learned about intersective vs. subsective modifiers: a "red car" is both red and a car, while a "skillful surgeon" is only skillful as a surgeon) when I tried to brainstorm how to get language models to decompose text into manageable bits and pieces for analysis.
i guess there are many other very satisfying things (like watching cats flop in beams of sunshine) but u know what i mean
there is nothing more satisfying than struggling to articulate some problem or concept and finding a pre-2020s paper from another field that does it very well
Would you realize if the book you were reading was written by AI? What if it had been humanized to remove AI-speak?
We find that even without stylistic cues (e.g., word choice or sentence structure), narrative choices alone give AI fiction away!
🛍️Major AI companies are increasingly embedding sponsored content into chatbot conversations.
Across two preregistered experiments (N=2,012), we test how effectively AI can steer consumers toward sponsored products in a realistic shopping scenario.
📝https://arxiv.org/abs/2604.04263
Abstract: Under the banner of progress, products have been uncritically adopted or even imposed on users: in past centuries with tobacco and combustion engines, and in the 21st with social media. For these collective blunders, we now regret our involvement or apathy as scientists, and society struggles to put the genie back in the bottle. Currently, we are similarly entangled with artificial intelligence (AI) technology. For example, software updates are rolled out seamlessly and non-consensually, Microsoft Office is bundled with chatbots, and we, our students, and our employers have had no say, as it is not considered a valid position to reject AI technologies in our teaching and research. This is why in June 2025, we co-authored an Open Letter calling on our employers to reverse and rethink their stance on uncritically adopting AI technologies. In this position piece, we expound on why universities must take their role seriously to a) counter the technology industry’s marketing, hype, and harm; and to b) safeguard higher education, critical thinking, expertise, academic freedom, and scientific integrity. We include pointers to relevant work to further inform our colleagues.
Figure 1. A cartoon set-theoretic view of various terms (see Table 1) used when discussing the superset AI (black outline, hatched background): LLMs are in orange; ANNs are in magenta; generative models are in blue; and finally, chatbots are in green. Where these intersect, the colours combine: e.g. generative adversarial network (GAN) and Boltzmann machine (BM) models are in the purple subset because they are both generative and ANNs. In the case of proprietary closed-source models, e.g. OpenAI’s ChatGPT and Apple’s Siri, we cannot verify their implementation, and so academics can only make educated guesses (cf. Dingemanse 2025). Undefined terms used above: BERT (Devlin et al. 2019); AlexNet (Krizhevsky et al. 2017); A.L.I.C.E. (Wallace 2009); ELIZA (Weizenbaum 1966); Jabberwacky (Twist 2003); linear discriminant analysis (LDA); quadratic discriminant analysis (QDA).
Table 1. Below, some of the typical terminological disarray is untangled. Importantly, none of these terms are orthogonal, nor do they exclusively pick out the types of products we may wish to critique or proscribe.
Protecting the Ecosystem of Human Knowledge: Five Principles
Finally! 🤩 Our position piece: Against the Uncritical Adoption of 'AI' Technologies in Academia:
doi.org/10.5281/zeno...
We unpick the tech industry’s marketing, hype, & harm; and we argue for safeguarding higher education, critical thinking, expertise, academic freedom, & scientific integrity.
1/n
Another one of @ahalterman.bsky.social and @katakeith.bsky.social's papers that I think should be cited more by CSS researchers:
What is a protest anyway? Codebook conceptualization is still a first-order concern in LLM-era classification
arxiv.org/abs/2510.03541
I feel like I should be seeing this paper cited more often by CSS folks!!!
A piece co-authored by an old friend (Divya Saini, a psychiatrist at Massachusetts General Hospital)
www.nytimes.com/2026/03/29/o...
This is super interesting to read because I was very concerned about the water-and-AI thing, so I went to my one scientist friend who literally worked on water regulation/quality/usage and asked her about AI, data centers, water, etc. She was far more concerned about abuses by agriculture than by AI.
A slide showing that the posterior is proportional to the likelihood times the prior
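(In symbols, the relation on the slide is just Bayes' rule; a minimal LaTeX rendering of exactly what the alt text says:)

```latex
% Bayes' rule: the posterior is proportional to the likelihood times the prior.
% p(\theta \mid D): posterior; p(D \mid \theta): likelihood; p(\theta): prior.
\[
  p(\theta \mid D) \;\propto\; p(D \mid \theta)\, p(\theta)
\]
```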
I wrote a blog post on my experience using AI for slide generation
Basic idea: write your lecture notes first, then prompt the LLM to produce corresponding slides in reveal.js (h/t @chenhaotan.bsky.social). I'm picky about my slides but was happy with the results!
alexanderhoyle.com/posts/ai-sli...
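(The post doesn't say which model or API the blog uses; here's a minimal sketch of the notes-first idea, assuming the OpenAI Python client. The model name, prompt wording, and file paths are hypothetical stand-ins.)

```python
# Minimal sketch of the notes-first workflow: write the lecture notes yourself,
# then ask an LLM to turn them into a reveal.js deck. Assumes the OpenAI Python
# client; the model name, prompt, and file paths are hypothetical.
from pathlib import Path

from openai import OpenAI

notes = Path("lecture_notes.md").read_text()

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "Convert these lecture notes into a self-contained reveal.js "
                "HTML deck. One <section> per slide; keep bullets terse."
            ),
        },
        {"role": "user", "content": notes},
    ],
)

Path("slides.html").write_text(response.choices[0].message.content)
```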
Influencers are increasingly finding deepfaked videos of themselves online advertising services they never used.
"Anybody can create an avatar of you, and then anybody can monetize that,” said Prof. Hany Farid to @rollingstone.com.
Excited to share this new paper - accepted at CHI - testing the affordances of AI assistance for literary-cultural interpretation or "close reading" w/ @jiayinzhi.bsky.social @mnlee.bsky.social + @hoytlong.bsky.social. Can AI help students/people interpret cultural objects? arxiv.org/html/2603.06...
I’m seeing close to zero reaction/conversation about this on here. This is huge news for open research on language models, especially in the US.
New paper from @aial.ie! @harshp.com, Dick Blankvoort, Adel Shaaban, @sashamtl.bsky.social & me
We analysed 6 GenAI ToS documents, finding missing info, major power imbalances, & user obligations that are impossible to meet without violating the terms
arxiv.org/abs/2603.18964 & aial.ie/research/ter...
1/
Wow, a big win for AI2 and for public-interest, open-source AI research
Anthropic doing what it does best: hyping this as a large-scale qualitative study.
Qual research is not evaluated by scale but by depth, context, and human interpretation. Stop assuming that “hand-wavey-ness” is a qual problem just because qual work doesn’t look like large-N quantitative analysis.
Scientific American has a piece today on NACLO (the linguistics competition) by @dodecalemma.bsky.social!
Features some quotes and puzzles by yours truly!
Attention NYC undergrads: Applications are open for our 13th annual Data Science Summer School at Microsoft Research NYC! Apply here by April 14th: bit.ly/3pCQENh
For a recent lab meeting, I wrote up a grab bag of ways to think about your development as a researcher during a PhD: emerge-lab.github.io/papers/an-un...
Sharing in case folks find it useful or have feedback!
Digital humanities grad students & faculty always want to offload the data-cleaning drudgery onto some undergrad research assistant, until they eventually come to realize that's the part of the process where the stack of qualitative assumptions comes from.
🥁🥁🥁 Newly out from us today in Science Advances: “Biased AI Writing Assistants Shift Users’ Attitudes on Societal Issues”.
Large Language Models are providing users with autocomplete writing suggestions on many platforms. Could these suggestions shift users’ own attitudes? (spoiler: YES) (1/7)
every quantitative measure is actually a stack of qualitative assumptions in a trenchcoat