What happens in SAIL 2025 stays in SAIL 2025 -- except for these anonymized hot takes! Jotted down 17 de-identified quotes on AI and medicine from medical executives, journal editors, and academics in off-the-record discussions in Puerto Rico
irenechen.net/sail2025/
Posts by Irene Chen
We've launched a biweekly AI & Society salon at Berkeley w/ @rajiinio.bsky.social & @irenetrampoline.bsky.social! This week, sociologist @marionf.bsky.social joined EECS's @beenwrekt.bsky.social to discuss The Ordinal Society. Up next, on April 16th: AI & Education. Join us at ai-and-society.github.io
Check out the paywall-free paper link below. I always enjoy working with Anna Zink (new prof starting at Tufts in fall 2025!)
ai.nejm.org/stoken/defau...
AI deployments in health are often understudied because they require time and careful analysis.
We share thoughts in @ai.nejm.org about a recent AI tool for emergency dept triage that: 1) improves wait times and fairness (!), and 2) helps nurses unevenly based on triage ability
Great q! We found that standard ML training (incl rebalancing data, etc) still reflected the differences in EHR data quality. Later we started experimenting with different options such as adding self-report data to the model, but regular ML training alone did not seem to help!
tl;dr Healthcare access disparities cascade through the entire ML pipeline.
Check out our working paper here: arxiv.org/pdf/2412.07712
Finding 3: Here's what helped: Adding patient self-reported data boosted model sensitivity by 11.2% for underserved patients. (Note adding the low/high access info did NOT help)
Finding 2: "Just train a better model" isn't enough. EHR reliability has large effects on balanced accuracy (5.8% drop) and sensitivity (9.4% drop).
Finding 1: For 78% of medical conditions we examined, data quality was worse for patients facing cost or time barriers to care. EHR reliability is defined by comparison against patient self-report
[Bar chart of different barriers to healthcare]
How do disparities in healthcare access affect ML models? We found that low access to care -> worse EHR data quality -> worse ML performance in a dataset of 134k patients. Work with Anna Zink (on the faculty job market rn!) + Hongzhou Luan, presented at #ML4H2024
Very cool! Excited to check it out
It's giving Best Paper at the ML for Health Symposium (co-located w NeurIPS)!! Congrats to co-authors Emily, Jin, and many others. Check out our work using LLMs to understand liver transplants, esp understudied social and economic factors! #ml4h2024
arxiv.org/pdf/2412.07924
This year our CHEN lab holiday party featured cookie decorating! Grateful to have such creative and inspiring students and collaborators. Can you spot all of the ML-related cookies?
jamanetwork.com/journals/jam...
Important caveats: 1) Very small sample size (6 medical cases) -> p=0.03 which is kinda sus, 2) human physicians in study had only 3 yrs of training, 3) no nuance on how to use LLMs for diagnostic reasoning: clinical notes != clean cases; the paper does not engage with this.
What does it mean to be a "low-resourced" language? I've seen definitions ranging from less training data to a low number of speakers. Great to see this important clarifying work at #EMNLP2024 from @hellinanigatu.bsky.social et al
aclanthology.org/2024.emnlp-m...
Informative recap of EMNLP papers related to multilingual models and low resource languages! Thanks @catherinearnett.bsky.social
Fairness definitions differ across groups! For white respondents, fairness = "proximity" to assigned school. For Hispanic or Latino parents, fairness = "same rules" for everyone. Cool work by @nilou.bsky.social + students
Thank you thank you!!
Summary of #AMIA2024 presentations related to health equity and algorithmic fairness! Thanks for pulling together @alyssapradhan.bsky.social
Super interested in this use case! Do you remember who the presenter was here?
If you trained 10 models and they had huge variance in their predictions for you, would you have any faith in the model? Enjoyed this paper defining self-consistency -- and showing that enforcing it makes models more fair! Cool AAAI24 paper from A. Feder Cooper et al.
katelee168.github.io/pdfs/arbitra...
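The idea in the post above can be sketched in a few lines: retrain the same model class several times and check how often the copies agree on each example. This is a minimal illustration of my own (bootstrap resampling, majority-vote agreement, sklearn toy data) -- not the authors' definition or code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy dataset standing in for a real prediction task.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
rng = np.random.default_rng(0)

# Retrain the same model class on 10 bootstrap resamples of the data.
preds = []
for _ in range(10):
    idx = rng.integers(0, len(X), size=len(X))
    model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    preds.append(model.predict(X))
preds = np.array(preds)  # shape: (10 models, 500 examples)

# Per-example self-consistency: fraction of models that agree with the
# majority prediction. 1.0 means all 10 retrained models agree.
majority = (preds.mean(axis=0) >= 0.5).astype(int)
consistency = (preds == majority).mean(axis=0)
print(f"mean self-consistency: {consistency.mean():.3f}")
print(f"fully consistent examples: {(consistency == 1.0).mean():.3f}")
```

Examples with low consistency are the ones whose predictions are effectively arbitrary: which model you happened to train decides their outcome.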
You got this Jessilyn!
12 hours later, I've realized how much I've been missing a place like OldTwitter where you can share candid thoughts on research without bots clogging up the feed. Thanks Bluesky
Giving a talk tomorrow 11:40am PT at the Simons Domain Adaptation Workshop. I'll be speaking about our recent paper on the Data Addition Dilemma! Catch the talk on live-stream or recorded afterwards
Paper: arxiv.org/pdf/2408.04154
Workshop: simons.berkeley.edu/workshops/do...
Creative AIES 2024 paper by @andreawwenyi.bsky.social that uses NLP to help uncover gender bias against men and women defendants. Legal experts used NLP to build consensus and evidence on annotation rules. Could have relevant tie-ins to healthcare and bias in clinical notes
The CHEN Lab doesn't just work on cool ML+health problems! We also enjoy viewing cacti, making pasta, and climbing on chairs
irenechen.net/join-lab/
Kind words from a wise lady!
First post! I'm recruiting PhD students this admissions cycle who want to work on: a) impactful ML methods for healthcare, b) computational methods to improve health equity, or c) AI for women's health or climate health
Apply via UC Berkeley CPH or EECS (AI-H).
irenechen.net/join-lab/
Can't figure out how to DM on bsky, but are you defining bioML as distinct from clinical here?