Ramez Kouzy, MD (@ramezkouzy) Bsky

Hello world 👋
My first paper at UT Austin!

We ask: what happens when medical “evidence” fed into an LLM is wrong? Should your AI stay faithful, or should it play it safe when the evidence is harmful?

We show that frontier LLMs accept counterfactual medical evidence at face value.🧵

3 months ago 14 6 3 2

An overview of our AI-in-the-loop expert study pipeline: given a claim from a subreddit, we extract the PIO elements and retrieve the evidence automatically. The claim, its context, and the evidence are then presented to a medical expert to provide a judgment and a rationale for the factuality of the claim.

Are we fact-checking medical claims the right way? 🩺🤔

Probably not. In our study, even experts struggled to verify Reddit health claims using end-to-end systems.

We show why—and argue fact-checking should be a dialogue, with patients in the loop

arxiv.org/abs/2506.20876

🧵1/

9 months ago 5 2 1 1

Methods and initial result

Results

More results and discussion, including comparison to proper LLM-assisted search.

Appendixes showing UI and some of the measures.

Do chatbots hinder #criticalThinking?

Contrary to fallacious causal conclusions drawn from correlational studies, this experiment found a scripted chatbot increased correct #factChecking solutions compared to unassisted students (N = 156).

doi.org/10.1016/j.ch...

#edu #tech

11 months ago 19 6 1 2

Excited to pilot our human performance augmentation approach with #AR/VR in #brachytherapy. Placing critical info in the physician's field of view eliminates workflow disruptions + is more ergonomic. Grateful for the opportunity to build this solution @mdanderson.bsky.social!

11 months ago 4 0 0 0

How scientific writing lost its soul
Whenever I'm reading papers written many decades ago, I'm immediately struck by how different they feel.

blog.jakobschwichtenberg.com/p/how-scient...

1 year ago 2 0 0 0

Basic backend workflow: data fetching, followed by analysis with a dual GPT-4o mini/GPT-4o combo, using prompt engineering + JSON schema output. Thinking of adding new features soon that would augment trial understanding for busy clinicians.

1 year ago 0 0 0 0

Happy to share one of my side projects: TRAC - Trial Reasoning and Analysis Companion. A web app I developed for augmenting trial understanding and variable visualization with GPT-4o under the hood. #AITools

1 year ago 2 0 1 0

Great summary of our latest work, realized by @hyesunyun.bsky.social 👇🏼 We find that LLMs are susceptible to spin in medical abstracts and can propagate it into plain-language summaries. However, prompting techniques such as CoT can help mitigate that. @jessyjli.bsky.social @byron.bsky.social

1 year ago 1 0 0 0

A new, completely open reasoning model out of China, DeepSeek-R1, is now available. The benchmarks show it at parity with the likes of o1 and Sonnet.

In some informal tests on non-code problems, it is really good, not o1-pro level but surprisingly capable (and incredibly small & fast!). Big advance.

1 year ago 138 23 3 4

Beyond Prediction: Embracing Uncertainty in the Age of AI
How embracing 'So what?' might be our best response to AI uncertainty

We suck at predicting the future of AI, and that's perfectly fine. Maybe the real question isn't 'What will happen?' but 'So what?'

My new post on Substack goes into this more deeply as I personally struggle with how to make sense of all of this.

greypascal.substack.com/p/beyond-pre...

1 year ago 2 1 0 0

The Perfection Trap: Rethinking Our Standards for Artificial Intelligence
Why our quest for flawless AI reveals more about human nature than machine capabilities

Spot on! I would extend this beyond just firms to broadly any task. I bet even in fields like healthcare, people would be surprised by the error rate of humans. This post hits the nail on the head: open.substack.com/pub/greypasc...

1 year ago 0 0 0 0

Ramez Kouzy, Roxanna Attar-Olyaee, Michael K. Rooney, Comron J. Hassanzadeh, Junyi Jessy Li, Osama Mohamad
QuaLLM-Health: An Adaptation of an LLM-Based Framework for Quantitative Data Extraction from Online Health Discussions
https://arxiv.org/abs/2411.17967

1 year ago 3 1 0 0

Embracing Uncertainty: Navigating the Certainty Imperative in Healthcare
Why accepting uncertainty can transform patient care, and how AI might help bridge the communication gap

Happy Thanksgiving! 🦃 I wrote something that's been on my mind for a while about how we approach uncertainty in healthcare—and how AI might help bridge this gap. Check it out here: open.substack.com/pub/greypasc...

1 year ago 3 1 0 0

Cervical cancer mortality in US women younger than 25 years significantly declined between 2016 and 2021, likely due to the widespread adoption of HPV vaccination.

ja.ma/4i9ghPC

1 year ago 2060 613 32 104

The Butterfly Nebula from Hubble

1 year ago 2281 143 21 7

The Perfection Trap: Rethinking Our Standards for Artificial Intelligence
Why our quest for flawless AI reveals more about human nature than machine capabilities

I just published my first Substack piece into the ether.

Why do we demand superhuman performance from AI while normalizing human imperfection?

greypascal.substack.com/p/the-perfec...

1 year ago 4 1 0 1

Love to be added! 🙏🏼

1 year ago 0 0 0 0

Starting a list of oncology-related people. Please tell me more to add, or any similar lists. go.bsky.app/GKXp9Fy @n8pennell.bsky.social

1 year ago 111 45 81 13

🙋🏻‍♂️🙋🏻‍♂️ please

1 year ago 0 0 1 0

#RadOnc starter pack loading! Please let me know if you would like to be added.

go.bsky.app/TbynCkm

#oncsky #bcsm #gyncsm #hncsm

1 year ago 66 20 50 2
