
Posts by Karandeep Singh

Horizontal bar chart comparing agentic coding reliability across three groups. Frontier models (Claude Sonnet 4.5, Gemini Pro 3.1, GPT 4.1) score 80-100% correct. Four-months-ago's local models (Qwen 3 14B, GPT OSS 20B, Mistral 3.1 24B) all score 0%. Today's local models (Gemma 4 26B-A4B, Qwen 3.5 35B-A3B) both score 90%.


A few months ago, any LLM that I could run on my MacBook scored 0% on an agentic coding eval I put together. This month's Qwen 3.5 and Gemma 4 releases both scored 90%.

On my blog: simonpcouch.com/blog/2026-04...

4 days ago 108 12 4 2

"Pancreatic cancer mRNA vaccine shows lasting results in an early trial: Scientists caution that more research is needed, but nearly all of the patients who responded to the personalized vaccine are still alive six years later."

2 days ago 9701 2890 157 600

In the year 2126…

Reporter: What do you think about deskilling?

Claude: *thinking*

Claude: I don’t see why you would do it. The SKILL.md file takes up minimal storage space—why would you delete it?

2 days ago 3 0 0 0

Aspired: Skill.MDPhD

4 days ago 2 1 0 0

Inspired: Skill.PhD

4 days ago 2 1 1 0

It sounds like Anthropic was right to be concerned.

4 days ago 3 0 0 0

Tired: Skill.MD

Wired: Dr. Skill

4 days ago 6 1 1 0

bsky.app/profile/kdps...

4 days ago 1 0 1 0

Is no one upset about the fact that they never sold birds in the first place?

5 days ago 3 0 0 1

Statisticians: “Data” is plural but “dataset” (a set of multiple data) is singular. Thank you for your attention to this matter.

5 days ago 10 1 0 2

Full house for an amazing panel discussion on Clinical Reasoning in the Era of AI at the CDI2 conference at UCI @kdpsingh.bsky.social

6 days ago 2 3 0 0

If you work with UC Health data or aspire to do so, then it might be worth attending the CDI2 conference at UCI.

na.eventscloud.com/website/5224...

1 week ago 2 1 0 1

Schrödinger’s strait

1 week ago 11 0 0 0

Sales rep: The customer is always right.

People: Wow… that’s such good advice.



AI: The customer is always right.

People: This is a phenomenon unique to AI which we have dubbed “sycophancy.” We have no idea where AI got this from.

1 week ago 3 0 1 0

There's a third party Julia extension for Positron! It makes Julia a peer of R and Python in Positron. open-vsx.org/extension/nt...

1 week ago 5 3 1 1

UC Health CDI2 upcoming conference!

Thriving through change: Adapting, Advancing, Achieving

Stellar list of speakers, the likes of which include @kdpsingh.bsky.social!

Register
👇

na.eventscloud.com/website/5224...

2 months ago 2 2 1 1

Tiny personal news - I am now one of the Lead Maintainers for Model Context Protocol 👋

More work ahead, so if you have any feedback about the protocol, its docs, SDKs, or anything at all - let me know!

1 week ago 57 1 3 1
Screenshot of a Mac desktop tiled with eight windows, each running the same Shiny dashboard inside an Electron app under a different runtime configuration. Every window shows four colored summary cards at the top (Backend/Runtime, R or Python Version, Platform, and Packages) above an Interactive Plot with a slider and scatter chart, plus a Runtime Details panel underneath. The eight variants pair R Shiny and Python Shiny with different packaging strategies, shinylive, system install, bundled runtime, auto-download, and container, running across Darwin/arm64, Linux, and Emscripten/wasm32 architectures, demonstrating that the same Shiny app can be shipped as a desktop application through any of these modes.


Holy (native) Grail update: we cracked open the #Shiny temple and eight desktop apps tumbled out. #rstats, #python, native, containerized, shinylive, you name it. One #Electron shell, every runtime mode we could dream up, all from one R package.

{shinyelectron}, coming soon to a desktop near you.

2 weeks ago 36 2 0 0

〽️

2 weeks ago 0 0 0 0

Health systems may have a blind spot if they aren’t thinking about AI consumer tool usage in caring for patients.
New @ai.nejm.org editorial: When physicians use generative AI without institutional engagement, systems lose visibility and safeguards.

Read ⤵️
ai.nejm.org/doi/full/10....

3 weeks ago 2 2 0 0
Introducing ggauto: automating better charts – Nicola Rennie
The ggauto package is an opinionated ggplot2 extension package that aims to help people make better charts by default. This blog post explains why it exists and how it works.

🎉 ggauto is now on CRAN 🎉

An #RStats package that selects better chart types, and provides more accessible styling for #ggplot2 plots 📊

Blog post explaining why I made it and how it works: nrennie.rbind.io/blog/introdu...

#DataViz

3 weeks ago 194 62 7 2

Watching my kid manually type in references in a middle school essay because using a reference manager isn’t allowed. That is the most cursed thing ever.

3 weeks ago 8 0 4 0
Modern Cyber Defenses: Resilient networks mean healthier patients and uninterrupted care
The Advanced Research Projects Agency for Health (ARPA-H) supports transformative research to drive biomedical and health breakthroughs ranging from molecular to societal to provide transformative hea...

Cyberattacks have halted care and disrupted devices across U.S. clinics. In a new piece, @arpa-h.bsky.social explains why it’s funding real-world cyber protections.

Proud our CHC team built CRASHCART, restoring critical systems fast when ransomware hits.

Read⬇️
arpa-h.gov/news-and-eve...

3 weeks ago 3 2 0 0

Statistician: Hey, want a ride?

ML researcher: Sure.

*** hops in EV ***

*** 5 minutes pass ***

ML: What are you waiting for?

Stats: Finishing up the power calculation.

4 weeks ago 3 0 0 0
Kermit the frog screaming with excitement


We have summer internships y'all! Come work at Posit on the PyData, tidymodels, shiny, or Connect teams: grnh.se/tigz810a3us. You will have an awesome time, learn a ton, and help advance our open source and pro tools 🧰 #rstats #pydata

1 month ago 71 46 2 0

Differences in Differences No Thanks (DIDNT)

4 weeks ago 21 2 1 0
Bayesian Neural Networks in {tidymodels} with {kindling}
Showcasing the versatility of the `{kindling}` R package.

Neural networks that know what they don't know? 🤔

New post with Joshua Marie: fit Bayesian Neural Networks in R with {kindling} + {tidymodels} and get uncertainty estimates out of a familiar workflow.

👉 statsandr.com/blog/bayesian-neural-networks-in-tidymodels-with-kindling/

#rstats #tidymodels

1 month ago 6 1 0 0
Building realistic fake datasets with Pointblank - Posit
With Pointblank, you can generate realistic, synthetic datasets by defining schemas with built-in coherence for geographic, personal, and business data.

New in pointblank: Generate realistic, coherent synthetic datasets directly in #Python. 🛠️📊

Need test data that actually makes sense? Pointblank’s generate_dataset() ensures that names, emails, and locations stay consistent across rows, covering 100+ countries!

Learn more: posit.co/blog/buildin...

1 month ago 10 1 0 0

Especially if it’s a MacBook Pro.

1 month ago 0 0 0 0

According to my pooled analysis, this joke is average [95% CI meh, 🤣]

1 month ago 4 1 0 0