We couldn't have done this without amazing authors: Shai Satran, Will Kidder, Jason D'Cruz, @krvarshney.bsky.social, Sean Laurent, Sooyun Iris Chung, Ariel Goldstein, @gabistanovsky.bsky.social, Austin Beattie, @andyhigh.bsky.social, @mohammadatari.bsky.social, @firatseker.bsky.social, Aliah Zewail. 5/n
Posts by Kush Varshney कुश वार्ष्णेय
The latest Stanford University Foundation Model Transparency Index was released today, and IBM took the top spot!
In a year when other major AI players retreated from transparency, we doubled down and received the highest score in the Index’s history:
research.ibm.com/blog/ibm-gra...
"When language no longer requires belief, AI’s fluency becomes a kind of anesthesia. And we are the ones it sedates. I’m reminded of T. S. Eliot’s ghostly image of a “patient etherized upon a table,” alive yet emptied of agency." www.psychologytoday.com/us/blog/the-...
Bar chart titled "Retrieval Augmented Generation (RAG)", showing MTRAG mean accuracy (0–80 scale) by model, Granite models in blue and others in green:
- Granite-4.0-H-Small: 73 (highest)
- Granite-4.0-Micro: 72
- GPT-OSS-20B: 68
- Llama-3.3-70B-Instruct: 61
- Qwen3-8B: 55
- Llama-3.2-Instruct: 53
- Mistral-Small-3.2-Instruct: 48 (lowest)

Key takeaway: the Granite-4.0 models (H-Small and Micro) outperform all others at ~73 accuracy, with GPT-OSS-20B third at 68. The weakest performance is from Mistral-Small-3.2-Instruct at 48.
Granite-4.0-H-Small: a 32B-A9B MoE Mamba for high efficiency
Damn! IBM is on the map. The American Qwen? I barely even knew IBM made LLMs, this is solid
www.ibm.com/new/announce...
Recently got to have a super interesting conversation with the infinitely fascinating @krvarshney.bsky.social about why we need to make AI safe, and the very nature of ethics in a disaggregated digital world. Have a watch!
www.youtube.com/watch?v=g2A7...
Check out IBM's latest open source tools for trustworthy AI on GitHub:
In-Context Explainability 360
FactReasoner
Contextual Privacy
Links from here: research.ibm.com/blog/debuggi...
"In my own interactions with ChatGPT, it has often responded, with patently insincere flattery: “That’s a great question.” It has never responded: “That’s the wrong question.” It has never challenged my moral convictions or asked me to justify myself."
www.nytimes.com/2025/08/02/o...
"Until we recognise that the debate about AI is not just about what machines can do but also about how humans should value education and knowledge, it will remain mired in confusion." observer.co.uk/news/opinion...
"The true measure of progress in AI lies not in the sophistication of algorithms but in whether it genuinely serves the people and communities it seeks to empower. Without grounding in human dignity and local contexts, AI risks creating technological subjugation."
www.brookings.edu/articles/ai-...
What do authorship, copyright, and creativity mean in the age of AI? @krvarshney.bsky.social talks to us about it:
research.ibm.com/blog/kush-va...
"Training yourself to observe and challenge these automatic thoughts—what psychologists call metacognition—is strikingly similar to the Buddhist concept of yoniso manasikāra, or wise attention." www.forbes.com/councils/for...
"The next decade will be shaped by innovators using AI to solve real problems in real communities. The future won’t be written in Silicon Valley, but in Lagos, Jakarta, Cairo and Dubai. AI-powered solutions fused with local knowledge will unlock this future." www.weforum.org/stories/2025...
Weike Zhao, Chaoyi Wu, Yanjie Fan, Xiaoman Zhang, Pengcheng Qiu, Yuze Sun, Xiao Zhou, Yanfeng Wang, Ya Zhang, Yongguo Yu, Kun Sun, Weidi Xie
An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
https://arxiv.org/abs/2506.20430
📣 Today we open-sourced EvalAssist, a web-based tool that makes it super easy to develop criteria for LLM judges. You can run it locally now and then scale up with notebooks using Unitxt. Check out the AI Alliance article to get the scoop:
thealliance.ai/blog/llm-as-...
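To make the LLM-as-judge idea above concrete, here is a minimal sketch of evaluating a response against a named criterion. The `Criterion` structure and `judge` function are illustrative stand-ins, not the actual EvalAssist or Unitxt API; a real judge call to a model replaces the toy heuristic.

```python
# Hypothetical sketch of the LLM-as-judge pattern: a criterion maps
# option labels to scores, and a judge picks one option per response.
# Names below are assumptions for illustration, not the EvalAssist API.
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    description: str
    options: dict  # option label -> numeric score


def judge(response: str, criterion: Criterion, llm=None) -> str:
    """Return the option label a judge selects for this response.
    `llm` would be a real model call; a keyword heuristic stands in here."""
    if llm is not None:
        return llm(criterion.description, response)
    # Toy deterministic stand-in so the sketch runs without a model
    return "concise" if len(response.split()) <= 20 else "verbose"


conciseness = Criterion(
    name="conciseness",
    description="Is the response direct and to the point?",
    options={"concise": 1.0, "verbose": 0.0},
)

label = judge("The capital of France is Paris.", conciseness)
score = conciseness.options[label]
```

Keeping criteria as data (rather than hard-coding them into prompts) is what lets the same judging loop scale from a local UI to batch evaluation in notebooks.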
🚨 Announcing our #keynote speakers for the 3rd Trustworthy AI #Workshop @deeplearningindaba.bsky.social ! We are excited to welcome thought leaders pushing the boundaries of #ResponsibleAI
@krvarshney.bsky.social is a Fellow at IBM Research
Djallel Bouneffouf, Matthew Riemer, Kush Varshney: The Ultimate Test of Superintelligent AI Agents: Can an AI Balance Care and Control in Asymmetric Relationships? https://arxiv.org/abs/2506.01813 https://arxiv.org/pdf/2506.01813 https://arxiv.org/html/2506.01813
Announcing our keynote speakers for #FAccT2025! 🎉
Suresh Venkatasubramanian (Brown)
Nathalie Smuha (KU Leuven)
Kristian Lum (Google DeepMind)
Molly Crockett (Princeton)
And the plenary panel will be on “Pathways of Change and the Future of Responsible AI"
Frying gulab jamuns helps you understand the phenomenon of tidal locking between moons and planets.
🔗 Want to connect your agents together wherever they are🌎?
See what's possible with ACP! This video will show:
🎁 How to wrap an agent with the SDK
🔈 Calling out with a standardized client
⛓️Chaining ACP calls to different agents
📲 Prototype of ACPCallingAgent
👉 www.youtube.com/watch?v=Nzaq...
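The chaining step above can be sketched in a few lines. `AgentClient` below is a hypothetical stand-in for a standardized client, not the real ACP SDK; a real implementation would call each agent's endpoint over the network rather than a local callable.

```python
# Hypothetical sketch of chaining calls to different agents behind one
# standardized client interface, in the spirit of the ACP demo above.
class AgentClient:
    """Illustrative stand-in for an ACP-style client (not the real SDK)."""

    def __init__(self, name, handler):
        self.name = name
        self.handler = handler  # callable standing in for a remote agent

    def run(self, message: str) -> str:
        # A real client would send the message to the agent's endpoint here
        return self.handler(message)


# Two toy "agents" wrapped behind the same client interface
summarizer = AgentClient("summarizer", lambda m: m.split(".")[0] + ".")
shouter = AgentClient("shouter", lambda m: m.upper())

# Chaining: the output of one agent becomes the input of the next
result = shouter.run(summarizer.run("ACP connects agents. It is open."))
```

The point of a standard protocol is exactly this: once every agent speaks the same client interface, composing them is ordinary function chaining regardless of where each agent runs.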
Happy to see @bhoov.bsky.social recognized in this article about spin glasses and associative memory.
www.quantamagazine.org/the-strange-...
🤖 ✏️ There is a better way to explain how you used AI in your {research paper, college essay, blog posts, …}. Check out our new AI Attribution Toolkit and look for us at #CHI2025!
aiattribution.github.io
dl.acm.org/doi/full/10....
I appreciated the framing in terms of governors (research.ibm.com/blog/AI-gove...) and the discussion of many strategies for pursuing safety (doi.org/10.1089/big....). Now that we're moving to agentic AI, I think systems theories will be even more important for control (arxiv.org/abs/2503.00237).
“If we think about how human beings are in the world, we do see bad things, so it’s not about allowing the language model to see only the good things. It’s about understanding the full spectrum — both good and bad,” says Ko, “and choosing to uphold our values when we speak.”
news.mit.edu/2025/trainin...
LLMs need not engage in a coloniality of knowledge by treating one culture's ethics or moral philosophy as universally correct. Instead, open LLMs should be aligned to value systems from different epistemologies and not assume universal values. 🌍🤖 #ai #hcai #alignment