🚨New paper:
Current reports on AI audits/evals often omit crucial details, and there are huge disparities between the thoroughness of different reports. Even technically rigorous evals can offer little useful insight if reported selectively or obscurely.
Audit cards can help.
Posts by Anka Reuel ➡️ NeurIPS
A recent Stanford paper reveals that many popular AI benchmarks are fundamentally flawed: They can be outdated, easily gamed, or inaccurate. Stanford HAI Graduate Fellow
@ankareuel.bsky.social talks about how researchers are rethinking AI benchmarks: www.emergingtechbrew.com/stories/2025...
Hey Kabir! A lot of it is applicable to different types of evals, especially when it comes to reporting considerations. Would you mind sharing more info on the hackathon here or via DM? Sounds like this would be a cool opportunity to extend the BetterBench work!
Submitting a benchmark to ICML? Check out our NeurIPS Spotlight paper BetterBench! We outline best practices for benchmark design, implementation & reporting to help shift community norms. Be part of the change! 🙌
+ Add your benchmark to our database for visibility: betterbench.stanford.edu
This is such a hard one :D And I think it extends beyond being patient with the students but also being patient with yourself knowing that you won't get everything perfect the first time around (or ever 🥲)
🔄 Sharing is caring! Help us reach as wide an audience as possible by spreading the word. Your support is key to crafting an insightful, community-driven chapter and helping key researchers in the field get their work promoted! Thank you! 🙏 #StanfordHAI #AIIndex x/
The AI Index is an initiative by @stanfordhai.bsky.social. The annual report showcases AI research to enable decision-makers to advance AI responsibly. Previous versions have been cited 300+ times; it's been featured in top media outlets like the @nytimes.com & the @financialtimes.com. 4/
Our chapter will cover fairness & non-discrimination, transparency, explainability, data governance & privacy, security, societal impact, and more. Plus, a special subchapter on responsible AI agents! 🤖 3/
We're seeking impactful AI ethics/safety research from 2024/2025 for inclusion in Stanford's 2025 #AIIndex. Submit your papers or nominate others’ work through our Google Form👇
forms.gle/Hgrzvsi9Yb2B... 2/
📢 Excited to share: I'm again leading the efforts for the Responsible AI chapter for Stanford's 2025 AI Index, curated by @stanfordhai.bsky.social. Like last year, we're asking you to submit your favorite papers on the topic for consideration (including your own!) 🧵 1/
This is all awesome advice, thank you so much for sharing! This is an in-person course but we’ll make all lectures publicly available.
I'm teaching my very first course of my own starting next week (Intro to AI Governance at Stanford). Super proud but also nervous 🥹 Any advice from more seasoned instructors? 😬 #AcademicTwitter #AcademicChatter #TeachingTips #AcademicAdvice
The regular reminder of my starter packs full of amazing folks / accounts to follow. I am trying to keep them up to date but let me know if I missed you.
Thank you, Stefanie! ❤️
As one of the vice chairs of the EU GPAI Code of Practice process, I co-wrote the second draft which just went online – feedback is open until mid-January, please let me know your thoughts, especially on the internal governance section!
digital-strategy.ec.europa.eu/en/library/s...
In our latest brief, Stanford scholars present a novel assessment framework for evaluating the quality of AI benchmarks and share best practices for minimum quality assurance. @ankareuel.bsky.social @chansmi.bsky.social @mlamparth.bsky.social hai.stanford.edu/what-makes-g...
Looking forward to your talk! :)
Thanks a ton, Federico! :)
Thanks so much, Lorena!
Thanks so much, Daniel!
Thanks a lot, Stephan 😊
Thank you Karen 🦋
Thanks so much! And yes, very much looking forward to the weekend 😁🫶
Thanks a lot!
In the same boat as @mlamparth.bsky.social, would appreciate if you could add me, too, please! Thanks so much 😊
In the same boat as @mlamparth.bsky.social, would appreciate if you could add me, too! Thanks so much 😊
Would appreciate if you could add me to the Responsible AI and the Security starter packs, similar to @mlamparth.bsky.social, I’m moving here from X 😊