Key Topics Include:
- Lifecycle Uses & LLM-Driven Generation
- Safety & Robustness
- Privacy, Security & Data Governance
- Fairness, Bias & Representation
- Explainability, Interpretability & Uncertainty
- Standards, Metrics & Tooling for Trustworthy Use
- Critical Perspectives on Synthetic Data
Posts by Shlomi Hod
Foundation models increasingly leverage synthetic data for training while simultaneously generating synthetic datasets for downstream applications.
This workshop centers on the responsible development and use of synthetic data with and for foundation models.
🗓️ Submission Deadline: October 20th, 2025 AoE
Shaping Responsible Synthetic Data in the Era of Foundation Models: A Workshop at AAAI 2026. 27th January 2026, Singapore Expo, Singapore
🚨 Submission deadline is approaching for the Responsible Synthetic Data (RSD) Workshop @ AAAI 2026
📢 The RSD workshop at AAAI 2026 (27th January, 🇸🇬 Singapore) focuses on responsible practices for synthetic data with/for foundation models.
Website: responsible-synthetic-data.github.io
The FORC 2026 call for papers is out! responsiblecomputing.org/forc-2026-ca... Two reviewing cycles with two deadlines: Nov 11 and Feb 17. If you haven't been, FORC is a great venue for theoretical work in "responsible AI" --- fairness, privacy, social choice, CS&Law, explainability, etc.
(Caught this via Terms Watch - a tool I built that monitors ToS changes across major platforms. Daily digest at termswatch.io)
This is a fascinating shift in platform power dynamics. Instead of unilateral AI scraping, we're seeing forced data reciprocity.
Some open source devs may want LLMs to learn their code - it could help users get support.
But this reciprocity clause might have a chilling effect.
Practical example: Google trains Gemini on GitHub code (everyone does - it's the world's largest code repo).
Under the new terms, GitHub could now access Google's public data, like YouTube videos, for its own AI training.
GitHub (owned by Microsoft) just added an "Access Reciprocity" clause to their Terms of Service.
If you're a tech giant (700M+ monthly users) training AI on GitHub's code, GitHub now gets to use YOUR public data too.
Here's what that means 🧵