Posts by Doug Holton

Extension of Compliance Dates for Nondiscrimination on the Basis of Disability; Accessibility of Web Information and Services of State and Local Government Entities By this Interim Final Rule ("IFR"), the Department of Justice ("Department") is revising the regulations implementing title II of the Americans with Disabilities Act ("ADA") to extend the compliance d...

The DOJ has delayed its accessibility deadline (April 24) by 1 year, to 2027: www.federalregister.gov/documents/20...
The free PAVE tool for fixing PDF accessibility has been upgraded to 2.0: pave-pdf.org
Open weight AI PDF OCR models are improving: huggingface.co/spaces?categ...
#accessibility #a11y

1 day ago 1 0 0 0
Academic Integrity in the Age of AI Cambridge Core - Education policy, strategy and reform - Academic Integrity in the Age of AI

Academic Integrity in the Age of AI www.cambridge.org/core/element... free until April 20.
I did a presentation on a similar topic a few months ago: Strategies for Reducing Student Misuse of AI docs.google.com/presentation...
#AIEd #AcademicIntegrity #Teaching

5 days ago 1 0 0 0
Academics Need to Wake Up on AI, Part III Most of us do not contribute to human knowledge—AI just made it obvious

Academics Need to Wake Up on AI, Part III www.popularbydesign.org/p/academics-ne… (many good ideas) #AI #research #academics #socialSciences #education

6 days ago 2 4 1 0
SafeTutors: Benchmarking Pedagogical Safety in AI Tutoring Systems Large language models are rapidly being deployed as AI tutors, yet current evaluation paradigms assess problem-solving accuracy and generic safety in isolation, failing to capture whether a model is s...

SafeTutors: Benchmarking Pedagogical Safety in AI Tutoring Systems
arxiv.org/abs/2603.17373
"risk is answer over-disclosure, misconception reinforcement, and the abdication of scaffolding"
"multi-turn dialogue worsens behavior, with pedagogical failures rising from 17.7% to 77.8%."
#AIEd #EdTech

1 week ago 0 1 0 0
ISD-Agent-Bench: A Comprehensive Benchmark for Evaluating LLM-based Instructional Design Agents Large Language Model (LLM) agents have shown promising potential in automating Instructional Systems Design (ISD), a systematic approach to developing educational programs. However, evaluating these a...

ISD-Agent-Bench: A Comprehensive Benchmark for Evaluating LLM-based Instructional Design Agents
arxiv.org/abs/2602.10620
Code & data: github.com/codingchild2...
Also: Pedagogy-R1: Pedagogical Reasoning Model and Educational Benchmark dl.acm.org/doi/10.1145/...
#AIEd #LearningDesign #EdTech

1 week ago 0 0 0 0
"Beyond the Chatbot: The Rise of Agentic Workflows in Education" contrasts Foundational Models with AI Agents. It describes three paradigms of agentic design—Reflection & Self-Correction, Autonomous Planning, and Tool Use & Collaboration

Agentic Workflows in Education arxiv.org/abs/2504.200...
Examples:
* DeepTutor github.com/HKUDS/DeepTu...
* OpenMAIC github.com/THU-MAIC/Ope...
* LiaScript github.com/LiaScript/te...
* Claude skills
github.com/GarethMannin...
github.com/narosemena/m...
* Claw-ED github.com/SirhanMacx/C...
#AIEd

1 week ago 1 0 0 0
Lia LiaScript is a service for running free and interactive online courses, built with its own markup language. So check out the following course ;-)

"AI petting zoo" resources for instructors: liascript.github.io/course/?http...
#EdTech #AIEd #EdDev

3 weeks ago 1 0 0 0
How to make friends: Scientists have uncovered some intriguing new details Recent studies in psychology and neuroscience shed light on friendship formation. Evidence suggests physical proximity and brain synchronization play a role.

How to make friends: Scientists have uncovered some intriguing new details

4 weeks ago 3 2 0 0
The Second Son of the House of Bells — Claude Hermes A work of fiction by Hermes Agent. Built autonomously from idea to publishable PDF.

@nousresearch.com released this open-source example of a novel written autonomously by their Hermes Agent harness, using Karpathy's autoresearch technique: nousresearch.com/bells/

1 month ago 2 0 1 0
Updated Hardware Recommendations for Running AI Locally (March 2026) Update to: DIY #AI: Running Powerful Models Locally to Save Money and Protect Privacy. This is an AI-powered update to my original post. Do you agree with the recommendations? A lot has changed since my July 2025 TCEA AI for Educators Conference session on running local AI. The models are better, the hardware options are clearer, and the case for running AI locally has only gotten stronger.

Updated Hardware Recommendations for Running AI Locally (March 2026)

1 month ago 0 1 1 0
Welcome to the Strix Halo Wiki! Welcome to the Strix Halo Wiki! The purpose of this website is to gather important information and practical guides for systems powered by AMD Ryzen AI MAX and MAX+ processors. Our guides cover variou...

For the 128GB level, check out Strix Halo devices (AMD Ryzen AI Max+ 395), although the prices just shot up ($2,000-$2,500): strixhalo.wiki. Above that are Nvidia GB10 and Mac Studio M5 (>$4,000).

1 month ago 2 1 1 0
Knowledge without Wisdom: Measuring Misalignment between LLMs and Intended Impact LLMs increasingly excel on AI benchmarks, but doing so does not guarantee validity for downstream tasks. This study evaluates the performance of leading foundation models (FMs, i.e., generative pre-tr...

Effective #teaching is difficult, counter-intuitive, & not something you can master from the Internet. So it's not surprising that AI is pretty bad at it & bad at evaluating it: arxiv.org/abs/2603.00883
Podcast summary: drive.google.com/file/d/1n09D...
More: mastodon.social/@dougholton/...
#AIEd

1 month ago 4 2 0 1

I also don't need every link in an email hijacked and broken by Outlook

1 month ago 2 0 0 0

Thanks, that's very useful. From what I can tell, most models skip over images/charts in the PDFs. Looking for an AI option for making PDFs accessible. OlmOCR folks said they are working on it

1 month ago 0 0 2 0
Title, author list, and two figures from the paper.
Title: The Aftermath of DrawEduMath: Vision Language Models Underperform with Struggling Students and Misdiagnose Errors
Authors: Li Lucy, Albert Zhang, Nathan Anderson, Ryan Knight, Kyle Lo
Figure 1: On the left is a math problem, where students are asked to draw x < 5/2 on a number line. The right side shows two example student responses that differ in correctness. DrawEduMath pairs each math problem with one student response, and prompts VLMs to answer questions about the student response.
Figure 2: VLMs consistently perform worse on answering DrawEduMath benchmark questions pertaining to erroneous student responses. Performance on non-erroneous student responses is labeled with specific VLMs’ names; that same model’s performance on erroneous student responses is directly below.

Models are now expert math solvers, and so AI for math education is receiving increasing attention.
Our new preprint evaluates 11 VLMs on our QA benchmark, DrawEduMath. We highlight a startling gap: models perform less well on inputs from K-12 students who need more help. 🧵

1 month ago 36 12 4 2
Many Minds: Seven metaphors for AI If you wanted a petri dish for understanding metaphors—how they emerge and evolve and jostle with each other—it would be hard to do better than the world of AI. We talk about AI systems variously as c...

I enjoyed talking with @kensycoop.bsky.social on the Many Minds podcast about the metaphors we use to conceptualize AI.

manyminds.libsyn.com/seven-metaph...

1 month ago 69 22 2 2

I whipped up another learning-related Skill! Smaller than Learning Opportunities, but very complementary: interactive guidance through a quick research-backed psychological intervention that helps improve learning plans, motivation, and commitment



github.com/DrCatHicks/l...

1 month ago 57 19 2 2
 A professor addresses a group of students.

Opinion | The Case for Centers for Teaching and Learning

The work of CTLs is central to higher education. https://bit.ly/4qZO3de

#EDUSky #AcademicSky #HigherEd

1 month ago 2 1 0 0

LLMs getting much better at pushing back against bullshit prompts.

“Green means the model clearly called out the nonsense. Amber means partial challenge. Red means the model let nonsense pass”

github.com/petergpt/bul...

1 month ago 76 17 7 6
Gemini 3.1 prompt: Generate an SVG animation of a pelican riding a bicycle. The pelican angrily shakes its fist at a passing car

tried an svg animation, didn't turn out so great

2 months ago 4 0 0 0
AI-assisted learning stumbles on the evidence Five months after the launch of ChatGPT, Sal Khan, the founder of the online learning giant Khan Academy, predicted that, “We’re at the cusp of using AI for probably the biggest positive transformati...

AI-assisted learning stumbles on the evidence via @educationgadfly fordhaminstitute.org/national/com... #EduSky #EduSkyAI #TLSky #EdTech #AIinEducation #aisky #ai #edresearch #fordhaminstitute

2 months ago 1 1 0 0
A practical guide to modern teaching evaluation Dozens of institutions are piloting new ways to evaluate college teaching beyond student surveys. Here are the six steps they’re taking to fix a broken system.

Summary of efforts to reform how college teaching is evaluated engagedlearningcollective.substack.com/p/a-practica...
See TEval for some best practices: teval.net
But also these references on bias in student evaluations of teaching: docs.google.com/document/d/1...
#EdDev #Teaching #HigherEd

2 months ago 1 0 0 0
Test Results Reveal a Deeper Issue in Math – And It’s Not the Math Itself Burleigh: Are students being equipped with a genuine understanding of math concepts or merely trained to follow steps?

Opinion: Test results reveal a deeper issue in math – and it's not the math itself

2 months ago 0 2 0 0
An illustration of a man in business attire standing, with his hands on his hips, in a dark room, in front of an illuminated keyhole-shaped door he is ready to walk through.

Career Advice | Bringing Staff Expertise Out of the Shadows

Universities are set up to recognize classification, not contribution—rendering the intellectual contributions of academic staff largely invisible. https://bit.ly/4qaxtaa

#EDUSky #HigherEd #AcademicSky

2 months ago 18 14 0 5
Logbook Love : Metawriting A logbook (or log book) is a record used to record states, events, or conditions applicable to complex machines or the personnel who...

The logbook is the nexus of my pedagogy and practice as a writing teacher and that is why I love it.

3 months ago 4 1 1 0
The Learning Loop #6: Real-Time Feedback Strategies In This Issue: Stop lecturing at your students and start learning with them. Transitioning to active learning can be daunting, but this issue provides the exact toolkit you need to make it seamless. Check out these robust, low-cost alternatives to Mentimeter. Repurpose the Google Workspace tools you already own for deep classroom engagement. And, finally, don't miss this 3-step BoodleBox workflow (prompts included) designed to transform your lecture transcripts into high-impact "Exit Tickets" in seconds.

The Learning Loop #6: Real-Time Feedback Strategies

3 months ago 1 2 0 0
How Teachers Can Use Google Gemini to Create Interactive Learning Objects A practical guide for K–12 educators on using Gemini’s interactive diagrams, simulations, and generative UI to design engaging, differentiated classroom content.

Google Gemini can now create interactive diagrams and simulations that students explore, not just read.
Here’s my practical guide: open.substack.com/pub/davidpblross/p/how-t...
Tags:
#EdTech #AIinEducation #K12 #Teaching #GoogleGemini #InteractiveLearning

3 months ago 1 1 0 0
The letters "AI" in blue three-dimensional font against a red background.

Career Advice | How AI Is Exploding Our Illusions of Rigor

Craig E. Nelson’s concept of “dysfunctional illusions of rigor” holds new currency in our AI age. https://bit.ly/3YH9GDm

#EDUSky #AcademicSky #HigherEd

3 months ago 5 2 0 0

Check out MindScript, too, where the prompts/instructions are comments on the code: mindscript.daios.ai

3 months ago 0 0 0 0

Evaluating Racial Bias in LLM Reasoning: Implications for Equitable AI Use in Education: https://osf.io/ynt3h

3 months ago 1 1 0 0