Posts by Alexander Loth


Why do large learning rates and unstable training often generalize better? This paper models optimizers as chaotic dynamical systems, introduces a “sharpness dimension,” and shows that generalization depends on the full Hessian structure, not on simple norms.

http://arxiv.org/abs/2604.19740v1
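
For intuition on what “sharpness” usually probes, here is a minimal sketch, assuming a toy quadratic loss: the top Hessian eigenvalue estimated via finite-difference Hessian-vector products and power iteration. The paper’s sharpness dimension is a richer object; this shows only the standard single-number version it argues is insufficient.

```python
# Minimal sketch: top Hessian eigenvalue ("sharpness") of a toy quadratic
# loss, via finite-difference Hessian-vector products + power iteration.
import numpy as np

def loss_grad(w, A, b):
    # Gradient of the toy loss L(w) = 0.5 * w^T A w - b^T w.
    return A @ w - b

def hvp(w, v, A, b, eps=1e-4):
    # Finite-difference Hessian-vector product (exact for a quadratic).
    return (loss_grad(w + eps * v, A, b) - loss_grad(w - eps * v, A, b)) / (2 * eps)

def top_eigenvalue(w, A, b, iters=100):
    # Power iteration on the Hessian using only HVPs.
    v = np.random.default_rng(0).normal(size=w.shape)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        hv = hvp(w, v, A, b)
        v = hv / np.linalg.norm(hv)
    return v @ hvp(w, v, A, b)  # Rayleigh quotient at convergence

A = np.diag([10.0, 1.0, 0.1])   # anisotropic curvature
w, b = np.zeros(3), np.ones(3)
print(top_eigenvalue(w, A, b))  # ~10.0, the sharpest direction
```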

15 hours ago

📊 CRED-1 April Update

2,673 domains tracked for credibility scoring
22 domains rescored this month across 3 automated pipeline runs

Dataset: github.com/aloth/cred-1
Paper: arxiv.org/abs/2503.02647

#CredibilityScoring #Misinformation #OpenData

1 day ago

Prism introduces the first symbolic superoptimizer for tensor programs. By searching over symbolic families of programs (not just concrete ones), it prunes provably suboptimal designs and achieves up to 4.9× speedups on LLM workloads.

http://arxiv.org/abs/2604.15272v1
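
To make “searching over symbolic families” concrete, here is a minimal sketch with an entirely hypothetical cost model (not Prism’s actual algorithm): if a lower bound on the cost of a whole family of tilings already exceeds the best concrete program found so far, the family is pruned without enumerating its members.

```python
# Minimal sketch of family-level pruning; the cost model is hypothetical.
def cost(tile_m, tile_n):
    # Hypothetical analytic cost: memory-traffic terms + a tile-size penalty.
    return 1024 / tile_m + 1024 / tile_n + 0.01 * tile_m * tile_n

def family_lower_bound(tiles_m, tiles_n):
    # Minimize each term independently -> a valid lower bound for the family.
    return (min(1024 / t for t in tiles_m)
            + min(1024 / t for t in tiles_n)
            + 0.01 * min(tiles_m) * min(tiles_n))

best = float("inf")
families = [([8, 16], [8, 16]), ([32, 64], [32, 64]), ([128, 256], [128, 256])]
for tiles_m, tiles_n in families:
    if family_lower_bound(tiles_m, tiles_n) >= best:
        continue  # the whole symbolic family is provably suboptimal
    for tm in tiles_m:
        for tn in tiles_n:
            best = min(best, cost(tm, tn))
print(best)  # 68.48: found without ever expanding the third family
```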

3 days ago

Truth breaks quietly when synthetic news scales.
An open-source framework generates controlled, reproducible fake and real news across models, languages, and styles so researchers can test how the truth-default erodes.
Want to help expand the models or datasets?

https://arxiv.org/abs/2601.22871

4 days ago

Interesting direction. One thing I'd love to see separated in these indices is exposure vs engagement vs trust/uptake — they often move together in headlines, but they matter very differently for interventions.

1 week ago
Mindful Body hero image

Mindful Body is now live on the App Store.

Track weight, body fat, muscle mass, visceral fat, BMR, and circumferences with Apple Health, iCloud sync, and Face ID-protected progress photos.

Free lifetime access: GETFIT2026
apps.apple.com/redeem

1 week ago

VisionFoundry shows that VLMs’ weak visual perception isn’t inevitable. With task-targeted synthetic data generated from just a task name, models gain +7% on MMVP and +10% on CV-Bench-3D, without hurting general skills.

http://arxiv.org/abs/2604.09531v1

1 week ago

RewardFlow shows you can steer diffusion models at inference time by optimizing what you reward. Using multi-reward Langevin dynamics and a prompt-aware policy, it achieves strong image edits and compositional alignment without retraining.

http://arxiv.org/abs/2604.08536v1
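
A minimal sketch of the inference-time steering idea, with toy analytic rewards standing in for the paper’s learned ones (the prompt-aware policy is omitted): Langevin updates climb a weighted sum of reward gradients while injected noise keeps samples diverse, and no model weights are retrained.

```python
# Minimal sketch: multi-reward Langevin steering of a toy 2-D "sample".
import numpy as np

def grad_reward_a(x):
    # Toy reward A: prefer samples near (1, 0); gradient of -||x - mu||^2.
    return -2 * (x - np.array([1.0, 0.0]))

def grad_reward_b(x):
    # Toy reward B: prefer small norm; gradient of -||x||^2.
    return -2 * x

def langevin_steer(x, weights=(1.0, 0.5), step=0.01, n_steps=500, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(n_steps):
        g = weights[0] * grad_reward_a(x) + weights[1] * grad_reward_b(x)
        x = x + step * g + np.sqrt(2 * step) * rng.normal(size=x.shape)
    return x

print(langevin_steer(np.zeros(2)))  # drifts toward a compromise of both rewards
```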

1 week ago

Fascinating split — and consistent with what we're seeing in our own expert survey on GenAI and the verification crisis. Consensus on risk, but deep disagreement on who bears responsibility (platforms? developers? regulators?). That gap is where policy falls through the cracks.

1 week ago
Collegiata di Santa Maria Maddalena in Atrani, Amalfi Coast. Bell tower and majolica domes against the Tyrrhenian Sea in golden hour light.

Atrani, Amalfi Coast. Population: 800. Tourism: not yet ruined.

2 weeks ago

Fully agree — C2PA is the missing link between AI-generated media and accountability. I've been building Origin Lens to help users verify content credentials on images. The challenge is adoption: standards only work when cameras, platforms, and publishers all implement them consistently.

2 weeks ago

Can we steer visual representations like we prompt LLMs? This paper shows how to inject text into vision encoders via early fusion, creating steerable features that stay strong for core vision tasks while focusing on any concept you ask for.

http://arxiv.org/abs/2604.02327v1
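
A shape-level sketch of early fusion, with hypothetical dimensions and a single unparameterized attention layer (real encoders stack many trained blocks): projected text tokens are concatenated with the patch tokens before encoding, so attention can condition visual features on the requested concept from the first layer on.

```python
# Minimal sketch: prepend projected text tokens to image patch tokens.
import numpy as np

rng = np.random.default_rng(0)
d = 64                                   # shared embedding width
patches = rng.normal(size=(196, d))      # 14x14 image patch tokens
text = rng.normal(size=(8, 77))          # 8 text tokens in a 77-dim text space
W_proj = rng.normal(size=(77, d)) * 0.1  # project text into the vision space

tokens = np.concatenate([text @ W_proj, patches], axis=0)  # early fusion

def self_attention(x, d_head=64):
    # One unparameterized attention layer, purely for shape intuition.
    scores = x @ x.T / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

fused = self_attention(tokens)
print(fused.shape)  # (204, 64): patch features now attend to the text
```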

2 weeks ago

We still verify images by squinting at pixels and vibes.
Origin Lens does on-device cryptographic C2PA verification, showing who signed an image and if it was altered.
When trust is math instead of guesswork, which would you rely on?

https://apps.apple.com/us/app/origin-lens/id6756628121
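
For the gist of “trust is math,” here is a minimal sketch of the two checks at the heart of provenance verification, using a deliberately simplified manifest (real C2PA manifests are COSE-signed JUMBF structures with certificate chains, which this skips):

```python
# Minimal sketch: (1) does the asset hash match the signed claim?
# (2) does the claim's signature verify against the signer's public key?
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# --- hypothetical signing at creation time (camera / editing tool) ---
asset = b"...image bytes..."
claim = hashlib.sha256(asset).digest()  # the claim binds the asset hash
key = Ed25519PrivateKey.generate()
signature = key.sign(claim)
public_key = key.public_key()

# --- verification at the point of consumption ---
def verify(asset_bytes, claim, signature, public_key):
    if hashlib.sha256(asset_bytes).digest() != claim:
        return "altered: asset no longer matches the signed claim"
    try:
        public_key.verify(signature, claim)
        return "intact: claim verifies against the signer's key"
    except InvalidSignature:
        return "invalid: signature does not verify"

print(verify(asset, claim, signature, public_key))            # intact
print(verify(asset + b"edit", claim, signature, public_key))  # altered
```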

2 weeks ago

This is exactly why standards like C2PA matter so much — provenance needs to be cryptographically attached at creation time, not reconstructed after the fact. Once the chain breaks, you're right: it really is just vibes. Working on this exact problem with content authenticity tools.

2 weeks ago

2 years studying AI misinformation. Now: what happens when AI agents act autonomously?

Same trust question, harder version. When an agent sends email on your behalf, how do you verify intent?

New research direction.

alexloth.com/from-misinformation-to-a...

3 weeks ago

March update for CRED-1: 2,629 domain rescores across 4 weekly releases, bringing the dataset to 2,673 tracked sources. Notable addition: rt.com. Scores are recalculated weekly against the latest assessments from 9 major fact-checkers. github.com/aloth/cred-1/releases/ta...
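
For a sense of what a weekly rescore involves, here is a minimal sketch with a hypothetical aggregation rule and made-up numbers (CRED-1’s actual multi-signal method is described in the paper): each domain’s score is recomputed from the latest fact-checker assessments, weighted by how much evidence each checker contributes.

```python
# Minimal sketch: coverage-weighted rescoring of one domain (hypothetical rule).
def rescore(assessments):
    # assessments: {fact_checker: (score in [0, 1], n_articles_reviewed)}
    total_weight = sum(n for _, n in assessments.values())
    if total_weight == 0:
        return None  # no evidence yet; leave the domain unscored
    return sum(score * n for score, n in assessments.values()) / total_weight

weekly = {
    "checker_a": (0.20, 40),  # made-up assessments
    "checker_b": (0.35, 15),
    "checker_c": (0.10, 25),
}
print(round(rescore(weekly), 3))  # 0.197
```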

3 weeks ago

Great framing. C2PA content credentials are underrated infrastructure — the ability to cryptographically trace image/video origin is exactly what media literacy needs to scale beyond manual verification. Been building in this space too (Origin Lens) and adoption still feels like the main hurdle.

3 weeks ago

Disinformation isn’t cottage-scale anymore; it’s becoming industrialized.
JudgeGPT studies whether humans can still spot AI-generated news when deception is mass-produced by models.
If we can’t tell real journalism from synthetic text, what happens to public trust?

https://judgegpt.streamlit.app

3 weeks ago

Mindful Body is live. My second app after Mindful Coffee.

Body composition tracker for iPhone: weight, body fat, muscle mass, BMR, circumferences. Face ID progress photos. iCloud sync.

Free lifetime access this week:
apps.apple.com/redeem?ctx=offercodes&id=6760477510&code=GETFIT2026

3 weeks ago

Most people drink coffee when cortisol is already peaking. Mindful Coffee uses chronobiology to find your optimal caffeine windows, based on your cortisol rhythm and HealthKit sleep data.

On-device ML. Privacy first.

https://github.com/aloth/mindful-coffee

#iOS #Coffee #Chronobiology
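
The scheduling idea is roughly this, as a minimal sketch with hypothetical cutoffs (not the app’s actual on-device model): delay the first coffee until the cortisol awakening response subsides, and stop early enough before bed for most of the caffeine to clear.

```python
# Minimal sketch: derive a caffeine window from wake time and bedtime.
from datetime import datetime, timedelta

def caffeine_window(wake_time, bedtime, half_life_hours=5.0):
    first = wake_time + timedelta(minutes=90)  # after the post-waking cortisol peak
    # Hypothetical cutoff: ~2 half-lives before bed so most caffeine clears.
    last = bedtime - timedelta(hours=2 * half_life_hours)
    return first, last

wake = datetime(2026, 4, 1, 6, 30)
bed = datetime(2026, 4, 1, 22, 30)
first, last = caffeine_window(wake, bed)
print(first.strftime("%H:%M"), "to", last.strftime("%H:%M"))  # 08:00 to 12:30
```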

3 weeks ago

New result: Multilevel Euler–Maruyama gives a polynomial speedup for diffusion sampling. By mixing cheap and expensive UNets, you can sample at roughly the cost of a single large model eval, with theory and experiments backing it.

http://arxiv.org/abs/2603.24594v1
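
A minimal sketch of the multilevel idea on a toy 1-D SDE, with stand-in drifts for the cheap and expensive UNets (hypothetical functions, not the paper’s models): most paths use the cheap drift, and a few coupled paths, driven by the same Brownian increments, estimate the cheap-to-expensive correction.

```python
# Minimal sketch: two-level Euler-Maruyama estimate of E[X_T].
import numpy as np

rng = np.random.default_rng(0)

def drift_cheap(x, t):      # stand-in for a small UNet's score estimate
    return -x + 0.1 * np.sin(t)

def drift_expensive(x, t):  # stand-in for the large UNet (slightly different)
    return -x + 0.1 * np.sin(t) + 0.05 * np.cos(3 * t)

def euler_maruyama(drift, n_paths, n_steps=50, T=1.0, noise=None):
    dt = T / n_steps
    x = np.ones(n_paths)
    for k in range(n_steps):
        dW = noise[k] if noise is not None else rng.normal(0, np.sqrt(dt), n_paths)
        x = x + drift(x, k * dt) * dt + dW
    return x

# Level 0: many cheap paths carry most of the estimate.
base = euler_maruyama(drift_cheap, n_paths=10_000).mean()
# Correction: few paths, cheap and expensive driven by the SAME noise (coupled).
noise = rng.normal(0, np.sqrt(1.0 / 50), size=(50, 200))
corr = (euler_maruyama(drift_expensive, 200, noise=noise)
        - euler_maruyama(drift_cheap, 200, noise=noise)).mean()
print(base + corr)  # multilevel estimate at near-cheap cost
```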

3 weeks ago

Remember the fake Pentagon explosion photo that briefly moved markets?
An on-device verifier could flag it as AI-made by checking content credentials, signatures, and AI markers before sharing.
Would you trust an image more if you could see its full edit history?

https://arxiv.org/abs/2602.03423

3 weeks ago

The labeling part is key — clear disclosure is exactly what C2PA/Content Credentials are designed for. Interesting that the economic case for AI-generated content is murkier than the hype. Working on image provenance tools myself; the trust-but-verify infrastructure has far to go.

3 weeks ago

Every image online has a history. Most of the time, you can't see it.

Origin Lens reads C2PA Content Credentials: cryptographic proof of origin, edit history, AI generation detection. Open source (GPL-3.0), runs on-device.

https://apps.apple.com/us/app/origin-lens/id6756628121

4 weeks ago

Content Credentials are exactly the kind of infrastructure the information ecosystem needs. I've been building Origin Lens around C2PA — verifying provenance at the point of consumption is what makes this actionable, not just aspirational. Great to see more adoption momentum.

4 weeks ago

This paper shows chain-of-thought faithfulness isn’t a single objective number. On the same data, different classifiers shift scores by up to 30 points and even reverse model rankings. Measurement choice matters more than we admit.

http://arxiv.org/abs/2603.20172v1
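
A minimal sketch of the measurement problem, with made-up verdicts purely for illustration: score the same chains-of-thought with two different faithfulness classifiers and watch the model ranking flip.

```python
# Minimal sketch: per-example verdicts (1 = faithful) from two
# hypothetical classifiers over the same five chains-of-thought.
verdicts = {
    "model_A": {"clf_1": [1, 1, 0, 1, 0], "clf_2": [0, 1, 0, 0, 0]},
    "model_B": {"clf_1": [1, 0, 0, 1, 0], "clf_2": [1, 1, 0, 1, 0]},
}

for clf in ["clf_1", "clf_2"]:
    scores = {m: sum(v[clf]) / len(v[clf]) for m, v in verdicts.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    print(clf, scores, "ranking:", ranking)
# clf_1 ranks model_A first; clf_2 ranks model_B first -- the
# measurement choice, not the models, decides the headline result.
```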

4 weeks ago

New preprint: CRED-1, an open dataset scoring 2,672 domains for credibility. Multi-signal, reproducible, built for on-device pre-bunking of misinformation.

Paper: ssrn.com/abstract=6448466
Data: github.com/aloth/cred-1

#OpenScience #Misinformation #OpenData

1 month ago

Nemotron-Cascade 2 shows how far post-training can go: a 30B MoE with only 3B active params reaches Gold-level IMO/IOI/ICPC performance via Cascade RL and multi-domain on-policy distillation.

http://arxiv.org/abs/2603.19220v1

1 month ago

Great question to explore. C2PA Content Credentials can establish a verifiable chain of custody for media — but adoption requires both platform integration and user-facing transparency. The normative gap between "signed" and "trustworthy" is where the real research challenge lies.

1 month ago

Can you tell AI-written news from human-written? We tested 2,308 news fragments across 37 LLM configs.

Two papers accepted at WWW 2026:
- Industrialized Deception
- Eroding the Truth-Default

https://github.com/aloth/JudgeGPT
https://github.com/aloth/RogueGPT

#AI #Research #WWW26

1 month ago