One-armed bandit curious? Never heard of the Zollman Effect? Join us Friday next week for the new episode of Conversations at the Center, the podcast of the @center4philsci.bsky.social, with our guest @kevinzollman.com
Posts by Max Noichl
1/
"Silicon samples" are becoming more and more common in research and polling.
One problem: depending on the analytic decisions made, you can basically get these samples to show any effect you want.
The updated version of this preprint is now online!
THREAD🧵
arxiv.org/abs/2509.13397
This is a bit niche, but for those interested in metaphor and metonymy research, here is one of the first articles I have seen using LLMs as a research tool! #cogling #metaphor #metonymy arxiv.org/abs/2604.12919 Oh, they have also done something on visual metonymy: arxiv.org/abs/2601.17706
The Library of Virginia in Richmond seeks a data engineer ($100k-$125k) to transform data practices at a 200-year-old cultural heritage org with an eye towards the future.
Looking for someone to imagine & collaboratively implement tomorrow's data infrastructure.
Apply by May 1! Tell your friends!
How can generative AI better support human creativity, without limiting it? If you have thoughts, we invite submissions to our ICML workshop on Generative AI, Creativity, and Human-AI Co-Creation
📍 July 2026, Seoul
📄 Submit by: April 24 (AOE)
🔗 Submission link: openreview.net/group?id=ICM...
I made some playable philosophy simulations:
-Oxford 1952
-Republic Book I
-Jena 1799
-Paris 1945
www.ux-phi.com
Screenshot of a plot showing ELO vs parameter count for different OCR models
There is no best VLM OCR model - rankings can flip completely by document type.
I built ocr-bench: run open OCR models on YOUR documents, get a per-collection leaderboard.
VLM-as-judge with Bradley-Terry ELO, all running on @hf.co. No local GPU needed.
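For readers unfamiliar with the ranking step, here is a minimal sketch of how pairwise judge verdicts can be turned into Bradley-Terry strengths and then mapped onto an Elo-like scale. It is not taken from ocr-bench: the function names, the win matrix, the 1500 baseline and the 400-point scaling are all illustrative assumptions.

```python
import math


def fit_bradley_terry(wins, n_iter=200):
    """Estimate Bradley-Terry strengths from a pairwise win matrix.

    wins[i][j] is the number of comparisons in which item i beat item j.
    Uses the classic minorization-maximization update; strengths are
    rescaled each round so that their geometric mean is 1.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            value = total_wins / denom if denom > 0 else p[i]
            # Clamp so an item with zero wins does not break the log below.
            new_p.append(max(value, 1e-9))
        log_mean = sum(math.log(x) for x in new_p) / n
        p = [x / math.exp(log_mean) for x in new_p]
    return p


def to_elo_scale(strengths, base=1500.0):
    """Map strengths to an Elo-like scale (assumed: base 1500, 400 * log10)."""
    return [base + 400.0 * math.log10(s) for s in strengths]


# Hypothetical verdicts for three models on one document collection:
# each pair was compared 10 times by the judge.
example_wins = [
    [0, 8, 9],  # hypothetical model_a
    [2, 0, 7],  # hypothetical model_b
    [1, 3, 0],  # hypothetical model_c
]
example_elo = to_elo_scale(fit_bradley_terry(example_wins))
```

Because the strengths are fitted separately for each win matrix, running this once per document collection yields one leaderboard per collection, which is how rankings can differ between document types.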
i'm trying out the novel writing project with Claude in Claude Code, using Pangram to break it out of writing in a clearly identifiable AI-writing style. it's going... interesting so far. i despaired at the beginning but am now cautiously optimistic. not so much at the structural level though.
this is cool
tbh all i want is an LLM that sits atop my Zotero library and lets me talk to it tho
Final CFA for the 8th Scientific Understanding and Representation (SURe) annual workshop, which will take place May 27-29, 2026, at the IFIS PAN in Warsaw.
Submission deadline: 20 January 2026.
More info: shorturl.at/AUoye
@philsci.bsky.social @eenphilsci.bsky.social @epsaphilsci.bsky.social
A four-panel figure showing the probability of predicting articles from The Journal of Philosophy versus PMLA using quarter-century models. Each panel represents a different training period (1925-1950, 1950-1975, 1975-2000, 2000-2025). Gray shaded regions indicate training periods. The model trained on early C21 philosophy vs literature cannot accurately distinguish early C20 philosophy vs literature, but the reverse is not true.
Hierarchical cluster of syntactic features predicting philosophy (blue) vs criticism (red).
Top 2 distinctive features for Philosophy vs Criticism.
An example of the importance of the "marker" feature in philosophy.
Analytic philosophy can be distinguished from literary criticism with 90-95% accuracy via syntax alone. Moreover, a classifier trained to separate them in early C20 does better predicting future separations than a C21 one predicts past ones, suggesting philosophy syntax narrows/specializes in ~C21.
OpenAlex integrated into the Web of Science, or the capture of the work of the “commoners” | carnetist.hypotheses.org/2572
Three scatterplots of colorful points, titled "Color Space" (embeddings of color features), "Text Space" (text embedding of color names), and "Image Space" (image embeddings of color swatches).
Three different ways to represent colo(u)r. Work in progress, inspired by an old post by Kat Zhang / The Poet Engineer.
"there is a part of human intelligence which operates in a continuous generalization of the space of words, and other parts entirely which do things which are less well understood" is a perfectly reasonable position which apparently has no adherents
Excited to share my latest publication, "Generative Aesthetics: On formal stuckness in AI verse." It's published in a special issue in the Journal of Cultural Analytics, expertly edited by Tess McNulty and Laura Chapot, on "Computation and Form, Reconsidered."
culturalanalytics.org/article/1448...
Tomorrow we will have a keynote from Charles Pence (UC Louvain).
Thanks to the Dutch Philosophy Research School (OZSW) for supporting this event, and @mnoichl.bsky.social for organizing this with me!
Academic presentation in a baroque university environment. A group of researchers is gathered around a conference table.
Gregor Betz (KIT) kicking off our "Data Driven Philosophy" Hackathon in Utrecht with his talk: "Doing Philosophy with and for LLMs". Besides input about the state of research and new directions, we're spending three days kicking off new projects.
i am going to try to give a framework of my own understanding which laypeople can understand.
Updated & turned my Big LLM Architecture Comparison article into a video lecture.
The 11 LLM archs covered in this video:
1. DeepSeek V3/R1
2. OLMo 2
3. Gemma 3
4. Mistral Small 3.1
5. Llama 4
6. Qwen3
7. SmolLM3
8. Kimi 2
9. GPT-OSS
10. Grok 2.5
11. GLM-4.5/4.6
www.youtube.com/watch?v=rNlU...
For the first episode of Ping Pong Philosophy I had the absolute pleasure of speaking with Greg Restall, one of the most renowned philosophical logicians and an absolutely great guy to have a chat with. Thank you for your time, Greg, I had a blast.
We are also on Spotify!
Christopher Colón Lugo uses 3D U-net to capture patterns in the Game of Life
#DistributedCiphers
#ALIFE2025
#Postdoc at Technische Universität Berlin in digital humanities & history/philosophy/sociology of science #philsci #STS. ERC project investigates digital communication within the ATLAS collaboration at CERN
Deadline: October 13, 2025
www.jobs.tu-berlin.de/en/job-posti...
#PhilJobs
Upshot:
NNES report needing twice as long to read English-language papers and to prepare English presentations. Even among highly proficient NNES (C1–C2 level), ~60% report having avoided asking questions at events due to concerns about their English (compared to 16% of NES). #philsky
Heat map of St Petersburg
How do literary communities actually form?
@maria-lev.bsky.social analyzes the networks of collaboration and aesthetic affinity that are documented through cultural events — e.g. readings, book launches, festivals. These real-world networks often remain invisible in text-based literary history.
In a new work with Joseph Rich and Conrad Oakes we tackle the problem of how to best organize alluvial plots. We formalize two optimization problems and develop a solution for them based on the neighbornet algorithm, implemented in the program wompwomp: github.com/pachterlab/w...
Had a great time last week at #epsa2025! I've put the poster up here, if anyone wants to take a closer look: maxnoichl.eu/blog/2025/ep...
A Gaussian process showing that the allowed time series are forced to be compatible with data
I’m especially proud of this article I wrote about Gaussian Processes for the Recast blog! 🥳
GPs are super interesting, but it’s not easy to wrap your head around them at first 🤔
This is a medium level (more intuition than math) introduction to GPs for time series.
getrecast.com/gaussian-pro...
The participants of Dagstuhl Seminar 24122 standing on steps outside (from https://www.dagstuhl.de/24122)
Multiple types of embeddings (UMAP, t-SNE, Laplacian Eigenmaps, PHATE, PCA, MDS) of Wikipedia text data labelled by text summaries generated by an LLM. Methods like UMAP and t-SNE show cluster structure that reflects shared subject matter in text, while other methods show more continuous structure.
Multiple embedding methods (PCA, Laplacian Eigenmaps, t-SNE, MDS, PHATE, UMAP) of primate brain organoids at different time periods. Different methods highlight different aspects of development, such as clusters of similar cell types or time courses of cell development.
Multiple embedding methods (PCA, Laplacian Eigenmaps, t-SNE, MDS, PHATE, UMAP) of 1000 Genomes Project genotypes. Different methods reflect different aspects of demographic history of populations.
Last year I met a bunch of great researchers who work with high-dimensional data at a Dagstuhl seminar. This week we put out a preprint about the history and philosophy of low-dimensional embedding methods, their applications, their challenges, and their possible future: arxiv.org/abs/2508.15929