
Posts by Craig Messner

Pretraining Language Models for Diachronic Linguistic Change Discovery
Large language models (LLMs) have shown potential as tools for scientific discovery. This has engendered growing interest in their use in humanistic disciplines, such as historical linguistics and lit...

“Pretraining Language Models for Diachronic Linguistic Change Discovery” by @tom-lippincott.bsky.social, @cmessner.bsky.social, & more shows that efficient pretraining techniques produce useful models over corpora too large for easy manual inspection and too small for “typical” LLM approaches: (3/5)

4 weeks ago
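The paper's core move is pretraining small models from scratch on temporally sliced corpora. As a rough illustration only (a minimal sketch assuming a HuggingFace-style setup; the corpus file, model size, and hyperparameters below are placeholders, not the paper's actual configuration):

```python
# Minimal sketch: pretraining a small GPT-2-style model from scratch on a
# period-sliced corpus. File names and hyperparameters are illustrative
# placeholders, not the paper's configuration.
from datasets import load_dataset
from transformers import (GPT2Config, GPT2LMHeadModel, GPT2TokenizerFast,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# One plain-text file per temporal slice, e.g. decade_1820.txt (placeholder)
dataset = load_dataset("text", data_files={"train": "decade_1820.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# A deliberately small config: training from scratch stays tractable on a
# corpus too small for "typical" LLM approaches
config = GPT2Config(n_layer=6, n_head=8, n_embd=512,
                    vocab_size=tokenizer.vocab_size)
model = GPT2LMHeadModel(config)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-1820s", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Training one such model per slice then lets differences in model behavior stand in, per the thread's framing, for linguistic change between periods.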
Byte Magazine Volume 09 Number 12 - New Chips (Internet Archive)

Maybe a bit afield, but I always like telling students about Hugh Kenner's experiments with n-gram modeling for literary style in the early 1980s (so roughly contemporaneous with Jelinek's adoption of them for ASR). As found in Byte magazine in 1984: archive.org/details/byte...

1 month ago
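For anyone who wants to show students what Kenner was doing: the Byte piece described a character-level "travesty" generator that recombines a source text according to its own n-gram statistics. A loose Python reimplementation of the idea (not the original program; the order n and the source filename are illustrative):

```python
# Character-level n-gram "travesty" generator in the spirit of Kenner's
# 1984 Byte experiments (a loose Python sketch, not a port).
import random
from collections import defaultdict

def build_model(text, n=4):
    """Map each (n-1)-character context to the characters that follow it."""
    model = defaultdict(list)
    for i in range(len(text) - n + 1):
        context, nxt = text[i:i + n - 1], text[i + n - 1]
        model[context].append(nxt)
    return model

def travesty(text, n=4, length=300):
    model = build_model(text, n)
    context = text[:n - 1]            # seed with the opening of the source
    out = list(context)
    for _ in range(length):
        followers = model.get(context)
        if not followers:             # dead end: restart from the seed
            context = text[:n - 1]
            continue
        ch = random.choice(followers)
        out.append(ch)
        context = context[1:] + ch    # slide the context window forward
    return "".join(out)

sample = open("source_text.txt").read()   # any plain-text source (placeholder)
print(travesty(sample, n=5))
```

The pedagogical hook is that raising n makes the output drift from gibberish toward recognizably author-flavored prose, which is the stylistic observation Kenner was after.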

this is both predictable and informative

1 month ago

I regret to inform you that while LLMs may well automate numerous rote linguistic tasks, they cannot as yet replace your teammates in CS2

1 month ago

New to me as well, will be using this in class this semester!

2 months ago

From my own local extrapolations, it feels like methods tied to classic quant DH/cultural analytics are getting broader play, and that "humanities machine learning" (which extends ML subfields like interpretability, data-efficient training, and evaluation) is emerging underneath

2 months ago

The real question is: who is winning?

3 months ago
Dense SAE Latents Are Features, Not Bugs

Any leads on research that examines the disjunction between human-recoverable and SAE-recoverable features? (Perhaps following arxiv.org/html/2506.15...: which features are "naturally" distinguishable by humans but represented densely by SAEs, ending up in "noisy" dense reps?)

6 months ago
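For concreteness, "dense" here means a latent that fires on a large fraction of tokens rather than sparsely. A toy sketch of measuring per-latent density (the encoder below is a random stand-in, not the paper's trained SAE, and the 50% threshold is arbitrary):

```python
# Toy sketch: a latent's "density" is the fraction of tokens on which it is
# active. Random weights and activations stand in for a trained SAE and a
# real residual stream.
import torch

d_model, d_sae, n_tokens = 512, 4096, 10_000
W_enc = torch.randn(d_model, d_sae) / d_model**0.5
b_enc = torch.zeros(d_sae)

acts = torch.randn(n_tokens, d_model)          # stand-in model activations
latents = torch.relu(acts @ W_enc + b_enc)     # SAE encoder (ReLU)

density = (latents > 0).float().mean(dim=0)    # per-latent firing rate
dense_mask = density > 0.5                     # arbitrary density cutoff
print(f"{dense_mask.sum().item()} latents fire on >50% of tokens")
```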

Anyone have leads on instruction tuning for models trained solely on historical data? Turns out texts from 1750 have very few "reddit-like" constructs.

6 months ago
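One hedged sketch of what this could look like: hand-author a small set of instruction pairs in the period register, format them as plain causal-LM text, and fine-tune the pretrained historical checkpoint. Everything below (the template, the example pair, the checkpoint name) is an illustrative assumption:

```python
# Hedged sketch: instruction tuning a historical-corpus model by formatting
# hand-written, period-register instruction pairs as plain causal-LM text.
# Template, example pair, and checkpoint name are illustrative assumptions.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"

pairs = [
    {"instruction": "Describe the weather in the manner of a 1750 diarist.",
     "response": "The morning broke exceeding cold, with a sharp frost upon the fields."},
    # ... more hand-authored pairs in the target register
]

tokenizer = AutoTokenizer.from_pretrained("gpt2-1820s")  # placeholder checkpoint
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2-1820s")

def tokenize(ex):
    return tokenizer(TEMPLATE.format(**ex) + tokenizer.eos_token,
                     truncation=True, max_length=512)

ds = Dataset.from_list(pairs).map(tokenize,
                                  remove_columns=["instruction", "response"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-historical", num_train_epochs=3),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

The open question in the post stands: the data bottleneck is authoring pairs that don't smuggle modern conversational constructs into a model that has never seen them.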

Interestingly, this reads to me a lot like a description of how close reading works in practice, especially post-New Historicism -- "why this word here, knowing what we know about its contemporaneous use"

6 months ago

More importantly, its frustrations will hopefully serve as useful critical lessons. I also hereby claim the use of the name "Poetaster" for any further such systems!

7 months ago
GitHub - messner1/poetaster: Demo of a RAG-based educational game on an EmbeddingGemma/quantized Llama backbone

I teach a machine learning class for students in traditionally less computational fields (cdh.jhu.edu/teaching/4/). Recent students had questions about RAG, so I used EmbeddingGemma's release as an excuse to put together an example in the form of a poetry criticism game

github.com/messner1/poe...

7 months ago
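The repo is the authoritative version; as a rough picture of the moving parts, here is a minimal retrieval-then-generate loop on the same backbone pairing. The model identifiers, GGUF filename, and prompts are my assumptions, not necessarily what poetaster does:

```python
# Minimal RAG sketch in the spirit of the poetaster demo: embed a small
# poem corpus with EmbeddingGemma, retrieve the best match for a question,
# and hand it to a quantized Llama as grounding context. Model ids, the
# GGUF filename, and the prompts are assumptions; see the repo for the
# actual implementation.
from sentence_transformers import SentenceTransformer
from llama_cpp import Llama

poems = [
    "Because I could not stop for Death - / He kindly stopped for me -",
    "I wandered lonely as a cloud / That floats on high o'er vales and hills",
]

embedder = SentenceTransformer("google/embeddinggemma-300m")  # assumed model id
poem_vecs = embedder.encode(poems)

llm = Llama(model_path="llama-3.2-1b-q4.gguf", n_ctx=2048)    # placeholder GGUF

def answer(question):
    q_vec = embedder.encode([question])
    scores = embedder.similarity(q_vec, poem_vecs)[0]   # cosine by default
    context = poems[int(scores.argmax())]               # top-1 retrieval
    resp = llm.create_chat_completion(messages=[
        {"role": "system",
         "content": "You are a poetry critic. Ground your answer in the excerpt."},
        {"role": "user",
         "content": f"Excerpt:\n{context}\n\nQuestion: {question}"},
    ])
    return resp["choices"][0]["message"]["content"]

print(answer("Which poem personifies death?"))
```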