Made some LatinCy-based suggestions here—what else are people working with?
Posts by Patrick J. Burns
Cool, follow up with the results here; looking forward to seeing what you find. Hope to do a good amount of multilingual work in the coming year (though tbh Latin~Greek to start)
Also, latincy-readers can now build a sentence-level vector index for collections—a newer feature and one that could use more real-world testing. But if your texts are plaintext or TEI or another common format, it shouldn’t be too hard to set up. Some examples here: github.com//latincy-rea...
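For anyone unsure what a sentence-level vector index involves, here is a minimal brute-force sketch in plain Python. It is not the latincy-readers API: `embed`, `cosine`, and `SentenceIndex` are hypothetical names, and the character-trigram embedding is only a stand-in for the sentence vectors a real pipeline would supply.

```python
import math
from collections import Counter


def embed(sentence):
    """Stand-in embedding: character-trigram counts (NOT model vectors)."""
    padded = f" {sentence.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SentenceIndex:
    """Hypothetical brute-force index: stores one vector per sentence."""

    def __init__(self, sentences):
        self.entries = [(s, embed(s)) for s in sentences]

    def search(self, query, k=3):
        """Return the k (score, sentence) pairs most similar to the query."""
        q = embed(query)
        scored = sorted(((cosine(q, v), s) for s, v in self.entries), reverse=True)
        return scored[:k]


if __name__ == "__main__":
    index = SentenceIndex([
        "Arma virumque cano",
        "Gallia est omnis divisa in partes tres",
        "Odi et amo",
    ])
    for score, sentence in index.search("arma virumque", k=2):
        print(f"{score:.3f}  {sentence}")
```

Swapping `embed` for a function that returns real sentence vectors would leave the search logic unchanged, though a large collection would want an approximate nearest-neighbour library in place of the linear scan.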
Maybe more immediately useful—you can get contextual token vectors and span vectors (so that includes sents) using the LatinCy pipelines on your texts, cf. this chapter and the following latincy.github.io/latincy-book...
What is the application? There are w2v and floret models in the LatinCy HF repo and I have other static models (ft, glove) that could be published if helpful. Refactored LatinBERT is there too; cf. huggingface.co/latincy/mode...
LatinCy Lexicon logo
Announcing—LatinCy Lexicon v0.1, a refactored version of Whitaker's Words that uses LatinCy annotations to disambiguate words/meanings. Can be added as a custom component to any LatinCy pipeline. github.com/latincy/lati... #digiclass #nlproc
From Whitaker's release notes... "Permission is hereby freely given for any and all use of program and data." Amazingly open license, allowing us to build cool stuff for Latin. And thanks also to Martin Keegan for hosting the maintenance of Words since 2015, cf. mk270.github.io/whitakers-wo...
Also—hoping to implement genuine word sense disambiguation soon by combining this functionality with the LatinCy token vectors... getting there...
Code to generate "reinflected" forms on Latin words in context with LatinCy Lexicon
We can also "reinflect" Latin forms based on existing contextual morph annotations, e.g.
Code to generate all forms of the Latin verb `amo` in the present indicative active from a single form.
Not only can we start using LatinCy annotations to disambiguate words, we can also use the WW word formation logic to generate paradigms for spaCy tokens...
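To make the paradigm idea concrete, here is a toy sketch of generating the six present indicative active forms from one analysed form. It does not use the LatinCy Lexicon API or the Whitaker's Words data: `ENDINGS` and `present_paradigm` are hypothetical, and the table covers first-conjugation verbs only. In a spaCy pipeline the person and number would presumably come from something like `token.morph.get("Person")` and `token.morph.get("Number")`; here they are passed in by hand.

```python
# Present indicative active endings, first conjugation only,
# keyed by (person, number). A hand-written toy table.
ENDINGS = {
    (1, "Sing"): "o",
    (2, "Sing"): "as",
    (3, "Sing"): "at",
    (1, "Plur"): "amus",
    (2, "Plur"): "atis",
    (3, "Plur"): "ant",
}


def present_paradigm(form, person, number):
    """Return all six present indicative active forms from one analysed form."""
    ending = ENDINGS[(person, number)]
    if not form.endswith(ending):
        raise ValueError(f"{form!r} does not end in {ending!r}")
    base = form[: len(form) - len(ending)]  # e.g. "amat" -> "am"
    return {key: base + end for key, end in ENDINGS.items()}


if __name__ == "__main__":
    for (person, number), form in present_paradigm("amat", 3, "Sing").items():
        print(person, number, form)
```

A real implementation needs stem and ending data for every conjugation, tense, mood, and voice, which is what the Whitaker's Words word formation logic supplies.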
New blog post on the LatinCy v3.9 XPOS tags now up here:
exploratoryphilology.org/posts/latinc...
#digiclass #nlproc
"Album" cover for the LatinCy v3.9 pipelines with "catus" from Gesner's 1551 Historia Animalium
✨ LatinCy v3.9 sm/md/lg/trf pipelines for spaCy available ✨
- Improved tokenization and u/v norm
- New custom Latin-specific XPOS tags
- Better, more consistent lemma/morph coverage
huggingface.co/latincy/la_c...
#digiclass #nlproc
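For readers new to "u/v norm": one common convention folds consonantal v into u so that spelling variants such as "virumque" and "uirumque" map to the same string. The sketch below illustrates that convention only. Whether the v3.9 pipelines use exactly this mapping is an assumption, and `normalize_uv` is a hypothetical helper, not a LatinCy function.

```python
# Assumed mapping: fold v into u in both cases (one convention among several).
UV_TABLE = str.maketrans({"v": "u", "V": "U"})


def normalize_uv(text):
    """Return text with every v/V replaced by u/U."""
    return text.translate(UV_TABLE)


if __name__ == "__main__":
    print(normalize_uv("Arma virumque cano"))
```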
Making "album" covers for the model releases now...
This one has Catus — from "De cato seu fele" in Gesner's »Historiae animalium liber primus de quadrupedibus viviparis« (1551), p. 318 books.google.com/books?id=89P...
This has been a desideratum since the day we first published this model -- I'm so happy Patrick figured this out!
So many of you have asked—happy to announce...
Latin BERT v1 now available on HuggingFace
📦 Model: huggingface.co/latincy/lati...
📝 Preprint: arxiv.org/abs/2009.10053
Original weights, experimental repackaging—leave issues/etc. in the HF discussions #nlproc #digiclass cc: @dbamman.bsky.social
And tomorrow Fri. 3/13—I will give a talk in the late afternoon panel called...
"Is agentic philology an oxymoron? Some thoughts on error, control, and disciplinary definition"
...covering some recent work on Latin post-OCR correction and its implications for philology #nlproc #digiclass
Program for AI & the Study of Antiquity at Rutgers March 12 & 13, 2026
Attending the »AI & the Study of Antiquity« conference @ Rutgers University today and tomorrow...
We have a new article with Digital Classicist Online: Towards a smart edition of Apollodorus
Here is a report of some exploratory work transforming a traditional Greek textbook (Crosby and Schaeffer) into something truly digital. sites.tufts.edu/perseusupdat...
Just announced: Dickinson Summer Latin Workshop: Apuleius, Apologia (Pro se de magia). July 13–18, 2026 blogs.dickinson.edu/dcc/2026/01/...
Side-by-side CoNLL-U file in plaintext vs. formatted preview on an example sentence
Wanted an easier way to preview CoNLL-U files in VS Code, couldn't find one, worked up my own...
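As a rough illustration of what a formatted preview has to do, here is a stdlib-only Python sketch that parses CoNLL-U text (ten tab-separated fields per token, `#` comment lines, blank lines between sentences) and prints selected columns as an aligned table. It is unrelated to the extension's actual code: `parse_conllu`, `format_sentence`, and the sample sentence are all hypothetical, and multiword-token and empty-node lines get no special handling.

```python
# The ten column names defined by the CoNLL-U format.
FIELDS = ["ID", "FORM", "LEMMA", "UPOS", "XPOS", "FEATS", "HEAD", "DEPREL", "DEPS", "MISC"]


def parse_conllu(text):
    """Split CoNLL-U text into sentences, each a list of 10-field token rows."""
    sentences, rows = [], []
    for line in text.splitlines():
        if not line.strip():            # blank line closes a sentence
            if rows:
                sentences.append(rows)
                rows = []
        elif line.startswith("#"):      # sentence-level comment, skipped here
            continue
        else:
            cols = line.split("\t")
            if len(cols) != len(FIELDS):
                raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
            rows.append(cols)
    if rows:                            # file may end without a blank line
        sentences.append(rows)
    return sentences


def format_sentence(rows, columns=("ID", "FORM", "LEMMA", "UPOS", "HEAD", "DEPREL")):
    """Render the selected columns as a padded, left-aligned text table."""
    idx = [FIELDS.index(c) for c in columns]
    table = [list(columns)] + [[row[i] for i in idx] for row in rows]
    widths = [max(len(r[j]) for r in table) for j in range(len(columns))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip() for r in table
    )


# Hand-written example sentence, not taken from any treebank.
SAMPLE = (
    "# text = Puella rosam amat\n"
    "1\tPuella\tpuella\tNOUN\t_\tCase=Nom|Number=Sing\t3\tnsubj\t_\t_\n"
    "2\trosam\trosa\tNOUN\t_\tCase=Acc|Number=Sing\t3\tobj\t_\t_\n"
    "3\tamat\tamo\tVERB\t_\tNumber=Sing|Person=3\t0\troot\t_\t_\n"
)

if __name__ == "__main__":
    for sent in parse_conllu(SAMPLE):
        print(format_sentence(sent))
```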
There is still time to apply for the DCC Summer Online Internship program. Deadline is March 15. Please share with students you think might be interested!
blogs.dickinson.edu/dcc/2026/01/...
Building out this kind of ad-hoc tooling (relatively quickly!) is a clear pro of code models. And this isn't even ad hoc anymore; it's already part of my daily workflow
Would love to get some feedback on this. It's not on the marketplace yet: definitely beta, build-from-source, use-at-your-own-risk, etc. github.com/diyclassics/...