🏎️ Dataset: huggingface.co/datasets/Mik...
📝 Cohere's model: cohere.com/transcribe
Posts by Michele Ciletti
The dataset is available on Hugging Face. Feel free to take a look if you're testing audio models or if you're just interested in studying how people communicate under pressure.
(Or if you just want to see who talks the most on the radio - the data says Lewis and Max, unsurprisingly).
I've been wanting to test out Cohere Labs' new transcription model, and I figured F1 team radios would be a challenging benchmark: drivers speak with accents, under great physical stress, over loud background noise, all while using very niche technical terms. The model held up surprisingly well!
Fun weekend experiment: I built a dataset of ~15,000 Formula 1 team radio messages, with audio recordings and automatic transcriptions! 🏎️
(thread below)
Our paper on women's experiences in video games got posted to a GamerGate subreddit. The comments proved our point better than our data did. [thread]
Check out an amazing new article by @nadiadileo.bsky.social on postdigital feminism, women's representation in video games, and intersectional experiences in gaming communities. Out now in Postdigital Science and Education! link.springer.com/article/10.1...
[...] for perception verbs in Latin and Ancient Greek, while also testing these methods on Italian.
The project will deliver new open data and a training resource on how to integrate LLMs in linguistic research. Many thanks to H2IOSC and its funders for supporting this work!
I'm glad to share that I have been awarded an H2IOSC Transnational Mobility Grant for a research stay at King’s College London! @kingsdh.bsky.social
During the visit, I will work with Barbara McGillivray and Andrea Farina to develop a semi-automatic semantic annotation pipeline [...]
Excited to speak at "AI and Digital Humanities: Methodological Approaches, Theories and Methods" at the Univ. of Siena on Nov. 5!
I’ll discuss how historical linguistics can partner with AI, presenting a study on Latin prosody. Check out the program and join us online: www.unisi.it/unisilife/ev...
This week I had a lot of fun in Cagliari at #clicit2025!
I presented a paper titled "Veras Audire Et Reddere Voces: A Corpus of Prosodically-Correct Latin Poetic Audio from Large-Language-Model TTS". Check out the pre-proceedings here: clic2025.unica.it/pre-proceedi...
Excited to be in Cagliari for #clicit2025! I'll be presenting a paper on text-to-speech LLMs and Latin prosody - check out the full program!
✨ We're kicking off the #DraCorSummit 2025 in Berlin today!
5 days of digital drama analysis, enhancing #DraCor, and celebrating #OpenScience with participants from around the world at @freieuniversitaet.bsky.social.
Programme:
summit.dracor.org
@temporal-communities.de @unipotsdam.bsky.social
Poster, slides and video presentation are all available on Zenodo: doi.org/10.5281/zeno...
Next week, I will be attending #ACL2025 to present a paper on generating prosodically-accurate Latin poetry recordings with LLMs. Pretty excited to be there for the first time! Check it out: aclanthology.org/2025.acl-srw...
I had a lot of fun presenting a paper on tracing temporal changes in communities during transition periods via network analysis - starting from a case study on a WWII-era newspaper - at #DH2025. My slides are available on Zenodo for anyone interested! doi.org/10.5281/zeno...
Many interesting insights and approaches were discussed this afternoon - and a new special collection on language data reuse was announced! Learn more here: openhumanitiesdata.metajnl.com #DH2025
Final day at #DH2025 - do not miss an awesome panel on open language data sharing & reuse! Pass by Auditorium B3 at 14:00 if you are interested.
And now our @anastasiaglawion.bsky.social and Dhara Lechner presenting on editorial shifts in the Grimm brothers' tales! Focusing especially on the relation between speech acts and gender 🤩🤩 Such an exciting project!!! #dh2025 @dhssfau.bsky.social
Our first panel focused on looking at time in linguistic data: Michele Ciletti presented the corpus "The Foggia Occupator" #dh2025
First day in Lisbon at #DH2025! If you're interested in temporal data, pass by room B203 for the awesome mini-conference "The Times They Are A Changin in Digital Humanities"
Then, at the 25th Annual APGRD Symposium (Oxford & Royal Holloway), I gave a talk on “Pollution Networks in Greek Tragedy” during the opening panel on miasma.
Came home with a pile of notes and a longer reading list - it was great to meet such passionate Classics communities!
Just back from a week in the UK where I took part in two awesome Classics events!
First I attended the workshop “Data-Driven Classics: Interdisciplinary Connections through Shared Data” at King's College London.
Just wrapped up an enriching few days at "Youth and Horror: An International Conference", where I presented 'Fear Frames: Catharsis through Horror Manga and Anime.' The global reach of this conference provided incredible insights into how different cultures approach horror media and youth engagement!
Starting our sneak peek at the #Miniconference on #TemporalData!
Panel 1: @mikcil.bsky.social
Michele Ciletti uses NER & network analysis to trace shifting narratives and growing ties between the military & locals in The Foggia Occupator (1945–46), a US military newspaper in postwar Italy.
#DH2025
Musisque Deoque is amazing! My favourite feature is that you can export XML files with fully annotated metrical scansions
It was great to take part in #ACH2025 - probably the only DH conference with a dedicated 2D world and pixel avatars! I presented two papers broadly focused on community - one on academic connections on Mastodon, and the other on the newly published Foggia Occupator Dataset.
Thrilled to be in Verona at #AIUCD2025 - I'll be presenting a poster this afternoon on RAG, digital archives and historical newspapers. Come say hi! @aiucd.bsky.social
#AIUCD2025 in numbers
The AIUCD 2025 proceedings are available on Umanistica Digitale: amsacta.unibo.it/id/eprint/83... (more than 660 pp! 😵‍💫)