Dirk Hovy (@dirkhovy) Bsky

REVAS — AI-Powered Peer Review Feedback for Academics REVAS analyzes the weakness section of your peer review, scoring each paragraph on actionability, helpfulness, grounding, and verifiability.

🗓️ The ARR March review deadline is approaching: April 20 AoE.
Finishing up your review? Run it through REVAS, a peer review assistant that makes your suggestions more actionable, flags unsupported claims, and grounds your feedback in the paper.
👉 revas.mbzuai.ac.ae

5 days ago

#MemoryModay #NLProc Uma et al. (2020), 'A Case for Soft Loss Functions', highlights the efficacy of soft labels & crowd annotations in AI tasks, outshining top-tier methods.

2 weeks ago

To accommodate ACL decisions, we are further extending the commitment deadline for pre-reviewed ARR submissions to April 7!

2 weeks ago

The paper acceptance notifications will be out by the 6th of April, AoE. The PCs are working hard throughout the holiday season to finalize the decisions.

Apologies for the delay!

2 weeks ago

2026 Manchester Application and registration

The deadline for submission to the Political Networks conference is this Friday. It's taking place Aug 4-7, in Manchester. sites.google.com/view/confpol...

2 weeks ago

#TBT #NLProc '[MASK]? Making Sense of Language-Specific BERT Models' by @deboranozza.bsky.social, Bianchi & @dirkhovy.bsky.social (2020), explores language-specific vs universal BERT models.

2 weeks ago

- Optional: question your life choices but show up to do it again the next week anyway

3 weeks ago

I realized how much DMing is like being a professor/chairing a committee. You:
- make a brilliant plan for 2+ hours of fun
- prep lots of material
- immediately get derailed by questions/arguments/etc.
- keep it together to make the most of the time together
- end up not using most of the material

3 weeks ago

Hey Siri. Ok Google. Alexa: A topic modeling of user reviews for smart speakers Hanh Nguyen, Dirk Hovy. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). 2019.

#MemoryModay #NLProc 'Hey Siri. Ok Google. Alexa: A topic modeling of user reviews for smart speakers,' by Nguyen & @dirkhovy.bsky.social decodes speaker reviews for user preferences using topic models. Domain knowledge needed for market analysis.

3 weeks ago

A slide showing that the posterior is proportional to the likelihood times the prior

I wrote a blog post on my experience using AI for slide generation

Basic idea: write your lecture notes first, then prompt the LLM to produce corresponding slides in reveal.js (h/t @chenhaotan.bsky.social). I'm picky about my slides but was happy with the results!
alexanderhoyle.com/posts/ai-sli...

3 weeks ago

Identifying Linguistic Areas for Geolocation Tommaso Fornaciari, Dirk Hovy. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). 2019.

#TBT #NLProc Fornaciari, @dirkhovy.bsky.social's 'Identifying Linguistic Areas for Geolocation' explores using social media writing for geolocation via Point-to-City (P2C).

3 weeks ago

Wish I could be at @eaclmeeting.bsky.social, but the lab is well represented. If you are there, come and say hi!

3 weeks ago

Dense Node Representation for Geolocation Tommaso Fornaciari, Dirk Hovy. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). 2019.

#MemoryModay #NLProc 'Dense Node Representation for Geolocation' by Fornaciari & @dirkhovy.bsky.social reveals efficient geolocation methods using node2vec & doc2vec models. Greater network size, fewer parameters.

4 weeks ago

Geolocation with Attention-Based Multitask Learning Models Tommaso Fornaciari, Dirk Hovy. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). 2019.

#TBT #NLProc 'Geolocation with Attention-Based Multitask Learning Models' by Tommaso Fornaciari, @dirkhovy.bsky.social (2019) reveals how online political talks can become one-sided. Breaking out of our bubbles! #SocialMedia

1 month ago

Chapter 8: @dirkhovy.bsky.social, M Gerondeau & J Globisz on text data and natural language processing.
A very useful chapter on why text is such a rich source for CSS, and how NLP can help with exploration, prediction, and generation, if used thoughtfully and with clear research goals.

1 month ago

Just read this great piece - paulgp.com/2026/03/16/r... by @paulgp.com and it got me thinking.

It feels like there is a lot of moral(?) ambiguity and ambivalence around the use of LLMs for academics.

So far, I've avoided having LLMs do basically any of my research writing ...

1 month ago

The Social and the Neural Network: How to Make Natural Language Processing about People again Dirk Hovy. Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media. 2018.

#MemoryModay #NLProc 'Make Natural Language Processing About People Again' by @dirkhovy.bsky.social (2018) uncovers how AI models portray different religions and emotions. #AIEthics

1 month ago

Joel Tetreault (not on here) also has a great talk on the topic, with lots of interesting anecdotes

1 month ago

Comparing Bayesian Models of Annotation Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio. Transactions of the Association for Computational Linguistics, Volume 6. 2018.

#MemoryModay #NLProc 'Comparing Bayesian Models of Annotation' by Paun et al. dives into corpus annotation, evaluating six models' predictiveness and accuracy. Essential for navigating annotators and item difficulties.

1 month ago

📢 Call for Abstracts!
Towards a Safer Web for Women (co-located with #WebSci26)
📍 Braunschweig 🇩🇪 | 26 May 2026
Theme: Preventive approaches to women’s online safety
🗓 Deadline: 27 March 2026
🔗 forms.gle/tYheEgSwGecf...
🌐 tsww26.github.io

1 month ago

Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-Task Learning Sotiris Lamprinidis, Daniel Hardt, Dirk Hovy. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

#TBT #NLProc 'Predicting News Headline Popularity' by Lamprinidis, Hardt, @dirkhovy.bsky.social (2018) shows neural networks perform similarly to Logistic Regression in prediction.

1 month ago

One of my favorite studies of the last few years! Great read (albeit with a side of worrying implications for surveys)

1 month ago

One of my favorite interdisciplinary projects (with @questoph.bsky.social). Plus: colorful maps!

1 month ago

Capturing Regional Variation with Distributed Place Representations and Geographic Retrofitting Dirk Hovy, Christoph Purschke. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

#TBT #NLProc 'Capturing Regional Variation with Distributed Place Representations and Geographic Retrofitting' by @dirkhovy.bsky.social and Christoph Purschke (2018) highlights how social class and background impact technology performance. #TechInclusion

1 month ago

4/7 We argue these aren't separate bugs. They're four facets of the same problem:

🔴 Probabilistic — can't match requested distributions
🟠 Semantic — confidence ≠ correctness
🔵 Distributional — output diversity collapse
🟢 Metacognitive — can't assess its own competence

1 month ago

1/7 🧵 The GPT-4 technical report featured detailed calibration curves.

Since then, not a single major model release has reported calibration. The field quietly stopped measuring whether models know what they don't know.

Our new position paper argues this is a mistake. Here's why.

1 month ago

We were thrilled to host @mtutek.bsky.social at our lab last week.
His talk "From Internals to Integrity: How Insights into Transformer LMs Improve Safety, Interpretability, and Explanation Faithfulness" led to great discussions! 👏
#Transformers #AISafety #ExplainableAI #MLResearch #NLProc

1 month ago

Call for Virtual Registration Subsidies. Official website for the 2026 Conference of the European Chapter of the Association for Computational Linguistics

Call for Virtual Registration Subsidies for #EACL26 🌍

⚠️ Not for paper registrants
📝 Apply by Feb 27, 2026 (AoE)
📩 Decisions by Mar 2, 2026

2026.eacl.org/calls/virtua...

Don’t register before hearing back if you apply!

1 month ago

Table titled “Taxonomy for evaluation of AI in mental health applications,” organized into columns for quality criteria (validity and reliability) and real-world use (implementation and maintenance). Rows distinguish support types: assessment, intervention, and information synthesis. Each cell lists detailed evaluation questions, such as construct and criterion validity, consistency across populations and time, feasibility, effectiveness, usability, acceptability, safety, and unintended consequences, providing a structured framework for assessing AI systems in mental health contexts.

🔎🧩 Beyond Benchmarks: How to Evaluate Mental Health AI Responsibly
AI for mental health is a high-stakes area: its evaluation needs to meet the highest expectations.

The new preprint Responsible Evaluation of AI for Mental Health, written by an interdisciplinary team spanning AI [...]

2 months ago

Honored to give my first keynote at #IRCDL2026 on February 19th.

I’ll talk about how LLMs have shifted from productivity tools to everyday sources of info & personal guidance and what that means for risk, trust, bias, and alignment.

ircdl2026.unimore.it

2 months ago
