
Posts by Dirk Hovy

REVAS: AI-Powered Peer Review Feedback for Academics. REVAS analyzes the weakness section of your peer review, scoring each paragraph on actionability, helpfulness, grounding, and verifiability.

๐Ÿ—“๏ธ The ARR March review deadline is approaching: April 20 AoE.
Finishing up your review? Run it through REVAS, a peer review assistant that makes your suggestions more actionable, flags unsupported claims, and grounds your feedback in the paper.
๐Ÿ‘‰ revas.mbzuai.ac.ae

5 days ago 3 4 0 1

#MemoryModay #NLProc Uma et al. (2020) highlights 'A Case for Soft Loss Functions' efficacy using soft labels & crowd annotations in AI tasks, outshining top-tier methods.

2 weeks ago 4 3 0 0

To accommodate ACL decisions, we are further extending the commitment deadline for pre-reviewed ARR submissions to April 7!

2 weeks ago 4 4 0 0

The paper acceptance notifications will be out by the 6th of April, AoE. The PCs are working hard throughout the holiday season to finalize the decisions.

Apologies for the delay!

2 weeks ago 4 6 0 0
2026 Manchester Application and registration

The deadline for submission to the Political Networks conference is this Friday. It's taking place Aug 4-7, in Manchester. sites.google.com/view/confpol...

2 weeks ago 3 2 0 0

#TBT #NLProc '[MASK]? Making Sense of Language-Specific BERT Models' by @deboranozza.bsky.social, Bianchi & @dirkhovy.bsky.social (2020), explores language-specific vs universal BERT models.

2 weeks ago 4 2 0 0

- Optional: question your life choices but show up to do it again the next week anyway

3 weeks ago 4 1 0 0

I realized how much DMing is like being a professor/chairing a committee. You:
- make a brilliant plan for 2+ hours of fun
- prep lots of material
- immediately get derailed by questions/arguments/etc.
- keep it together to make the most of the time together
- end up not using most of the material

3 weeks ago 5 1 1 0
Hey Siri. Ok Google. Alexa: A topic modeling of user reviews for smart speakers Hanh Nguyen, Dirk Hovy. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). 2019.

#MemoryModay #NLProc 'Hey Siri. Ok Google. Alexa: A topic modeling of user reviews for smart speakers,' by Nguyen & @dirkhovy.bsky.social decodes speaker reviews for user preferences using topic models. Domain knowledge needed for market analysis.

3 weeks ago 4 2 0 0
A slide showing that the posterior is proportional to the likelihood times the prior

I wrote a blog post on my experience using AI for slide generation

Basic idea: write your lecture notes first, then prompt the LLM to produce corresponding slides in reveal.js (h/t @chenhaotan.bsky.social). I'm picky about my slides but was happy with the results!
alexanderhoyle.com/posts/ai-sli...

3 weeks ago 62 8 4 2
Identifying Linguistic Areas for Geolocation Tommaso Fornaciari, Dirk Hovy. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). 2019.

#TBT #NLProc Fornaciari, @dirkhovy.bsky.social's 'Identifying Linguistic Areas for Geolocation' explores using social media writing for geolocation via Point-to-City (P2C).

3 weeks ago 3 2 0 0

Wish I could be at @eaclmeeting.bsky.social, but the lab is well represented. If you are there, come and say hi!

3 weeks ago 2 1 0 0
Dense Node Representation for Geolocation Tommaso Fornaciari, Dirk Hovy. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). 2019.

#MemoryModay #NLProc 'Dense Node Representation for Geolocation' by Fornaciari & @dirkhovy.bsky.social reveals efficient geolocation methods using node2vec & doc2vec models. Greater network size, fewer parameters.

4 weeks ago 4 2 0 0
Geolocation with Attention-Based Multitask Learning Models Tommaso Fornaciari, Dirk Hovy. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). 2019.

#TBT #NLProc 'Geolocation with Attention-Based Multitask Learning Models' by Tommaso Fornaciari, @dirkhovy.bsky.social (2019) reveals how online political talks can become one-sided. Breaking out of our bubbles! #SocialMedia

1 month ago 3 2 0 0

Chapter 8: @dirkhovy.bsky.social, M Gerondeau & J Globisz on text data and natural language processing.
A very useful chapter on why text is such a rich source for CSS, and how NLP can help with exploration, prediction, and generation, if used thoughtfully and with clear research goals.

1 month ago 3 1 2 1

Just read this great piece - paulgp.com/2026/03/16/r... by @paulgp.com and it got me thinking.

It feels like there is a lot of moral(?) ambiguity and ambivalence around the use of LLMs for academics.

So far, I've avoided having LLMs do basically any of my research writing ...

1 month ago 6 2 2 0
The Social and the Neural Network: How to Make Natural Language Processing about People again Dirk Hovy. Proceedings of the Second Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media. 2018.

#MemoryModay #NLProc 'Make Natural Language Processing About People Again' by @dirkhovy.bsky.social (2018) uncovers how AI models portray different religions and emotions. #AIEthics

1 month ago 7 5 0 0

Joel Tetreault (not on here) also has a great talk on the topic, with lots of interesting anecdotes

1 month ago 3 2 0 0
Comparing Bayesian Models of Annotation Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio. Transactions of the Association for Computational Linguistics, Volume 6. 2018.

#MemoryModay #NLProc 'Comparing Bayesian Models of Annotation' by Paun et al. dives into corpus annotation, evaluating six models' predictiveness and accuracy. Essential for navigating annotators and item difficulties.

1 month ago 8 2 0 0

📢 Call for Abstracts!
Towards a Safer Web for Women (co-located with #WebSci26)
📍 Braunschweig 🇩🇪 | 26 May 2026
Theme: Preventive approaches to women's online safety
🗓 Deadline: 27 March 2026
🔗 forms.gle/tYheEgSwGecf...
tsww26.github.io

1 month ago 5 4 0 1
Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-Task Learning Sotiris Lamprinidis, Daniel Hardt, Dirk Hovy. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

#TBT #NLProc 'Predicting News Headline Popularity' by Lamprinidis, Hardt, @dirkhovy.bsky.social (2018) shows neural networks perform similarly to Logistic Regression in prediction.

1 month ago 3 2 0 0

One of my favorite studies of the last few years! Great read (albeit with a side of worrying implications for surveys)

1 month ago 6 2 0 0

One of my favorite interdisciplinary projects (with @questoph.bsky.social). Plus: colorful maps!

1 month ago 3 1 0 0
Capturing Regional Variation with Distributed Place Representations and Geographic Retrofitting Dirk Hovy, Christoph Purschke. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

#TBT #NLProc 'Capturing Regional Variation with Distributed Place Representations and Geographic Retrofitting' by @dirkhovy.bsky.social and Christoph Purschke (2018) highlights how social class and background impact technology performance. #TechInclusion

1 month ago 3 3 0 1

4/7 We argue these aren't separate bugs. They're four facets of the same problem:

🔴 Probabilistic: can't match requested distributions
🟠 Semantic: confidence ≠ correctness
🔵 Distributional: output diversity collapse
🟢 Metacognitive: can't assess its own competence

1 month ago 2 1 1 0

1/7 🧵 The GPT-4 technical report featured detailed calibration curves.

Since then, not a single major model release has reported calibration. The field quietly stopped measuring whether models know what they don't know.

Our new position paper argues this is a mistake. Here's why.

1 month ago 8 2 1 0

We were thrilled to host @mtutek.bsky.social at our lab last week.
His talk "From Internals to Integrity: How Insights into Transformer LMs Improve Safety, Interpretability, and Explanation Faithfulness" led to great discussions!
#Transformers #AISafety #ExplainableAI #MLResearch #NLProc

1 month ago 18 3 0 0
Call for Virtual Registration Subsidies Official website for the 2026 Conference of the European Chapter of the Association for Computational Linguistics

Call for Virtual Registration Subsidies for #EACL26

⚠️ Not for paper registrants
📝 Apply by Feb 27, 2026 (AoE)
📩 Decisions by Mar 2, 2026

2026.eacl.org/calls/virtua...

Don't register before hearing back if you apply!

1 month ago 6 5 1 0
Table titled “Taxonomy for evaluation of AI in mental health applications,” organized into columns for quality criteria (validity and reliability) and real-world use (implementation and maintenance). Rows distinguish support types: assessment, intervention, and information synthesis. Each cell lists detailed evaluation questions, such as construct and criterion validity, consistency across populations and time, feasibility, effectiveness, usability, acceptability, safety, and unintended consequences, providing a structured framework for assessing AI systems in mental health contexts.

🔎🧩 Beyond Benchmarks: How to Evaluate Mental Health AI Responsibly
AI for mental health is a high-stakes area: its evaluation needs to meet the highest expectations.

The new preprint "Responsible Evaluation of AI for Mental Health", written by an interdisciplinary team spanning AI [...]

2 months ago 3 3 1 0

Honored to give my first keynote at #IRCDL2026 on February 19th.

I'll talk about how LLMs have shifted from productivity tools to everyday sources of info & personal guidance and what that means for risk, trust, bias, and alignment.

ircdl2026.unimore.it

2 months ago 14 2 0 0