Posts by Lingjun Zhao

Senior Tenure Track Faculty at the Artificial Intelligence Interdisciplinary Institute at Maryland (AIM) - Associate Professor/Professor (Open Rank Joint Appointment) Job Description Summary Organization Summary Statement: The Artificial Intelligence Interdisciplinary Institute at Maryland - AIM (aim.umd.edu) - is hiring 40 faculty over the next several years, incl...

AIM's 2nd round of TTK hiring - building up to 30 - is up!

📅 Deadline 12/22/25
🔬 Accessibility & Learning, plus Sustainability & Social Justice
🧑‍🏫 Associate/Full Prof*
🔗 umd.wd1.myworkdayjobs.com/en-US/UMCP/j...

*Assistant-level candidates: apply to departments, mentioning AIM in a cover letter

5 months ago
A diagram illustrating pointwise scoring with a large language model (LLM). At the top is a text box containing instructions: 'You will see the text of a political advertisement about a candidate. Rate it on a scale ranging from 1 to 9, where 1 indicates a positive view of the candidate and 9 indicates a negative view of the candidate.' Below this is a green text box containing an example ad text: 'Joe Biden is going to eat your grandchildren for dinner.' An arrow points down from this text to an illustration of a computer with 'LLM' displayed on its monitor. Finally, an arrow points from the computer down to the number '9' in large teal text, representing the LLM's scoring output. This diagram demonstrates how an LLM directly assigns a numerical score to text based on given criteria

LLMs are often used for text annotation, especially in social science. In some cases, this involves placing text items on a scale: e.g., 1 for liberal and 9 for conservative.

There are a few ways to accomplish this task. Which work best? Our new EMNLP paper has some answers🧵
arxiv.org/pdf/2507.00828
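The pointwise setup in the diagram above can be sketched in a few lines. `call_llm` is a hypothetical stand-in for whatever chat-completion API you use; it is stubbed here so the snippet runs offline, and the prompt wording follows the diagram:

```python
# Minimal sketch of pointwise scoring: the LLM sees one item at a time
# and returns a single number on the scale.

INSTRUCTIONS = (
    "You will see the text of a political advertisement about a candidate. "
    "Rate it on a scale ranging from 1 to 9, where 1 indicates a positive "
    "view of the candidate and 9 indicates a negative view of the candidate. "
    "Reply with a single integer."
)

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would query an LLM API here.
    return "9"

def pointwise_score(ad_text: str) -> int:
    reply = call_llm(f"{INSTRUCTIONS}\n\nAd text: {ad_text}")
    score = int(reply.strip())
    if not 1 <= score <= 9:
        raise ValueError(f"score out of range: {score}")
    return score

print(pointwise_score("Joe Biden is going to eat your grandchildren for dinner."))
```

Other strategies the paper compares (e.g., pairwise comparison) would change the prompt and the parsing, but the scaffold is the same.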

5 months ago

Glad to hear ❤️

6 months ago

📄 Paper: arxiv.org/abs/2505.19299
💻 Code: github.com/lingjunzhao/PE…
🙏 Huge thanks to my advisor @haldaume3.bsky.social and everyone who shared insights!

6 months ago

🚨 New #EMNLP2025 (main) paper!
LLMs often produce inconsistent explanations (62–86%), hurting faithfulness and trust in explainable AI.
We introduce PEX consistency, a measure for explanation consistency,
and show that optimizing it via DPO improves faithfulness by up to 9.7%.
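DPO here is the standard direct preference optimization loss; a minimal sketch for one preference pair, where I assume the preferred response is the more consistent explanation (the pairing is my reading, not necessarily the paper's exact construction):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO objective for one preference pair.

    logp_* are the policy's log-probabilities of the preferred (w) and
    dispreferred (l) responses; ref_logp_* are the frozen reference
    model's log-probabilities of the same responses.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)), written stably as log1p(exp(-margin))
    return math.log1p(math.exp(-margin))
```

When the policy favors the consistent explanation more strongly than the reference model does, the margin is positive and the loss shrinks; at zero margin the loss is log 2.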

6 months ago
An Interdisciplinary Approach to Human-Centered Machine Translation Machine Translation (MT) tools are widely used today, often in contexts where professional translators are not present. Despite progress in MT technology, a gap persists between system development and...

What should Machine Translation research look like in the age of multilingual LLMs?

Here’s one answer from researchers across NLP/MT, Translation Studies, and HCI.
"An Interdisciplinary Approach to Human-Centered Machine Translation"
arxiv.org/abs/2506.13468

10 months ago
QANTA Logo: Question Answering is not a Trivial Activity

[Humans and computers competing on a buzzer]

Do you like trivia? Can you spot when AI is feeding you BS? Or can you make AIs turn themselves inside out? Then on June 14 at College Park (or June 21 online), we have a competition for you.

10 months ago

Super thankful for my wonderful collaborators: @pcascanteb.bsky.social @haldaume3.bsky.social Mingyang Xie, Kwonjoon Lee

11 months ago

We introduce a super simple yet effective strategy to improve video-language alignment (+18%): add hallucination correction in your training objective👌
Excited to share our accepted paper at ACL: Can Hallucination Correction Improve Video-language Alignment?
Link: arxiv.org/abs/2502.15079
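A conceptual sketch of what "hallucination correction in the training objective" could look like: alongside the usual video-text alignment loss, train the model to rank a corrected caption above its hallucinated variant. The hinge form, function names, and weights are illustrative assumptions, not the paper's exact formulation:

```python
def hinge(x: float) -> float:
    return max(0.0, x)

def combined_loss(align_loss: float,
                  score_corrected: float,
                  score_hallucinated: float,
                  margin: float = 0.2,
                  lam: float = 1.0) -> float:
    # Correction term: penalize the model whenever the hallucinated
    # caption scores within `margin` of the corrected one.
    correction_loss = hinge(margin - (score_corrected - score_hallucinated))
    return align_loss + lam * correction_loss
```

The correction term vanishes once the corrected caption clears the hallucinated one by the margin, so the alignment objective dominates for well-separated pairs.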

11 months ago

For the ACL ARR review, I’ve heard complaints about the workload—some reviewers have 16 papers. Even though I only need to write 1 rebuttal and respond to 4, it still feels substantial. For those managing more (thank you!), it can be difficult to thoroughly engage with every rebuttal.

1 year ago
Page one of diff.

Page 2 of diff.

Page 3 of diff.

There is a new version of the Research Plan for NIST's AI Safety Consortium (AISIC) in response to EOs. I did a diff.

Out: safety, responsibility, sociotechnical, fairness, working with federal agencies, authenticating content, watermarking, RN of CBRN, autonomous replication, control of physical systems
>
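A diff like this is easy to reproduce with Python's standard difflib once both versions of the plan are converted to plain text. The toy lines below are placeholders, not the actual documents:

```python
import difflib

# Hypothetical excerpts standing in for the two Research Plan versions.
old = ["advance AI safety", "watermarking synthetic content", "fairness"]
new = ["advance AI evaluation", "measurement science"]

# unified_diff yields header lines, then "-" for removals and "+" for additions.
for line in difflib.unified_diff(old, new, fromfile="plan_v1", tofile="plan_v2", lineterm=""):
    print(line)
```

For real PDFs you would extract text first (e.g., with a PDF-to-text tool) and diff the resulting line lists the same way.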

1 year ago
Preview
Causal Effect of Group Diversity on Redundancy and Coverage in Peer-Reviewing A large host of scientific journals and conferences solicit peer reviews from multiple reviewers for the same submission, aiming to gather a broader range of perspectives and mitigate individual biase...

This is my first time serving as an AC for a big conference.

Just read this great work by Goyal et al. arxiv.org/abs/2411.11437

I'm optimizing for high coverage and low redundancy—assigning reviewers based on relevant topics or affinity scores alone feels off. Seniority and diversity matter!
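One toy way to operationalize "high coverage, low redundancy" is a greedy selection that rewards newly covered topics instead of raw affinity alone. This is only an illustration of the idea, not the method or analysis from Goyal et al.; the reviewer names and topic sets are invented:

```python
def greedy_cover(paper_topics, reviewers, k=3):
    """Pick up to k reviewers, each maximizing newly covered paper topics."""
    chosen, covered = [], set()
    pool = dict(reviewers)
    while len(chosen) < k and pool:
        # Best = reviewer adding the most not-yet-covered relevant topics.
        best = max(pool, key=lambda r: len(pool[r] & (paper_topics - covered)))
        chosen.append(best)
        covered |= pool.pop(best) & paper_topics
    return chosen, covered

reviewers = {
    "A": {"MT", "LLMs"},
    "B": {"MT", "LLMs"},          # redundant with A: never adds coverage
    "C": {"evaluation", "HCI"},
}
picked, covered = greedy_cover({"MT", "LLMs", "evaluation"}, reviewers, k=2)
```

With affinity scores alone, A and B would both rank highly despite covering identical topics; the greedy pick skips B in favor of C, which covers the otherwise-missed evaluation angle.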

1 year ago