Posts by Kathy

How Important is ‘Perfect’ English for Machine Translation Prompts?
by @patuchen.bsky.social, @niyatibafna.bsky.social, @sethjsa.bsky.social, @gianlucavico.bsky.social, W. Kamzela, @kathaem.bsky.social & @zouharvi.bsky.social
aclanthology.org/2026.finding...
TL;DR: Prompt errors < prompt choice

4 weeks ago 16 2 1 2

This is indeed delightful, thanks for posting! Their channel seems to have whole albums' worth of similar songs but not sure if any of the others have subtitles or dancing 💃

5 months ago 1 0 1 0

@aclanthology.org not sure where to report, but in the last few months I've often had issues with long loading times/timeouts on aclanthology.org. It's particularly bad today---maybe related to the upcoming ARR deadline?

6 months ago 0 0 1 0

Idk about "primarily" mate

8 months ago 1 0 0 0

You mean the most popular *US* politicians on this list

8 months ago 0 0 0 0

Personally, sleeping more and vitamin D in the winter.

...sorry, not much of a baker

8 months ago 0 0 0 0

@aclrollingreview.bsky.social Why is the reviewing window (still) so short this cycle? Wasn't the cycle extended to ten weeks specifically to make the process more manageable? Wasn't it three weeks in past cycles? Instead reviewers don't even get two full weeks to handle 4+ submissions.

10 months ago 0 0 0 0
TokShop 2025 Registering interest in all things tokenization at TokShop @ ICML 2025 (July 18) Consider joining the Google group for future updates! https://groups.google.com/g/tokshop

TokShop @ #ICML2025 got way more submissions than expected! 📈 We could really use a few more reviewers to help out. If you have the capacity to review a #tokenization paper by Saturday, please fill out this form: forms.gle/32A6sQHQrMSb... 🙏

10 months ago 0 4 0 2

Beyond text: Modern AI tokenizes images too! Vision models split photos into patches, treating each 16x16 pixel square as a "token." 🖼️➡️🔤 #VisualTokenization

Interested in tokenization? Join our workshop tokenization-workshop.github.io
The submission deadline is already May 30!

10 months ago 4 2 0 0
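
A rough sketch of the patch idea from the post above, not any particular model's code. The function name `patchify` and the toy image are made up for illustration, and it assumes the image sides are exact multiples of the patch size. In practice each flattened patch would typically then be projected into an embedding vector, which is not shown here.

```python
def patchify(image, patch=16):
    """Split an H x W image (a list of pixel rows) into non-overlapping tiles.

    Returns the tiles in row-major order, each flattened into one list of
    pixels. Hypothetical helper: assumes H and W are multiples of `patch`.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Read the tile row by row so each "token" is one flat list.
            tile = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            tokens.append(tile)
    return tokens


# Toy 32 x 48 grayscale image whose pixel value encodes its position.
image = [[r * 48 + c for c in range(48)] for r in range(32)]
tokens = patchify(image)
print(len(tokens), len(tokens[0]))  # 6 256
```

So a 32 x 48 image becomes 6 "tokens" of 256 pixels each, which is why the sequence length grows quickly with image resolution.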

I'll be presenting this paper in Gather Town (Session 1) in a few hours 🎊 Come along!

11 months ago 0 0 0 0
When ChatGPT Broke an Entire Field: An Oral History | Quanta Magazine Researchers in “natural language processing” tried to tame human language. Then came the transformer.

This is a fantastic oral history of the last 10 years of NLP and AI. www.quantamagazine.org/when-chatgpt...

11 months ago 94 29 2 4

As a second language English speaker this also confused me for so long. Eventually I decided it must be from the phrase "having cake" which also means eating the cake

1 year ago 0 0 0 0
Me posing with my poster

The tour guide standing next to a statue of Professor Lichtenberg.

A slide of the vocabulary learning algorithm "SaGe"

Just spent two days in Göttingen at #HumanCLAIM workshop! Re-presented my poster on surveying methods for cross-lingual representation alignment, got a city tour, heard cool talks and had interesting conversations 💬💭

1 year ago 5 0 0 0

Oh very nice to see a paper for this intuition, and the data could be very useful! Adding to the reading list 👀

1 year ago 0 0 0 0
Figure 1: Eflomal score (bottom), a measure of token alignability, predicts downstream transfer performance better than the previous metric of distributional token overlap (top). The difference is especially stark for language pairs with different scripts (•), compared to language pairs with the same script (×). The orange line shows the linear fit across all included pairs.

Alignability is more predictive of cross-lingual transfer than divergence of literal token distributions, particularly for language pairs with disparate scripts.

1 year ago 2 0 0 0

Basically we argue that token overlap measures for predicting multilingual performance are too literal, and introduce the notion of **token alignability**, which can be measured via the scores of a statistical aligner over a corpus tokenised with a given tokeniser.

1 year ago 3 0 1 0
Beyond Literal Token Overlap: Token Alignability for Multilinguality Previous work has considered token overlap, or even similarity of token distributions, as predictors for multilinguality and cross-lingual knowledge transfer in language models. However, these very li...

Happy to say that our paper "Beyond Literal Token Overlap: Token Alignability for Multilinguality" will be presented at #NAACL2025!

This is work with @tomlim.bsky.social, @jlibovicky.bsky.social, and Alex Fraser.

arxiv.org/abs/2502.06468

#newpaper #NLP #NLProc

1 year ago 10 2 1 2

Following the MT Marathon, we're hosting a hackathon in Prague. Researchers and students from five institutions (+1 online) are working together to assess how robust #LLMs are to grammar errors in machine translation and related tasks. Thanks to EAMT for their support.

1 year ago 18 2 0 0

@queerinai.com Hi, I was invited to review for the workshop the other day but the email is not clear on when reviews will be due. This info will be important to decide if I'm able to serve; can you share the deadlines? Thanks!

1 year ago 0 0 0 0

📌

1 year ago 0 0 0 0

Gotta say I'm not sure what pronunciation "luh-BOEV" is referring to but in my head it sounds like French beef

1 year ago 9 0 1 0

Germany. a) ground floor b) first floor. This matches how we count in German but the German terms basically treat the "upper floors" separately from the "ground floor"

1 year ago 0 0 0 0

Bill Labov died this morning. I'm not coherent enough to talk about how important and influential and brilliant he was. I am very sad.

I was so lucky to know him, and I am grateful every day that he (and Gillian, and Walt, etc) built an academic field where kindness is expected.

1 year ago 699 120 24 25

To add to the reviewing complaints 😅 Why do authors so often respond with an absolute wall of text? (Biggest response I got this time was four comments long.) As a reviewer, I find this very tough to engage with in the short discussion period, and as an author, I try to be concise in my responses.

1 year ago 0 0 0 0

5k is a small town, honestly 😂

1 year ago 2 0 0 0

Just wanted to say a quick thank you for organising a lovely social! 🎊🌈

1 year ago 0 0 1 0

Right now the app is being very laggy though?

1 year ago 2 0 0 0

Today I finally deactivated my Twitter account (not that I'd been super active there but hey) and decided to check out Bluesky. Looks like there's already a LOT of people here!

1 year ago 5 0 1 0