Posts by Tokenization Workshop (TokShop) @ICML2025

🎥 Videos of our invited talks and the panel discussion are now also available on YouTube: www.youtube.com/@tokenizatio... ▶️

7 months ago 6 2 0 0
Tokenization Workshop (TokShop) ICML 2025

🎥 Videos from our Tokenization Workshop are now live! Watch invited talks, panel discussions, and the best paper presentation at icml.cc/virtual/2025... #Tokenization #NLP #LLMs

7 months ago 16 7 1 1

πŸ† Announcing our Best Paper Awards!
πŸ₯‡ Winner: "BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization" openreview.net/forum?id=AO7...
πŸ₯ˆ Runner-up: "One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression" openreview.net/forum?id=lC4...
Congrats! πŸŽ‰

9 months ago 14 0 0 1

🔥 The Tokenization Workshop is happening NOW, and we have a packed room! It's great to see so much interest in tokenization research. #ICML2025 #Tokenization #LLM #NLP

9 months ago 11 0 0 0

Three invited speakers will share their insights at TokShop! Hear from Yuval Pinter @uvp.bsky.social, Desmond Elliott @delliott.bsky.social, and Adrian Łańcucki on cutting-edge tokenization research. Don't miss these keynote presentations! #ICML2025 tokenization-workshop.github.io/speakers

9 months ago 9 2 0 0

🎤 Meet our expert panelists! Join Albert Gu, Alisa Liu, Kris Cao, Sander Land, and Yuval Pinter as they discuss the Future of Tokenization on July 18 at 3:30 PM at TokShop at #ICML2025.

9 months ago 7 2 0 2

The TokShop schedule is now live! Join us at #ICML2025 for invited talks, poster sessions, and a panel on the future of tokenization. tokenization-workshop.github.io/schedule #Tokenization #LLM #NLP

9 months ago 11 3 0 1
TokShop 2025: Registering interest in all things tokenization at TokShop @ ICML 2025 (July 18). Consider joining the Google group for future updates! https://groups.google.com/g/tokshop

TokShop @ #ICML2025 got way more submissions than expected! 📈 We could really use a few more reviewers to help out. If you have the capacity to review a #tokenization paper by Saturday, please fill out this form: forms.gle/32A6sQHQrMSb... 🙏

10 months ago 0 4 0 2

📣 We are extending the submission deadline by 24 hours to avoid a conflict with the ACL camera-ready deadline.

📅 New Submission Deadline: May 31, 2025 (23:59 AoE)

📩 OpenReview: openreview.net/group?id=ICM...

10 months ago 1 1 0 0
Got a good tokenization paper under review at COLM, but the scores were a letdown? 😬

Why bother with a rebuttal when the perfect venue is right around the corner?

Submit your paper to the #ICML2025 Tokenization Workshop (TokShop) by May 30! 🚀

10 months ago 11 4 0 0

Beyond text: Modern AI tokenizes images too! Vision models split photos into patches, treating each 16x16 pixel square as a "token." 🖼️➡️🔤 #VisualTokenization

Interested in tokenization? Join our workshop tokenization-workshop.github.io
The submission deadline is already May 30!
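
To make the patch idea concrete, here is a minimal sketch of how an image could be cut into 16x16 "tokens". It assumes a grayscale image stored as a plain list of rows whose sides are multiples of 16; the function name `patchify` and the toy image are illustrative only. Real vision models typically go on to project each flattened patch through a learned embedding, which is not shown here.

```python
def patchify(image, patch=16):
    """Split an H x W image (list of rows) into flattened patches, in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # One "token" = all pixels of one patch, flattened row by row.
            tokens.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return tokens


# Hypothetical 32 x 48 image; each pixel value encodes its own position.
image = [[r * 48 + c for c in range(48)] for r in range(32)]
tokens = patchify(image)
print(len(tokens), len(tokens[0]))  # 6 patches, 256 pixel values each
```

A 224x224 input handled the same way would yield 14 x 14 = 196 patches.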

10 months ago 4 2 0 0

Got a tokenization paper rejected from ACL? Didn't submit to EMNLP/NeurIPS? Want to present your ACL/EMNLP/NeurIPS work non-archivally? Submit to TokShop @ ICML 2025!
The deadline is already May 30!
openreview.net/group?id=ICM...
tokenization-workshop.github.io

10 months ago 3 2 0 0
Tokenization Workshop @ ICML 2025

Language matters: low-resource languages are severely overtokenized. While English uses ~1.2 tokens per word, Tamil, for example, can require more tokens than characters, making #LLMs much costlier for billions of speakers! 💸🌍

Check out our ICML workshop 🔗 tokenization-workshop.github.io
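
One way "more tokens than characters" can happen is when a tokenizer falls back to raw UTF-8 bytes for a script it has few merges for. The sketch below uses a pure byte-level tokenizer as a stand-in (an assumption, not the tokenizer of any particular model) and measures tokens per word and tokens per character. It does not reproduce the ~1.2 figure above, which depends on a trained vocabulary.

```python
def byte_tokens(text):
    """Stand-in tokenizer (an assumption): one token per UTF-8 byte."""
    return list(text.encode("utf-8"))


def tokens_per_word(text):
    """Average number of tokens per whitespace-separated word."""
    return len(byte_tokens(text)) / len(text.split())


def tokens_per_char(text):
    """Average number of tokens per Unicode code point."""
    return len(byte_tokens(text)) / len(text)


english = "tokenization"
# The word "Tamil" written in Tamil script: 5 code points, 3 UTF-8 bytes each.
tamil = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

for label, text in [("English", english), ("Tamil", tamil)]:
    print(label, tokens_per_word(text), tokens_per_char(text))
```

With byte fallback, every Tamil code point costs three tokens, while an ASCII letter costs one; learned merges shrink both numbers, but usually far more for English.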

11 months ago 3 0 0 0

Did you know BPE (Byte Pair Encoding), the most common LLM tokenizer, was originally a compression algorithm from 1994? #Tokenization #LLM #NLP

Want to find out more about tokenization? Attend our workshop at ICML! tokenization-workshop.github.io
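
For the curious, the compression idea can be sketched in a few lines: repeatedly find the most frequent adjacent pair of symbols and replace it with a single new symbol. This is a simplified illustration, not the original implementation; here the new symbol is just the concatenation of the pair, whereas a byte-level compressor would substitute an unused byte. The function name `bpe_compress` is hypothetical.

```python
from collections import Counter


def bpe_compress(text, num_merges):
    """Greedily merge the most frequent adjacent symbol pair, up to num_merges times."""
    symbols = list(text)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so merging would not shorten anything
        merges.append((left, right))
        merged, out, i = left + right, [], 0
        while i < len(symbols):
            # Replace each non-overlapping occurrence of the pair, left to right.
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                out.append(merged)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols, merges


symbols, merges = bpe_compress("aaabdaaabac", 3)
print(symbols)  # ['aaab', 'd', 'aaab', 'a', 'c']
print(merges)
```

Tokenizers for language models reuse the same loop, but learn the merge list once on a training corpus and then apply it to new text.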

11 months ago 0 0 0 0
ICML 2025 Workshop TokShop Welcome to the OpenReview homepage for ICML 2025 Workshop TokShop

πŸ“ Submit papers (up to 9 pages, shorter submission ) via OpenReview: openreview.net/group?id=ICM...

πŸ—“οΈ Important dates:
Deadline: May 30, 2025
Notifications: June 9, 2025
Workshop: July 18, 2025
Both archival and non-archival options available! #ICML2025 #TokShop #ML #NLP

11 months ago 0 0 0 0

📣 Call for Papers Alert: TokShop @ ICML 2025
TokShop explores tokenization across all data modalities. Topics include: subword NLP techniques, multimodal approaches, multilingual challenges, post-training modification, alternative representations, and statistical perspectives.

11 months ago 18 12 1 2

Got a tokenization paper that just didn't make the cut for ICML? Submit it to the Tokenization Workshop TokShop at #ICML2025 -- we'd love to see it there!
tokenization-workshop.github.io

11 months ago 7 6 0 0

TokShop is organized by an amazing team of researchers passionate about tokenization:
@tomlim.bsky.social, @valentinhofmann.bsky.social, @shocheen.bsky.social, @jlibovicky.bsky.social, @jindrahelcl.bsky.social, @orevaahia.bsky.social,
@esalesky.bsky.social, @smfsamir.bsky.social

1 year ago 3 0 0 0

In the upcoming weeks, we will announce an exciting line-up of invited talks and panelists. Follow our account @tokshop.bsky.social to stay tuned.

Join us at TokShop at #ICML2025!

1 year ago 2 0 1 0
We're looking for papers on tokenization in text, vision, audio, multimodal, and more.

πŸ“ Up to 9 pages (shorter welcome!)
πŸ” Double-blind review
πŸ“š Archival and non-archival options available

1 year ago 1 0 1 0

There has been a lot of chatter about tokenization for LLMs over the last few months, but tokenization goes beyond text-based models.

It's time we bring the NLP and ML communities together to explore this foundational topic. Let's talk about tokenization at TokShop!

1 year ago 2 0 1 0

🚨 NEW WORKSHOP ALERT 🚨

We're thrilled to announce the first-ever Tokenization Workshop (TokShop) at #ICML2025 @icmlconf.bsky.social! 🎉

Submissions are open for work on tokenization across all areas of machine learning.

📅 Submission deadline: May 30, 2025
🔗 tokenization-workshop.github.io

1 year ago 23 7 1 4