
Posts by Ondra Platek


New preprint: arxiv.org/abs/2603.20133

Does performance on reasoning benchmarks transfer to real-world settings such as task-oriented dialogue?

Not necessarily: our new benchmark tests LLMs on problems framed in both standalone and dialogue settings, and shows that dialogue makes reasoning harder.

4 weeks ago 10 5 1 1

Endure: Mind, Body, and the Curiously Elastic Limits of Human Performance by Alex Hutchinson.

Good theory to read before learning anything from any Dummies-series book

2 months ago 0 0 0 0

Interesting, I did not know that arXiv requires peer-reviewed papers
arxiv.org/abs/2601.17036

2 months ago 3 0 0 1
Pulse: Community News App - App Store Download Pulse: Community News by BottleCap AI on the App Store. See screenshots, ratings and reviews, user tips and more games like Pulse: Community News.

apps.apple.com/cz/app/pulse...

2 months ago 0 0 0 0
Jaroslav Beck and Tomáš Mikolov raise 150 million for their AI startup; Ivo Lukačovič is among the investors. The Czech startup BottleCap AI wants to make training AI models more efficient. It has new funding for it and is launching its first iOS app.

Czech news picked it up fast: cc.cz/jaroslav-bec...

2 months ago 0 0 1 0
BottleCapAI Announcement, Seed Round, Pulse: Community News (YouTube video by BottleCapAI)

Pulse News, our first app, is live on iOS today.

We also announce our star investors.
See the announcement video for our plans.
If you are an LLM guru in research or product building, join us to make these plans a reality:
youtu.be/cA8-XJeoZAQ?...

2 months ago 3 0 3 0

Thank you to all coauthors for pushing the article to acceptance!

2 months ago 0 0 0 0

In general, using LLM annotations for self-improvement is a great future direction

2 months ago 2 0 1 0

Annotation with LLMs is a pretty under-researched topic.

I would be interested in how it performs on coding tasks, and whether it helps during CoT for coding tasks

2 months ago 0 0 1 0

Well done @zdenekkasner.bsky.social et al!

LLMs as Span Annotators: A Comparative Study of LLMs and Humans has been accepted at multilingual-multicultural-evaluation.github.io 🎉

See paper arxiv.org/abs/2504.08697

2 months ago 8 2 2 1

What do you plan to read on vacation?

2 months ago 0 0 1 0

Dr.LLM: Adaptive Layer Routing for Efficient Inference

Avoiding computation is an obvious way to be more efficient.
The question is how to maintain quality. MoE with a router is an efficient approach, but it suffers from exposure bias and fixed thresholds.

arxiv.org/abs/2510.12773
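The gating idea can be sketched in a few lines. This is a toy numpy sketch, not the paper's method: `route_layers`, the linear routers, and the hard-coded 0.5 threshold are all illustrative, and that fixed threshold is exactly the weakness mentioned above that trained routers must address.

```python
import numpy as np

def route_layers(hidden, layers, routers):
    """Toy per-layer routing: run a layer only if its router says so."""
    executed = []
    for i, (layer, router) in enumerate(zip(layers, routers)):
        # A tiny linear router scores the current hidden state; a sigmoid
        # gate with a FIXED 0.5 threshold decides skip vs execute.
        score = 1.0 / (1.0 + np.exp(-(hidden @ router).mean()))
        if score > 0.5:
            hidden = layer(hidden)
            executed.append(i)
    return hidden, executed
```

Skipped layers cost nothing at inference time; the whole design question is making the gate's decisions track output quality.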

2 months ago 2 0 3 0

So what is it?

2 months ago 0 0 1 0

What do you want to do with the deeper layers?

2 months ago 1 0 0 0

Do you think the future is MoE or dense models?

2 months ago 0 0 1 0

Qwen3 Embedder is amazing,
but it is tricky to use:
left padding confused a lot of early adopters.

However, it is clearly the future: reusing a pre-trained LLM for everything,
including vector search, where BERT-like models used to rule.

1/N

arxiv.org/abs/2506.05176
huggingface.co/Qwen/Qwen3-E...
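Why left padding trips people up: with last-token pooling, `hidden[:, -1]` is only a real token if the batch is left-padded; with right padding it is a pad token and you silently pool garbage. A minimal numpy sketch of the padding-aware pooling (the function name is mine, not the official API):

```python
import numpy as np

def last_token_pool(hidden, attention_mask):
    # hidden: (batch, seq, dim); attention_mask: (batch, seq) of 0/1.
    # With LEFT padding every sequence ends at the final position, so
    # pooling is simply hidden[:, -1]. With RIGHT padding we must look
    # up each sequence's true last token instead.
    if attention_mask[:, -1].all():            # left-padded batch
        return hidden[:, -1]
    last = attention_mask.sum(axis=1) - 1      # index of last real token
    return hidden[np.arange(hidden.shape[0]), last]
```

Setting the tokenizer to left padding makes the fast path valid for every sequence in the batch.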

2 months ago 1 0 5 0

So would you like to prune them?

2 months ago 0 0 0 0
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Diffusion Large Language Models (dLLMs) break the rigid left-to-right constraint of traditional LLMs, enabling token generation in arbitrary orders. Intuitively, this flexibility implies a solution sp...

The no-free-lunch idea is here again.

I saw it in TTS models, where Guided Attention worked very well. The idea is simple: for speech conversion, guide the attention to focus on a narrow context. The model learns faster.

Looking forward to a deep dive on this one; I expect similarities
arxiv.org/abs/2601.15165
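For reference, the guided-attention trick amounts to a soft diagonal penalty on the attention map: mass far from the diagonal is discouraged, so the model settles into narrow, monotonic alignments faster. A numpy sketch of the DC-TTS-style penalty (the width `g` is a hyperparameter; names are mine):

```python
import numpy as np

def guided_attention_penalty(attn, g=0.2):
    # attn: (T_out, T_in) attention weights. Weight each cell by its
    # normalized distance from the diagonal and average; g controls
    # how wide the allowed band around the diagonal is.
    T, N = attn.shape
    t = np.arange(T)[:, None] / T
    n = np.arange(N)[None, :] / N
    w = 1.0 - np.exp(-((n - t) ** 2) / (2.0 * g * g))
    return float((attn * w).mean())
```

A perfectly diagonal alignment pays no penalty, while off-diagonal attention is taxed in proportion to how far it strays.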

2 months ago 1 0 1 0

It can be used in RAG, so it makes a lot of sense for fetching relevant documents in clawbot

2 months ago 0 0 0 0

Last-token pooling compresses the embeddings too much.

However, it is an engineering marvel.
Tricks like the spherical averaging of the last checkpoints are awesome.

2 months ago 0 0 0 0
👀 WE ARE LOOKING FOR YOU! (yes, you!) 🚨NoCap test deadline 11/11/2025 🏆 $3K / $2K / $1K prizes Our new team members shared tips&tricks which helped them to go through our test! TRY IT NOW! 👉 https:/...

The video is on LinkedIn. Check it out, it turned out fun: www.linkedin.com/posts/bottle...

5 months ago 0 0 0 0

👀 WE ARE LOOKING FOR YOU! (yes, you!)
🚨NoCap test deadline 11/11/2025
🏆 $3K / $2K / $1K prizes

Our new team members shared tips & tricks that helped them get through our test!
TRY IT NOW! 👉 github.com/BottleCapAI/...

5 months ago 0 0 1 0
Czech President Delivers Powerful Speech at UN, Slams Russia, Iran, Israel Aggression | AC1G (YouTube video by DRM News)

The right words at the right time with the right style for the venue. Proud of our president 🇨🇿
youtu.be/d3wT84egi-g?...

6 months ago 2 0 0 0

Me: Scaling algorithms is just much more fun than burning electricity on more and more GPUs.

6 months ago 0 0 0 0
GitHub - BottleCapAI/NoCap-Test: Open Test for BottleCapAI. Contribute to BottleCapAI/NoCap-Test development by creating an account on GitHub.

Official BottleCapAI: we believe AI shouldn’t cost tens of millions to train.
We are now opening a challenge for those who want to help with that!

🏆 $3K / $2K / $1K prizes
⏰ Deadline: 11/11/2025
🖥️ 1 GPU. Your ideas.

👉 Join the NoCap Test:

github.com/BottleCapAI/...

6 months ago 1 0 1 0
Terminology Translation Task

📣Take part in 3rd Terminology shared task @WMT!📣
This year:
👉5 language pairs: EN->{ES, RU, DE, ZH},
👉2 tracks - sentence-level and doc-level translation,
👉authentic data from 2 domains: finance and IT!

www2.statmt.org/wmt25/termin...

Don't miss the opportunity - we only do it once every two years😏

10 months ago 3 2 0 2
Algorithmic Simplicity This is an educational channel for all things algorithmic, including but not limited to computer science, machine learning, physics, and mathematics.

I just found an awesome channel on YouTube.
It has 77k subscribers with just 7 videos.
Because the videos are just awesome!

I wish I could explain stuff that simply!

Does anybody know how long it takes to prepare such a video?

www.youtube.com/@algorithmic...

11 months ago 0 0 0 0

Thanks to all my awesome collaborators!

👉️ Vilém @zouharvi.bsky.social
👉️ Patrícia @patuchen.bsky.social
👉️ Ivan @ivankartac.bsky.social
👉️ Kristýna Onderková
👉️ Ondřej P. @oplatek.bsky.social
👉️ Dimitra @dimitrag.bsky.social
👉️ Saad @saad.me.uk
👉️ Ondřej D. @tuetschek.bsky.social
👉️ Simone Balloccu

1 year ago 5 3 1 0
Large Language Models as Span Annotators Website for the paper Large Language Models as Span Annotators

How do LLMs compare to human crowdworkers in annotating text spans? 🧑🤖

And how can span annotation help us with evaluating texts?

Find out in our new paper: llm-span-annotators.github.io

Arxiv: arxiv.org/abs/2504.08697

1 year ago 20 7 1 2
A Czech to lead European AI research - 3 February 05:59 - Studio 6 | Czech Television, Jan Hajič

We've been making the media rounds!
👉📺 @hajicjan.bsky.social talked about the new OpenEuroLLM project on Czech TV's Studio 6 www.ceskatelevize.cz/porady/10969...
👉📻 @tuetschek.bsky.social discussed #LLMs on Czech Radio radiozurnal.rozhlas.cz/proc-umela-i...

1 year ago 7 3 0 1