
Posts by Maria Khalusova

Really looking forward to food-induced coma naps

4 months ago 1 0 0 0

What's going to be different for you in 2026?

4 months ago 0 0 1 0

Just entered my next decade and I think it’ll be the best one yet.

4 months ago 2 0 1 0

Once again, I completely forgot that I have this account. Oops

6 months ago 1 0 0 0

A tiny bit of mirroring? :)

10 months ago 0 0 1 0

PS: That said, I’ll probably still keep an eye on what’s happening and may even share some posts every now and then. I’ve got a lot of thoughts on RAG, data processing, LLMs/VLMs, etc., so I likely won’t disappear fully.

10 months ago 2 0 0 0

The work will still be here when I return. The AI won’t slow down, but also a couple of months won’t make a dent in the field. This moment, however, this chance to be fully present with my family? That’s something I don’t want to miss.

10 months ago 2 0 1 0

And even more grateful to work with a team that’s so supportive. Stepping away from work, especially in a field moving at warp speed, can feel counterintuitive. But for me, it’s a way to reconnect with what matters most.

10 months ago 2 0 1 0

Kids won’t be kids forever, and mine are getting ever so close to becoming teenagers. This is time I know I’ll never get back.
I’m incredibly grateful to be in a place, both professionally and personally, where this is possible.

10 months ago 2 0 1 0

Next week, I’m stepping away for a couple of months to take a sabbatical and spend time with my kids. I’m not burnt out. I’m following my own advice: do the thing you’ll regret not doing when you’re old.

10 months ago 4 0 1 0

Things move fast in AI. Every week brings new models, new capabilities, or new ideas to chase. It’s exciting, but also easy to get swept up in the pace and forget to pause, to touch grass, to zoom out and see the bigger picture.
🧵

10 months ago 2 0 1 0

RAG exists to solve different problems across varied domains. Understand the problem you’re solving and look at your data.

10 months ago 1 0 0 0

Once you have some answers to these, you can get further into the technical weeds and experiment with chunking to find an optimal size.
The bottom line, however, is that there's no universal "best" chunk size.

10 months ago 0 0 1 0
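The experimentation the thread describes can be sketched in a few lines. This is a minimal fixed-size chunker with overlap; the function name and parameters are illustrative, not taken from any particular library, and a real pipeline would likely chunk on semantic boundaries rather than raw characters.

```python
def chunk_text(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    with consecutive chunks sharing `overlap` characters."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Toy document standing in for your own data.
doc = ("RAG quality depends on your data, your use case, "
       "and how much context a typical query needs. ") * 5

# The same document yields very different retrieval units at different sizes,
# which is why the "best" size can only be found by experimenting on your data.
for size in (64, 256):
    print(size, len(chunk_text(doc, chunk_size=size, overlap=16)))
```

Varying `chunk_size` (and `overlap`) against your own documents and queries, then measuring retrieval quality, is the experiment the post is pointing at.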

* How much context do you typically need to retrieve to satisfy a typical query? Simple facts may only require a sentence or two. Creative tasks may require larger context. Analytical queries may need a whole bunch of supporting evidence.

10 months ago 0 0 1 0

They all vary in structure, style, and length.
* What is your use case? Are you trying to answer questions with specific facts? Are you gathering multiple documents to summarize for a report? Do you pull from transcripts and need to preserve speaker attribution?

10 months ago 0 0 1 0

Same goes for chunking. The “best” chunk size depends on a range of factors, and without those, the question is incomplete.
Here are some of the questions to ask instead:
* What does your data look like? Financial statements, technical manuals, customer support transcripts are not the same.

10 months ago 0 0 1 0

Asking “What is the best chunk size for RAG?” without any additional context is like asking, “What’s the best thing to wear?” Wear where? What’s the weather like? What size are you? Are you going to a wedding or hiking a trail? There’s no single answer that works for every situation.

🧵

10 months ago 0 0 1 0

Do the thing that you will regret not doing when you're old.

10 months ago 2 0 0 0

I went to check what new courses deeplearning.ai has, and was pleasantly surprised to see that the short course Marc Sun, Younes Belkada, and I built over a year ago is still featured as one of the Top Rated courses 😍

10 months ago 6 0 0 0

At least I have interrupted your doomscrolling with some cuteness!

11 months ago 2 0 0 0

I'm taking this whole developer-becoming-a-farmer dream way too far, aren't I?

11 months ago 1 0 1 0

If you've been prioritizing urgent work,
make sure to prioritize important work.

11 months ago 2 0 0 0

How anyone can like peanut butter is beyond me.

11 months ago 0 0 1 0

Similar ≠ relevant

11 months ago 1 0 0 0
Level Up Your GenAI Apps: Overview of Advanced RAG Techniques – Unstructured Explore advanced RAG retrieval techniques—including re-ranking, hybrid search, metadata filtering, and Agentic RAG—that go beyond basic vector similarity to deliver more relevant, high-precision resul...

Part 2 is a high-level overview of advanced RAG techniques: unstructured.io/blog/level-u...

11 months ago 1 0 0 0
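The hybrid-search idea mentioned in that overview can be illustrated with a toy sketch: blend a keyword-overlap score with a similarity score over bag-of-words "embeddings", then re-rank. Every function here is a simplified stand-in (a real system would use BM25 and a dense embedding model), not code from the post or from any specific library.

```python
from collections import Counter
from math import sqrt

def keyword_score(query: str, doc: str) -> float:
    """Fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    return len(q_terms & set(doc.lower().split())) / len(q_terms) if q_terms else 0.0

def toy_embedding(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a dense embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_rank(query: str, docs: list[str], alpha: float = 0.5) -> list[str]:
    """Re-rank docs by a weighted blend of vector and keyword scores."""
    q_vec = toy_embedding(query)
    scored = [
        (alpha * cosine(q_vec, toy_embedding(d))
         + (1 - alpha) * keyword_score(query, d), d)
        for d in docs
    ]
    return [d for _, d in sorted(scored, reverse=True)]
```

For example, `hybrid_rank("best chunk size for rag", docs)` pushes documents that match both lexically and "semantically" to the top, which is the intuition behind combining sparse and dense retrieval before a re-ranking stage.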

Nothing starts a Wednesday morning quite like your dog getting sprayed by a skunk 🤢

11 months ago 2 0 1 0

I have some epic plans for this summer, and none of you will be able to guess what they are.

11 months ago 0 0 1 0
Level Up Your GenAI Apps: RAG Beyond the Basics – Unstructured Learn why naive implementations fall short—and how smarter data preprocessing lays the foundation for reliable, high-performance RAG.

I'm starting a series of blog posts on RAG beyond the basic setup. In the first part, we're setting the stage: why naive RAG is not enough, and how a lot of the issues can be traced back to data processing choices.
Part 1: unstructured.io/blog/level-u...

11 months ago 5 0 1 0

What you're not changing, you're choosing.
This is a gentle reminder for the next time you're prioritizing a cool new shiny thing over building the foundation or addressing tech debt.

11 months ago 1 0 0 0

Word of the day seems to be "sycophantic".
Thanks AI community for increasing my vocabulary :)

11 months ago 0 0 0 0