*based on the dimensions "human vs. machine" and "realistic vs. comic"
Posts by leonie
I've seen a lot of explanations of similarity measures in vector search, but this one by my colleague
@dadoonet is by far the most fun!
How similar* is Han Solo to:
• Princess Leia: very similar
• Obi-Wan: meh
• Darth Vader: complete opposites
Talk slides: david.pilato.fr/talks/2025/2...
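A minimal sketch of the comparison above, assuming cosine similarity is the measure being illustrated. The 2D coordinates are made up for this example (axis 1: human vs. machine, axis 2: realistic vs. comic) and are not taken from the talk:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, -1 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical coordinates, chosen only to reproduce the three verdicts.
characters = {
    "Han Solo": (1.0, 0.5),
    "Princess Leia": (0.9, 0.6),
    "Obi-Wan": (0.5, -1.0),
    "Darth Vader": (-1.0, -0.5),
}

han = characters["Han Solo"]
for name in ("Princess Leia", "Obi-Wan", "Darth Vader"):
    print(f"{name}: {cosine_similarity(han, characters[name]):.2f}")
```

With these assumed values, the scores land close to 1, around 0, and close to -1, which matches "very similar", "meh", and "complete opposites".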
What's the most underrated embedding technique you've used?
Static embeddings -> speed-improvements
Binary quantization -> storage-reduction
Late interaction -> added granularity
I'm curious about lesser-known approaches that worked surprisingly well.
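To make the storage claim for binary quantization concrete, here is a rough sketch in plain Python. It assumes the simplest scheme (keep one bit per dimension, based on the sign) and a 384-dimensional float32 vector; the helper names and the dimension are assumptions for illustration, not any specific library's API:

```python
import random

def binarize(vector):
    """Keep only the sign of each dimension: 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in vector]

def pack_bits(bits):
    """Pack 0/1 values into bytes, 8 dimensions per byte (zero-padded)."""
    padded = bits + [0] * (-len(bits) % 8)
    return bytes(
        int("".join(map(str, padded[i:i + 8])), 2)
        for i in range(0, len(padded), 8)
    )

def hamming_distance(a, b):
    """Number of differing bits between two packed vectors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

random.seed(0)
dim = 384  # assumed embedding size
vec = [random.gauss(0, 1) for _ in range(dim)]
packed = pack_bits(binarize(vec))

float32_bytes = dim * 4  # 4 bytes per float32 value
print(float32_bytes, len(packed))  # 1536 48
```

Under these assumptions the packed vector is 32x smaller than the float32 original, and comparisons can use Hamming distance instead of floating-point math. The trade-off is lost precision, so results are often rescored with the full vectors.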
Roses are red,
violets are blue,
A good baseline embedding model
is all-MiniLM-L6-v2.
ETA: uses Anthropic’s citations API
Make RAG results more trustworthy with citations.
In his latest recipe, @danman966.bsky.social shows you how you can build a RAG pipeline with citations, using:
- a @weaviate.bsky.social vector database and
- @anthropic.com's Claude 3.5 Sonnet
📌 Code: github.com/weaviate/rec...
Haha, what specialized topics are you planning to catch up on in the field of AI agents?
Normalize not knowing everything in the AI space.
It's evolving fast.
I’m sure your to-do list is growing as fast as mine.
Here are 3 topics I want to catch up on this quarter:
• AI agents
• Fine-tuning embedding models
• Multimodality
• (If time permits: reinforcement learning)
What about you?
I’m trying to wrap my head around multi-agent system architectures.
Here are some patterns I’m seeing so far:
1. Type of collaboration:
Network vs. hierarchical
2. Type of information flow:
Sequential vs. parallel vs. loop
3. Type of functionality:
Routing vs. aggregating
What else?
Some considerations for choosing a vector dimension:
1. Data complexity
2. Task complexity
3. Dataset size
4. Computational constraints
5. Performance requirements
6. Scalability requirements
7. Latency requirements
What else?
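For the dataset size and computational constraints points, a quick back-of-the-envelope estimate can help. This sketch assumes uncompressed float32 vectors and counts only the raw vector data; index overhead, metadata, and replication are ignored, and the function name is hypothetical:

```python
def raw_vector_size_gb(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone, in GB (1 GB = 1e9 bytes)."""
    return num_vectors * dimensions * bytes_per_value / 1e9

# Compare a few common dimensions for one million vectors.
for dims in (384, 768, 1536):
    print(f"{dims} dims: {raw_vector_size_gb(1_000_000, dims):.3f} GB")
```

Doubling the dimension doubles the raw storage, so the choice compounds quickly as the dataset grows.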
#1 Rule of RAG Club: Look at your data.
With the new explorer tool, looking at your data got a lot easier in Weaviate Cloud.
The explorer tool provides a graphical interface to easily:
• Browse collections
• Inspect objects, metadata, and vectors
Check it out now: https://buff.ly/3KWivSF
You can be GPU poor like me and still fine-tune an LLM.
Here’s how you can fine-tune Gemma 2 in a Kaggle notebook on a single T4 GPU:
• @kaggle.com offers 30 hours/week of GPUs for free
• @unsloth.bsky.social uses 60% less memory to fit it on a T4 GPU
🔗Code: https://buff.ly/4apUUG2
Although I know that
Vertical scaling: scaling up (to a more powerful machine)
Horizontal scaling: scaling out (to multiple smaller machines)
I still always have to take a second to think about it.
It’s like the left-right-weakness of system design.
I talk about RAG so much, I could fill a book.
So, we did - and you can download it for free.
Together with my colleagues Mary & Prajjwal, I curated an e-book of the most effective advanced RAG techniques.
Which ones did we miss?
Get it now: weaviate.io/ebooks/advan...
Over the holidays, I learned how to fine-tune an LLM.
Here’s my entry for the latest @kaggle.com comp.
This tutorial covers:
• Fine-tuning Gemma 2
• LoRA fine-tuning with @unsloth.bsky.social on a T4 GPU
• Experiment tracking with @weightsbiases.bsky.social
🔗Code: www.kaggle.com/code/iamleon...
Thanks! Merry Christmas to you, too, Tomaz!
Got myself a little early Christmas present.
Although this book is from 2017, I heard so many good things about it this year.
Can't wait to dig into this over the holidays.
And with that being said, I hope you have some nice and relaxing holidays yourself!
See you in the new year!
To make it a little bit more fun, I’m making some bolder predictions for 2025 this time:
• Video will be an important modality
• Moving from one-shot to agentic to human-in-the-loop
• Fusion of AI and crypto
• Latency and cost per token will drop
What other trends are you observing in the AI space?
It’s time to review the AI space in 2024!
Here’s what I got right (and what I missed) in my 2024 predictions:
✅ Evaluation
❌ Multimodal foundation models
❌ Fine-tuning open-weight models and quantization
❌ AI agents
✅ RAG lives on
❌ Knowledge graphs
medium.com/towards-data...
Hybrid search over Japanese text requires a tokenizer built for Japanese text.
@weaviate.bsky.social lets you choose from three tokenizers.
Here are the pros and cons of each:
weaviate.io/blog/hybrid-...
Struggling to keep up with new RAG variants?
Here’s a cheat sheet of 7 of the most popular RAG architectures.
Which variants did we miss?
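What is hybrid search?
Hybrid search combines dense vectors and sparse vectors to take advantage of the strengths of each search method.
This article explains hybrid search for Japanese text in Weaviate:
- Keyword search using a tokenizer for Japanese text
- Vector search
- Fusion algorithm
Read more here:
https://buff.ly/49yMR9K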
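The fusion step can be sketched with reciprocal rank fusion (RRF), one common way to merge a keyword ranking and a vector ranking. This is an assumption for illustration: the post does not say which fusion algorithm Weaviate applies, and the document IDs and the constant k=60 below are hypothetical:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each document scores sum(1 / (k + rank)) over all lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results from the two search methods, best hit first.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that rank well in both lists rise to the top, while documents found by only one method still make it into the final ranking.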
Yaaaay!
By the way: The starfish on the cover makes a special appearance in the book. Did you spot it?
📌 Link to the book: www.oreilly.com/library/view...
Look what came in the mail today!
This is already the 2nd edition of “Developing apps with GPT-4” by Olivier and Marie-Alice, which I had the pleasure of reviewing.
This edition covers the latest advancements in GPT-4, especially regarding its visual capabilities to build multimodal applications.
Oh, this is so neat. Thanks for sharing. Can’t wait to dig in.
It's been two years since the release of ChatGPT.
What cool use cases using Generative AI have you seen in the wild so far?
Here’s a recipe notebook by Mary on RAG over PDF files using Docling and @weaviate.bsky.social.
github.com/weaviate/rec...
Struggling with RAG over PDF files?
You might want to give Docling a try.
𝗪𝗵𝗮𝘁'𝘀 𝗗𝗼𝗰𝗹𝗶𝗻𝗴?
• Python package by IBM
• Open source (MIT license)
• PDF, DOCX, PPTX → Markdown, JSON
𝗪𝗵𝘆 𝘂𝘀𝗲 𝗗𝗼𝗰𝗹𝗶𝗻𝗴?
• Doesn’t require fancy gear, lots of memory, or cloud services
• Works on regular computers or Google Colab Pro