I applied LLMs to query expansion, and we wrote this article:
It seems to work out of the box and generally boosts the performance of embedding models. However, it adds latency. Would be interesting to see more on this.
Article: jina.ai/news/query-e...
Code: github.com/jina-ai/llm-...
Our submission to ECIR 2025 on jina-embeddings-v3 has been accepted!
At the ECIR Industry Day my colleague @str-saba.bsky.social presents how we train the latest version of our text embedding model.
More details on ECIR: ecir2025.eu
More details about the model: arxiv.org/abs/2409.10173
@bowang0911.bsky.social showed me this cool paper that I'd never read before, about Masked Autoencoders (MAE) for images. The idea: an encoder processes only the non-masked patches; a lightweight decoder then takes the encoded visible patches plus mask tokens for the masked positions and regenerates the original image arxiv.org/abs/2111.06377
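The idea fits in a toy numpy sketch. Everything here is a stand-in: random linear maps play the role of the transformer encoder/decoder, patches are flat random vectors, and nothing is trained; only the 75% mask ratio follows the paper's default.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image": 16 patches, each flattened to an 8-dim vector.
patches = rng.normal(size=(16, 8))

# Randomly mask 75% of the patches (the MAE paper's default ratio).
n_masked = 12
perm = rng.permutation(16)
masked_idx, visible_idx = perm[:n_masked], perm[n_masked:]

# "Encoder": a linear projection applied ONLY to the visible patches,
# which is what makes MAE pre-training cheap.
W_enc = rng.normal(size=(8, 4))
visible_emb = patches[visible_idx] @ W_enc          # (4, 4)

# Rebuild the full sequence: encoded visible patches, plus a shared
# mask token at every masked position.
mask_token = np.zeros(4)
full_seq = np.empty((16, 4))
full_seq[visible_idx] = visible_emb
full_seq[masked_idx] = mask_token

# "Decoder": maps the full sequence back to pixel space.
W_dec = rng.normal(size=(4, 8))
reconstruction = full_seq @ W_dec                   # (16, 8)

# MAE's reconstruction loss: MSE on the masked patches only.
loss = np.mean((reconstruction[masked_idx] - patches[masked_idx]) ** 2)
```

In the real model the encoder/decoder are transformers and the mask token is learned, but the data flow is exactly this: encode the visible 25%, splice mask tokens back in, decode, and score reconstruction only where patches were hidden.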
I'm looking for an intern to introduce Sparse Embedding models to Sentence Transformers! If you're passionate about open source, interested in helping practitioners use your tools, and enjoy embedders/retrievers/rerankers, then I'd love to hear from you!
Links with details and to apply in 🧵
It's pretty sad to see the negative sentiment towards Hugging Face on this platform over a dataset posted by one of its employees. I want to write a small piece. 🧵
Hugging Face empowers everyone to use AI to create value and is against the monopolization of AI; above all, it's a hosting platform.
let's embed them!