I'll be speaking at the upcoming Voxel51 event in Stuttgart this Tuesday!
I'll talk about the anatomy of AI agents, with a focus on document agents and building good harnesses for them.
Swing by if you're interested, and check the official page for more details: voxel51.com/events/stut...
Posts by Clelia Astra Bertelli
I decided to write a blog post about it, mostly to document the building journey for myself, but also to share my thinking and coding process around sunbears: clelia.dev/blog/2026-0...
Enjoy!
I've been building sunbears, a TypeScript library for CSV data loading, written in Rust.
Explore more:
• Blog: www.llamaindex.ai/blog/parseb...
• Code: github.com/run-llama/P...
• Dataset: huggingface.co/datasets/ll...
What makes it different?
ParseBench optimizes for semantic correctness, not exact text matching.
That means evaluating whether parsed outputs are actually useful for humans and AI agents making downstream decisions.
It includes:
• 2,000+ human-reviewed enterprise documents
• 167,000 evaluation rules
• Coverage across 5 key areas: tables, charts, content faithfulness, semantic formatting, and visual grounding
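Semantic correctness can be approximated with rules that check whether required values appear in the normalized output, rather than demanding an exact string match. A minimal sketch of the idea (the rule shape and function names here are my own illustration, not ParseBench's actual rule format):

```typescript
// A toy "semantic correctness" rule: the parse passes if every required
// value appears in the output after normalizing case and whitespace,
// regardless of how the table or chart was formatted.
type Rule = { requiredValues: string[] };

function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

function passesRule(parsedOutput: string, rule: Rule): boolean {
  const haystack = normalize(parsedOutput);
  return rule.requiredValues.every((v) => haystack.includes(normalize(v)));
}

// Two parsers emit the same content with different formatting; both pass.
const markdownTable = "| Revenue | Q3 2024 |\n|---|---|\n| $1.2M | up 8% |";
const plainText = "Revenue (Q3 2024): $1.2M, up 8%";
const rule: Rule = { requiredValues: ["Revenue", "Q3 2024", "$1.2M"] };

console.log(passesRule(markdownTable, rule)); // true
console.log(passesRule(plainText, rule));     // true
```

The point of this framing is that a parser isn't penalized for rendering a table as Markdown versus prose, as long as the facts a downstream agent needs are present.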
ParseBench is here!
We've just released ParseBench, an open benchmark + dataset for evaluating document parsing at scale.
sunbears uses DataFrame as its primary data structure: a columnar format with strict typing, null and NaN filtering, and convenient column-to-array transformations.
Get started now: npm install @cle-does-things/sunbears
I'm building sunbears, a CSV data loader library for TypeScript, written in Rust.
In Node, it can read a file with 1,000,000 rows in 0.3s and write the same number of rows in 0.15s, respectively 4x and 2x faster than the `csv` package.
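Throughput claims like these are easy to sanity-check with a small timing harness. A rough sketch using an in-memory CSV and a naive split-based parse as the measured workload (the naive parser is mine, not sunbears or the `csv` package, and it ignores quoting and escaping; a real benchmark should run the actual libraries on the same file):

```typescript
// Generate a CSV in memory, then time a naive parse of it.
// Real comparisons should parse identical input with each library
// under identical conditions and average several runs.
const ROWS = 100_000; // scaled down from 1,000,000 for a quick run

const header = "id,name,value";
const lines: string[] = [header];
for (let i = 0; i < ROWS; i++) {
  lines.push(`${i},row_${i},${(i * 0.5).toFixed(2)}`);
}
const csv = lines.join("\n");

const start = performance.now();
// Naive parse: split into rows and cells (no quote handling!).
const parsed = csv.split("\n").slice(1).map((line) => line.split(","));
const elapsed = performance.now() - start;

console.log(`parsed ${parsed.length} rows in ${elapsed.toFixed(1)} ms`);
```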
Visually rich documents are especially challenging for agents.
Tables, charts, and images often break traditional document pipelines, making complex reasoning difficult.
So we teamed up with LanceDB to build a structure-aware PDF QA pipeline.
Hereโs how it works:
- Parse files and take page-level screenshots with LiteParse, the parser we just open sourced at LlamaIndex
- Chunk and embed text, and store everything (text, image bytes, vector data) in a local LanceDB instance
- Expose text and image retrieval tools to a Claude agent, and let it reason on both
With our eval dataset, the agent got near-perfect scores on most complex QA tasks, showing how a strong parsing foundation and multimodal retrieval can really improve your search.
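The chunk-and-embed step can be as simple as fixed-size windows with overlap, so content near a boundary (say, a table row) keeps some surrounding context in two adjacent chunks. A minimal sketch (the sizes and function name are illustrative, not the pipeline's actual settings):

```typescript
// Split text into fixed-size chunks with overlap so that content near
// chunk boundaries appears in two adjacent chunks.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break;
  }
  return chunks;
}

const doc = "Page 1: revenue table. Page 2: growth chart. Page 3: summary.";
const chunks = chunkText(doc, 30, 10);
console.log(chunks.length); // 3
```

Each chunk (plus its page screenshot) would then be embedded and written to the vector store alongside the raw text and image bytes.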
How can you improve your agentic search pipeline?
I just wrote a blog post in collab with LanceDB to answer exactly that.
TLDR:
Learn how it works in the blog post: auth0.com/blog/securi...
Get started with LlamaParse: cloud.llamaindex.ai/signup
That's why we teamed up with @auth0byokta.bsky.social to build a real-world demo of a secure document processing and retrieval pipeline, powered by fine-grained authorization so only trusted actors can access specific content.
That starts with powerful document processing building blocks like LlamaParse and LlamaExtract, but great agents also need the right access controls, as they should only see the documents they're authorized to use.
At @llamaindex.bsky.social, we're committed to building the most capable document agents.
PS: I'll follow up with a blog post on my experience while creating this library!
For now, sunbears focuses on fast CSV reading, but I'm planning to expand the library further and keep improving performance over time.
Give it a star: github.com/AstraBert/s...
Install with npm install @cle-does-things/sunbears
In benchmarks, sunbears can load a CSV with 1 million rows in about 0.4 seconds, making it roughly 3× faster than csv-parse, although still about 2× slower than Polars in Python.
sunbears converts CSV files into a DataFrame, a tabular data structure with strictly typed columns whose values can be easily extracted as arrays and used with familiar operations like map and filter.
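The column-to-array pattern is easy to picture with a toy columnar structure. A self-contained sketch of the concept (this is my illustration, not sunbears' actual API; class and method names are hypothetical):

```typescript
// A toy columnar DataFrame: each column is a plain array, and nulls/NaNs
// are filtered out when a column is extracted for use with map/filter.
type Cell = number | string | null;

class TinyDataFrame {
  constructor(private columns: Record<string, Cell[]>) {}

  // Extract a column as a plain array, dropping nulls and NaNs.
  column(name: string): Cell[] {
    const col = this.columns[name];
    if (!col) throw new Error(`unknown column: ${name}`);
    return col.filter(
      (v) => v !== null && !(typeof v === "number" && Number.isNaN(v))
    );
  }
}

const df = new TinyDataFrame({
  city: ["Stuttgart", "Berlin", "Munich"],
  temp: [21.5, NaN, 18.0],
});

// Familiar array operations on an extracted column:
const temps = df.column("temp") as number[];
const warm = temps.filter((t) => t > 20).map((t) => `${t} °C`);
console.log(warm); // [ "21.5 °C" ]
```

The columnar layout is what makes this cheap: extracting a column is just handing back (a filtered view of) one array, with no row-by-row restructuring.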
I just published a TypeScript library for loading CSV data, with an API inspired by Pandas and @pola.rs, but fully written in Rust.
Our OSS engineer @cle-does-things.bsky.social recently built litesearch, a fully local document ingestion and retrieval CLI/TUI application powered by LiteParse.
litesearch demonstrates how developers can assemble a high-performance, local-first pipeline using tools from across the ecosystem:
- Parse your unstructured documents with LiteParse, the lightning-fast parser that we just open sourced at @llamaindex.bsky.social
- Chunk with @chonkie.bsky.social
- Embed with a local model through @hf.co transformers.js
- Store embeddings in a local @qdrant.bsky.social edge shard (custom-built in Rust and compiled as a native add-on)
- Retrieve from stored files with (optional) path-based filtering and a relevance threshold
The app runs on @bun.sh, so make sure you have it installed.
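The retrieval step, optional path-based filtering plus a relevance threshold, boils down to cosine similarity over stored vectors. A self-contained sketch of that logic (the in-memory store and function names are mine, standing in for the actual Qdrant-backed shard):

```typescript
// In-memory stand-in for the vector store: each entry keeps the source
// path alongside its embedding.
type Entry = { path: string; text: string; vector: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Retrieve entries scoring above a relevance threshold, optionally
// restricted to paths under a given prefix, best matches first.
function retrieve(
  store: Entry[],
  query: number[],
  threshold: number,
  pathPrefix?: string
): Entry[] {
  return store
    .filter((e) => !pathPrefix || e.path.startsWith(pathPrefix))
    .map((e) => ({ entry: e, score: cosine(e.vector, query) }))
    .filter((s) => s.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .map((s) => s.entry);
}

const store: Entry[] = [
  { path: "docs/report.pdf", text: "quarterly revenue", vector: [1, 0, 0] },
  { path: "docs/notes.md", text: "meeting notes", vector: [0, 1, 0] },
  { path: "img/chart.png", text: "growth chart", vector: [0.9, 0.1, 0] },
];

const hits = retrieve(store, [1, 0, 0], 0.5, "docs/");
console.log(hits.map((h) => h.path)); // [ "docs/report.pdf" ]
```

The threshold keeps low-relevance chunks out of the agent's context, and the path filter scopes a query to one folder of ingested files.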
Hey there, I built litesearch, a fully local document ingestion and retrieval CLI and TUI app, powered by LiteParse.