
Posts by Leo Boytsov

Preview
The First Evolution of Vibe Coding: Engineering Leadership Report 1. Introduction: The Post-Vibe Era "Vibe Coding"—the practice of prioritizing natural...

"Our mandate is clear: Engineering organizations must move from "accepting the vibes" to "verifying the specs." High-scale reliability requires that we treat AI as a generator of proposals, while maintaining human expertise as the final arbiter of situated judgment and architectural integrity." 🟦

1 month ago 1 0 0 0

🧵"LLMs are currently "sophisticated autocomplete" tools, not autonomous engineering partners. The 31.7% failure rate and the 13.5x dependency expansion represent a "hidden tax" that can quickly negate any velocity gains." ↩️

1 month ago 1 1 1 0

6. Lightweight API
Many real workloads don’t need a full distributed system — just a clean, streaming, parallel map loop.
A detailed discussion in the blog post: 🟦https://lnkd.in/ehu8HcBA

1 month ago 1 0 0 0
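For illustration, the target shape of such a loop can be sketched with the standard library alone (this is the general pattern, not mtasklite's actual API):

```python
from concurrent.futures import ThreadPoolExecutor

def enrich(record):
    # Stand-in for real per-item work (model inference, an API call, ...).
    return {"id": record, "value": record * record}

def records():
    yield from range(6)  # input can be any iterable, including a generator

# The whole pipeline is one loop: no cluster, scheduler, or actor to set up.
with ThreadPoolExecutor(max_workers=4) as pool:
    rows = list(pool.map(enrich, records()))

print(rows[0])  # {'id': 0, 'value': 0}
```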

5. Boilerplate and cognitive overhead
Great libraries like Dask and Ray are powerful, but for a simple map pattern they introduce clusters, schedulers, plugins, actors, futures, and daemons (see examples at the end of the post).↩️

1 month ago 1 0 1 0

4. Notebook compatibility
Some parallel tools (e.g., Ray actors, certain ways of using multiprocessing) behave unpredictably in Jupyter/IPython environments.↩️

1 month ago 0 0 1 0

3. Size-aware streaming for progress tracking with fixed-size input/output
Tools that map over fixed-size input arrays often fail to expose an output with a usable len(...). Without a known length, progress bars like tqdm can't estimate the remaining time.↩️

1 month ago 1 0 1 0
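A minimal stdlib sketch of the problem (the tqdm call mentioned in the comment is the usual workaround):

```python
from multiprocessing.dummy import Pool  # thread-backed Pool, same API as processes

def square(x):
    return x * x

data = list(range(100))

with Pool(4) as pool:
    out = pool.imap(square, data)
    # The output iterator has no usable length, even though the input does:
    assert not hasattr(out, "__len__")
    # A progress bar therefore cannot estimate ETA unless the known input
    # size is forwarded explicitly, e.g. tqdm(out, total=len(data)).
    results = list(out)
```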

2. True streaming input/output
Many frameworks materialize outputs in bulk rather than yielding results as soon as they are ready. This breaks bounded-memory pipelines.↩️

1 month ago 0 0 1 0
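The difference can be sketched with the stdlib Pool (thread-backed here for portability):

```python
import time
from multiprocessing.dummy import Pool  # thread-backed Pool, same API as processes

def slow_double(x):
    time.sleep(0.005)  # stand-in for real work
    return 2 * x

with Pool(4) as pool:
    # map() blocks until EVERY result is finished and returns a full list,
    # so memory grows with the size of the whole output.
    bulk = pool.map(slow_double, range(8))

    # imap() yields each result as soon as it is ready (in input order),
    # so a downstream consumer can run with bounded memory.
    streamed = []
    for y in pool.imap(slow_double, range(8)):
        streamed.append(y)

assert bulk == streamed == [2 * x for x in range(8)]
```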

Common Pain Points with Existing Tools. Even for simple parallel loops, many libraries fall short:
1. Stateful workers with flexible initialization. Most map APIs assume stateless functions. When workers must load models or context before processing, solutions become awkward or require elaborate hacks.↩️

1 month ago 0 0 1 0
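The standard workaround is a Pool initializer plus a module-level global, which is exactly the kind of awkwardness described above (a stdlib sketch with a toy stand-in for a model):

```python
from multiprocessing.dummy import Pool  # thread-backed Pool, same API as processes

_model = None  # per-worker state has to live in a module-level global

def init_worker(model_name):
    # Runs once per worker; the "model" is a stand-in for an expensive
    # resource such as an ML model or a database connection.
    global _model
    _model = {"name": model_name, "scale": 10}

def predict(x):
    return _model["scale"] * x  # every task reads the worker-global state

with Pool(4, initializer=init_worker, initargs=("toy-model",)) as pool:
    results = pool.map(predict, range(5))

print(results)  # [0, 10, 20, 30, 40]
```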

In 2024, I finally distilled the pattern into a tiny library: mtasklite.
Below is the motivation, the flaws of existing approaches, and what makes mtasklite unique.↩️

1 month ago 1 0 1 0

Much later, I explored tools like Dask and Ray (see examples at the end of the post) and found that, for this specific pattern, they still required a lot of complexity and infrastructure for something that should be simple: ↩️

1 month ago 0 0 1 0

I later explored pqdm (a parallel tqdm library) to simplify the code (and to visualize progress on long-running loops) when workers were stateless. Those solutions did work, but they always felt incomplete and/or required repeated boilerplate.↩️

1 month ago 0 0 1 0

🧵For the last seven years, I kept re-implementing the same pattern: A parallel map loop that divides the work among several processes or threads. My very first attempts were built on Python’s standard tools, e.g., multiprocessing.map... ↩️

1 month ago 2 1 1 0
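A minimal version of that first attempt (using the thread-backed drop-in for portability):

```python
# The post refers to multiprocessing.Pool.map; multiprocessing.dummy offers
# the same API backed by threads, which also behaves well in notebooks.
from multiprocessing.dummy import Pool

def square(x):
    return x * x

with Pool(4) as pool:
    results = pool.map(square, range(8))  # splits the work across 4 workers

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```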

@srchvrs.bsky.social: "This is too little for pre-training, but pre-training nowadays is probably not a bottleneck. For post-training 16M training samples can meaningfully improve performance on a lot of tasks."

1 month ago 2 1 1 0

It's yet another example of how the rich get richer. Seasoned devs benefit from LLM-assisted coding, but their skills are already developed. So they get the best of both worlds.

2 months ago 2 0 0 0

"Meta is cutting around 600 positions out of the several thousand roles in its Superintelligence Labs, the Facebook owner said on Wednesday as it looks to make its artificial intelligence unit more flexible and responsive."
www.reuters.com/business/met...

5 months ago 3 1 0 0

The paper presents a fascinating (as described by AI 😆) case study of how a seemingly innocuous modification to a neural net can drastically alter its perceived robustness against gradient-based adversarial attacks.
searchivarius.org/blog/curious...

7 months ago 0 0 0 0

🧵I recently finished my nerdiest computer science paper so far and it was accepted by TMLR: A Curious Case of Remarkable Resilience to Gradient Attacks via Fully Convolutional and Differentiable Front End with a Skip Connection. This work was done while I was at Bosch. ↩️

7 months ago 0 0 1 0

PS: Yes, this is a frontier LLM and it still cannot fully replace an editor.⏹️

8 months ago 0 0 0 0

4. The model complains about its own suggestion.
5. Bonus point: of course, oftentimes the complaints are incorrect. If you poke the model further, it will likely accept being wrong, which, in turn, may not mean much, because models are also clearly trained to agree with humans as much as possible.
↩️

8 months ago 0 0 1 0

🧵Hot take: LLMs still fail at basic grammar/style checking. A repeating situation that I encounter:
1. Ask a model about an issue.
2. The model suggests some rewrite for clarity/accuracy. Typically it's actually quite good (but watch for factual errors!).
3. Recheck the text again.
↩️

8 months ago 1 0 1 0
Preview
A Large-Scale Study of Reranker Relevance Feedback at Inference | Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval

Results? If I read the tables correctly, there's only a very modest boost in both recall & NDCG, within 2%. Given that the procedure requires a second retrieval, it does not seem worth the effort.
🟦
dl.acm.org/doi/abs/10.1...

9 months ago 0 0 0 0

PRF was not forgotten in the neural IR era, but how does it really perform? Revanth Gangi Reddy & colleagues ran a rather thorough experiment and published it at SIGIR.
↩️

9 months ago 0 0 1 0
Statistical Source Expansion for Question Answering (CIKM 2011) by Nico Schlaefer et al.

It was doc2query before doc2query and, in fact, it improved the performance (by a few percent) of the IBM Watson QA system that beat human champions in Jeopardy!
↩️
research.ibm.com/publications...

9 months ago 0 0 1 0

I think this is a problem of the completely unsupervised and blind approach to adding terms to the query. If we had some supervision signal to filter out potentially bad terms, this would work better. In fact, a supervised approach was previously used to add terms to documents!
↩️

9 months ago 0 0 1 0

This issue spawned a sub-topic in the IR community devoted to fixing it and to identifying, in advance, the queries where performance degrades substantially. Dozens of approaches were proposed, but I do not think the effort was successful. Why⁉️
↩️

9 months ago 0 0 1 0

PRF tends to improve things on average, but has a rather nasty property of tanking outcomes for some queries rather dramatically: When things go wrong (i.e., unlucky unrelated terms are added to the query), they can go very wrong. ↩️

9 months ago 0 0 1 0
Preview
Leo Boytsov on X: "🧵40 years ago the SMART IR system was released. It introduced a few key concepts including the vector space interpretation of the retrieval process and the relevance feedback algorithm. I also think it was probably the first open source search engine. ↩️" / X

PRF is an old technique introduced 40 years ago in the SMART system (arguably the first open-source IR system). ↩️
x.com/srchvrs/stat...

9 months ago 0 0 1 0

🧵Pseudo-relevance feedback (PRF), also known as blind feedback, is a technique of first retrieving/re-ranking the top-k documents and adding some of their words to the initial query. A second retrieval/ranking stage then uses the updated query. ↩️

9 months ago 1 1 1 0
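A toy sketch of the PRF loop (illustrative overlap scoring, not the actual SMART implementation):

```python
from collections import Counter

# Toy "retrieval": score documents by term overlap with the query.
def retrieve(query_terms, docs, k):
    scored = sorted(docs, key=lambda d: -len(set(d) & set(query_terms)))
    return scored[:k]

def prf_expand(query_terms, docs, k=2, n_new_terms=2):
    # 1) First-pass retrieval of the top-k documents.
    top = retrieve(query_terms, docs, k)
    # 2) Add the most frequent new terms from those documents to the query;
    #    a second retrieval pass would then use the expanded query.
    counts = Counter(t for d in top for t in d if t not in query_terms)
    new_terms = [t for t, _ in counts.most_common(n_new_terms)]
    return query_terms + new_terms

docs = [
    ["smart", "ir", "system", "retrieval"],
    ["vector", "space", "retrieval", "model"],
    ["cooking", "pasta", "recipe"],
]
expanded = prf_expand(["retrieval", "system"], docs)
```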

If you submitted a messy paper, it's pointless to address every little comment and promise to fix it in the final version. 🟦

9 months ago 0 0 0 0

Instead, think hard about the questions you can ask. What is the main misunderstanding? What will you have to do so that a reviewer accepts your work next time? Which concise questions can you ask to avoid misunderstandings in the future? ↩️

9 months ago 0 0 1 0