Posts by Roland Huß

omg, I never could have imagined that I'd ever be scared of something like "auto compaction". It always feels like speed dating (never did it, but that's how I imagine it :)

4 months ago 0 0 0 0

I've fallen in love with Claude Code's CLI UI. Super well done, and worth a try, even if you're not in the vibe-coding business.

9 months ago 0 0 0 0

I totally agree that this kind of 'reframing' of given concepts is dangerous, and I apologize for that 'no one writes em-dashes manually' 🙇

— is the correct form, but harder to type. Maybe that kind of semi-correct sloppiness (to use - instead) makes us human?

1 year ago 0 0 1 0

Of course, I already have you on my exception list :-)

Agreed, not every em dash is a decisive flag, but in combination with other indicators, it's an interesting data point that is often overlooked.

1 year ago 1 0 1 0

The easiest way to detect LLM-generated output: The usage of "—" (U+2014 : EM DASH) instead of "-" (U+002D : HYPHEN-MINUS). Nobody types an em dash manually, but ChatGPT at least LOVES it.

Corollary: Always check your ChatGPT output for hyphens 😜
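
A minimal sketch of that check in Python, assuming plain-text input. The function names are made up for illustration, and a hit is only one data point, not proof that an LLM wrote the text.

```python
# Hypothetical helpers: count the two dash characters named in the post.
EM_DASH = "\u2014"       # U+2014 EM DASH
HYPHEN_MINUS = "\u002d"  # U+002D HYPHEN-MINUS


def dash_counts(text: str) -> dict[str, int]:
    """Return how often each of the two dash characters occurs in text."""
    return {
        "em_dash": text.count(EM_DASH),
        "hyphen_minus": text.count(HYPHEN_MINUS),
    }


def has_em_dash(text: str) -> bool:
    """True if the text contains at least one U+2014 EM DASH."""
    return EM_DASH in text


if __name__ == "__main__":
    sample = "Typed by hand - probably. Generated \u2014 maybe."
    print(dash_counts(sample))
    print(has_em_dash(sample))
```

Writing the characters as escapes keeps the check readable even in fonts where both dashes look alike.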

1 year ago 0 0 2 0
Chili pepper seeds soaked for 24 hours in chamomile tea for improved germination rates

Putting 190 seeds into mini greenhouses

Three mini greenhouses for growing chilli pepper

After a one-year break, we're back for the 2025 chilli pepper season. Let's g(r)o(w)! #chili

1 year ago 3 0 0 0
Christine Lemmer-Webber (@cwebber@social.coop) How Decentralized Is Bluesky Really? https://dustycloud.org/blog/how-decentralized-is-bluesky/ A technical deep-dive, since people have been asking me for my thoughts. I'll expand a bit on some of th...

A great and, imo, quite balanced technical view of Bluesky's decentralization features. It's worth a read, even if it's quite a mouthful --> social.coop/@cwebber/113...

1 year ago 2 0 1 0
Book outline

Over the past decade, embeddings — numerical representations of
machine learning features used as input to deep learning models — have
become a foundational data structure in industrial machine learning
systems. TF-IDF, PCA, and one-hot encoding have always been key tools
in machine learning systems as ways to compress and make sense of
large amounts of textual data. However, traditional approaches were
limited in the amount of context they could reason about with increasing
amounts of data. As the volume, velocity, and variety of data captured
by modern applications has exploded, creating approaches specifically
tailored to scale has become increasingly important.
Google’s Word2Vec paper made an important step in moving from
simple statistical representations to semantic meaning of words. The
subsequent rise of the Transformer architecture and transfer learning, as
well as the latest surge in generative methods has enabled the growth
of embeddings as a foundational machine learning data structure. This
survey paper aims to provide a deep dive into what embeddings are,
their history, and usage patterns in industry.

Cover image

Just realized Bluesky allows sharing valuable stuff because it doesn't punish links. 🤩

Let's start with "What are embeddings" by @vickiboykis.com

The book is a great summary of embeddings, from history to modern approaches.

The best part: it's free.

Link: vickiboykis.com/what_are_emb...

1 year ago 651 101 22 6

RedHat on Bluesky go.bsky.app/Du6L1Ec

1 year ago 1 3 0 0