

can I code fast? no. but can I code well? also no. but does my code work? alas, no

1 year ago 18296 2131 408 152
Post image

From Szeliski's "Computer Vision: Algorithms and Applications."

1 year ago 26 2 3 0

That account is doing an AMA about research at DeepMind 🤣

1 year ago 0 0 0 0

Trying to build a "books you must read" list for my lab that everyone gets when they join. Right now it's:

- Sutton and Barto
- The Structure of Scientific Revolutions
- Strunk and White
- Maybe "Prediction, Learning, and Games", TBD

Kinda curious what's missing in an RL / science curriculum

1 year ago 141 11 36 1
Post image Post image Post image

Want to learn / teach RL? 

Check out new book draft:
Reinforcement Learning - Foundations
sites.google.com/view/rlfound...
W/ Shie Mannor & Yishay Mansour
This is a rigorous first course in RL, based on our teaching at TAU CS and Technion ECE.

1 year ago 154 35 4 4
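A rigorous first course in RL typically opens with finite MDPs and dynamic programming. As a flavor of that material, here is a minimal value-iteration sketch on a toy two-state, two-action MDP (my own hypothetical example, not drawn from the book):

```python
import numpy as np

# Toy MDP (hypothetical example): P[a][s, s'] = transition probability
# of landing in s' after taking action a in state s; R[s, a] = expected reward.
P = np.array([[[0.9, 0.1],
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9  # discount factor

V = np.zeros(2)
for _ in range(200):
    # Bellman optimality backup: V(s) = max_a [R(s,a) + γ Σ_s' P(s'|s,a) V(s')]
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V = Q.max(axis=1)

print(V)  # → approximately [16.36, 20.0]
```

Because γ < 1, the backup is a contraction, so 200 sweeps bring V within numerical precision of the unique fixed point.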

Big fan of this essay by @abeba.bsky.social

1 year ago 25 3 1 0

+100 to this recommendation

1 year ago 31 2 2 0
Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning These notes are based on a lecture delivered by NC in March 2021, as part of an advanced course at Princeton University on the mathematical understanding of deep learning. They present a theory (devel...

Nadav Cohen and I recently uploaded lecture notes on the theory (and surprising practical applications) of linear neural networks.

Hope it can be useful, especially to those entering the field, as it highlights distinctions between DL and "classical" ML theory

arxiv.org/abs/2408.13767

1 year ago 3 1 0 0
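The central object in those notes, a deep linear network, is easy to sketch: a product of weight matrices is expressively just one linear map, yet gradient descent on the factored form follows different dynamics than on the single matrix. A minimal toy illustration (my own example, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Target linear map and input data
A = rng.normal(size=(3, 3))
X = rng.normal(size=(3, 100))
Y = A @ X

# Depth-2 linear network f(x) = W2 @ W1 @ x: expressively identical to a
# single 3x3 matrix, but trained on the factored parameterization.
W1 = rng.normal(size=(3, 3)) * 0.1
W2 = rng.normal(size=(3, 3)) * 0.1

lr = 1e-2
for _ in range(10_000):
    E = W2 @ W1 @ X - Y                  # residual on the batch
    g2 = E @ (W1 @ X).T / X.shape[1]     # dL/dW2 for L = mean squared error
    g1 = W2.T @ E @ X.T / X.shape[1]     # dL/dW1
    W2 -= lr * g2
    W1 -= lr * g1

# The end-to-end product drifts toward the target map
print(np.linalg.norm(W2 @ W1 - A))
```

Note the characteristic slow start near the small initialization (a saddle of the factored objective) before the error drops, a phenomenon the single-matrix parameterization does not exhibit.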

Your blog makes up for it... Equally balanced 😄😄

1 year ago 1 0 1 0

Games are reasonable stepping stones as testbeds for AI progress. NetHack and text adventure games hit on modern AI weaknesses. I literally just gave a talk on why we should take Dungeons and Dragons and other role-playing games seriously as AI challenges.

1 year ago 71 11 7 1
Book outline

Over the past decade, embeddings — numerical representations of machine learning features used as input to deep learning models — have become a foundational data structure in industrial machine learning systems. TF-IDF, PCA, and one-hot encoding have always been key tools in machine learning systems as ways to compress and make sense of large amounts of textual data. However, traditional approaches were limited in the amount of context they could reason about with increasing amounts of data. As the volume, velocity, and variety of data captured by modern applications has exploded, creating approaches specifically tailored to scale has become increasingly important. Google's Word2Vec paper made an important step in moving from simple statistical representations to semantic meaning of words. The subsequent rise of the Transformer architecture and transfer learning, as well as the latest surge in generative methods, has enabled the growth of embeddings as a foundational machine learning data structure. This survey paper aims to provide a deep dive into what embeddings are, their history, and usage patterns in industry.

Cover image

Just realized BlueSky allows sharing valuable stuff cause it doesn't punish links. 🤩

Let's start with "What are embeddings" by @vickiboykis.com

The book is a great summary of embeddings, from history to modern approaches.

The best part: it's free.

Link: vickiboykis.com/what_are_emb...

1 year ago 651 101 22 6
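To make the abstract's contrast concrete: a sparse one-hot (or TF-IDF) representation treats every word as orthogonal to every other, while a dense embedding table maps words into a low-dimensional space where similarity is measurable. A minimal sketch (random vectors stand in for learned ones such as Word2Vec's):

```python
import numpy as np

vocab = ["cat", "dog", "car", "truck"]
idx = {w: i for i, w in enumerate(vocab)}

# Sparse one-hot: dimension = vocab size, and every pair of distinct
# words has dot product 0 — no notion of similarity.
onehot = np.eye(len(vocab))
print(onehot[idx["cat"]] @ onehot[idx["dog"]])  # → 0.0

# Dense embedding table: each word gets a low-dimensional vector.
# Here the vectors are random; in a trained model, related words
# (cat/dog, car/truck) end up near each other.
dim = 8
rng = np.random.default_rng(0)
emb = rng.normal(size=(len(vocab), dim))

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(emb[idx["cat"]], emb[idx["dog"]]))  # generally nonzero
```

The lookup itself is just indexing into a matrix; the value comes entirely from how that matrix is learned.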

Cohere's studies have shown that LLMs tend to rely on documents that contain procedural knowledge, such as code or mathematical formulas, when performing reasoning tasks.

This suggests that LLMs learn to reason by synthesizing procedural knowledge from examples of similar reasoning processes.

1 year ago 39 3 1 0

New here? Interested in AI/ML? Check out these great starter packs!

AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS

You can also search all starter packs here: blueskydirectory.com/starter-pack...

1 year ago 553 212 66 55

Is there any evidence for "pure" memory of space and time? All spatial and temporal memory tasks I can think of query memory of particular events/objects embedded in space and time. I conjecture that it's impossible for us to recall a location or time in the absence of events or objects.

1 year ago 52 8 10 1
Post image

The Llama 3.2 1B and 3B models are my favorite LLMs -- small but very capable.
If you want to understand what the architectures look like under the hood, I implemented them from scratch (one of the best ways to learn): github.com/rasbt/LLMs-f...

1 year ago 141 16 7 1
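The computational heart of architectures like Llama is scaled dot-product attention. A minimal numpy sketch of that single operation (my own illustration, independent of the linked repo; it omits multi-head projections, causal masking, and RoPE):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (seq, seq): each query scored against each key
    weights = softmax(scores)      # each row is a probability distribution
    return weights @ V             # weighted mix of value vectors per position

rng = np.random.default_rng(0)
seq, d = 4, 8
Q, K, V = (rng.normal(size=(seq, d)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # → (4, 8)
```

Everything else in a transformer block (projections for Q/K/V, multiple heads, the MLP, normalization) wraps around this core.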