
Posts by Thomas Ahle

The "kombucha girl" meme: the left panel (her frowning) reads "According to the proposition", and the right panel (her interested face) reads "Lemma tell you".

Mathematical writing is my passion.

3 months ago

like this?

5 months ago

Added a new symbols menu - let me know if I missed any of your favourite LaTeX commands!

5 months ago

I can't tell how much interest there is. But messages like this definitely encourage me to continue it!

11 months ago
LaTeX to Image: Effortlessly convert LaTeX math equations into high-quality images (PNG, JPEG, SVG).

I needed an easy way to make high resolution equations to post on Bluesky, so I made this: thomasahle.com/latex2png

1 year ago

> If NATO hadn't been trying to expand there, there would have been no war.

There would.

> If NATO stops trying to expand into Ukraine, the war ends.

It wouldn't.

> If the US stops sending weapons and fomenting anti-Russian sentiment, the war ends.

This war is about territory not sentiment.

1 year ago
Tensorgrad Playground: The Tensor Cookbook is a comprehensive guide to tensors, using the visual language of tensor diagrams. It closely follows the legendary 'Matrix Cookbook' while pr...

You can play around with expectations of higher order Gaussians using the new
tensorcookbook.com/playground

1 year ago

Isserlis' (or Wick's) theorem is one of the strongest tools for handling high-dimensional Gaussians.

Turns out it generalizes to _every distribution_ using cumulant tensors!

That's higher order variance, skewness, kurtosis, etc.
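A quick numerical check of the four-variable case of Isserlis' theorem (the covariance matrix below is an arbitrary choice for illustration, made diagonally dominant so it is positive definite):

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed, diagonally dominant (hence positive definite) covariance.
cov = np.array([
    [2.0, 0.6, 0.4, 0.2],
    [0.6, 1.5, 0.5, 0.3],
    [0.4, 0.5, 1.2, 0.2],
    [0.2, 0.3, 0.2, 1.0],
])
x = rng.multivariate_normal(np.zeros(4), cov, size=1_000_000)

# Wick/Isserlis: E[x1 x2 x3 x4] equals the sum over the 3 pair
# partitions of {1,2,3,4} of products of covariances.
pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
wick = sum(cov[a, b] * cov[c, d] for (a, b), (c, d) in pairings)

empirical = np.prod(x, axis=1).mean()
print(wick, empirical)  # 0.34 vs a Monte Carlo estimate near it
```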

1 year ago

I added a Playground to tensorcookbook.com for when you need that Matrix or Tensor Derivative in a hurry.

Hopefully it can also be a way to help people become familiar with tensor diagrams.

1 year ago

Now we're just waiting for a ZkiT model

1 year ago

Now live in a new Functions chapter in tensorcookbook.com

1 year ago

Some sketches for the next chapter

1 year ago

I added code execution to tensorcookbook.com so you can try tensorgrad's automatic tensor algebra without installing anything.

1 year ago
Professor Rasmus Pagh Receives International Recognition as ACM Fellow: Professor Rasmus Pagh has been named a 2024 Fellow of the Association for Computing Machinery (ACM), a prestigious honor awarded to leading researchers in computing. This recognition highlights his si...

🎉 Congratulations to Rasmus Pagh @rasmuspagh.net, the inventor of Cuckoo Hashing and my PhD advisor, on becoming an ACM Fellow! 🎉
di.ku.dk/english/news...

1 year ago

Tensor Product Attention illustrated with Tensor Diagrams

1 year ago

Neat one-page proof of "Stirling's bound"

(n/e)ⁿ √(2πn) ≤ n! ≤ (n/e)ⁿ (√(2πn) + 1)

Inspired by the discussion on mathoverflow.net/a/458011/5429. Just had to keep hitting it with logarithmic inequalities...
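The bound is easy to sanity-check numerically; here is a sketch in log-space (to avoid overflow for large n), using `math.lgamma` for log n!:

```python
import math

# Check (n/e)^n * sqrt(2*pi*n) <= n! <= (n/e)^n * (sqrt(2*pi*n) + 1),
# entirely in log-space so large n doesn't overflow.
for n in [1, 2, 5, 10, 100, 1000]:
    log_fact = math.lgamma(n + 1)  # log(n!)
    lower = n * (math.log(n) - 1) + 0.5 * math.log(2 * math.pi * n)
    upper = n * (math.log(n) - 1) + math.log(math.sqrt(2 * math.pi * n) + 1)
    assert lower <= log_fact <= upper
    print(n, lower, log_fact, upper)
```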

1 year ago

Yes please!

1 year ago

Poisson Probability Puzzle:

Let X ~ Poisson(μ), Z = (X - μ)/√μ, and Y ~ Normal(0, 1).
How close is E[|Z|^k] to E[|Y|^k]?

Say we connect μ and k by μ = c k³: what is now the limit of E[|Z|^k]/E[|Y|^k] as k → ∞?

This was harder to solve than expected, but the answer was surprisingly pretty 🌻
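A sketch for exploring the puzzle numerically, comparing the normalized Poisson moment E[|Z|^k] (Z = (X - μ)/√μ) against the standard normal absolute moment. The choice c = 1 and the k values are arbitrary; the post doesn't reveal the answer, so no limit is claimed here:

```python
import math

def poisson_abs_moment(mu, k):
    """E[|(X - mu)/sqrt(mu)|^k] for X ~ Poisson(mu), by direct summation."""
    s, sd = 0.0, math.sqrt(mu)
    lo = max(0, int(mu - 40 * sd))  # 40 standard deviations covers the mass
    hi = int(mu + 40 * sd)
    for x in range(lo, hi + 1):
        log_pmf = x * math.log(mu) - mu - math.lgamma(x + 1)
        s += math.exp(log_pmf) * abs((x - mu) / sd) ** k
    return s

def normal_abs_moment(k):
    """E[|Y|^k] for Y ~ N(0,1): 2^(k/2) * Gamma((k+1)/2) / sqrt(pi)."""
    return 2 ** (k / 2) * math.gamma((k + 1) / 2) / math.sqrt(math.pi)

# Couple mu = c * k^3 and watch the ratio as k grows.
c = 1.0
for k in [2, 4, 8, 12]:
    mu = c * k ** 3
    print(k, poisson_abs_moment(mu, k) / normal_abs_moment(k))
```

For k = 2 the ratio is exactly 1, since Var(X) = μ, which makes a handy correctness check.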

1 year ago

"Central Limit Theorem" for the Poisson Distribution

1 year ago
Thomas Dybdahl Ahle: Thomas Dybdahl Ahle is a researcher in the theoretical foundations of machine learning and massive data, including similarity search, high dimensional geometry, kernel methods, sketching and derandomi...

A while ago Twitter removed the option of embedding your timeline on your website. Luckily, with Bluesky, I'm now able to put it back on thomasahle.com. Good to be back.

1 year ago

Can you refer me to the OpenAI forum?

1 year ago
History Heuristic - Chessprogramming wiki

For more information on history heuristics in chess, see www.chessprogramming.org/History_Heur...

1 year ago
History Heuristic - Chessprogramming wiki

near future.

Time will tell if they'll update the entire network, or a smaller LoRA or side network.

Even chatbots like o1 could use TTT as an alternative to in-context learning.

5/5

1 year ago

while searching. If two subtrees are conceptually similar, it has to do all the work twice.

Test Time Training fixes this!
If AlphaZero updated its weights while searching, it could transfer learnings between the subtrees!

I'm sure we'll start seeing a lot of TTT architectures in the near...

4/5

1 year ago

Obviously, having a pretrained cnt[from][to] array wouldn't be helpful at all in chess, as moves may be good or bad depending entirely on the position.

But because the butterfly table is reset at every search, it encodes "local information".

AlphaZero meanwhile doesn't learn anything while...

3/5

1 year ago

Chess engines like Stockfish keep a so-called butterfly board, tracking how often a move was chosen in the search tree. _Independently of the position_.

This data is then consulted elsewhere in the search tree to decide how much time to spend considering the move.

Why do this?

2/5
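A minimal sketch of what such a butterfly/history table might look like. This is hypothetical illustration code, not Stockfish's actual implementation; rewarding a cutoff by depth² is one common convention:

```python
# cnt[from][to] is indexed by the move alone, independently of the
# position, and would be reset at the start of every search.
N_SQUARES = 64

def new_butterfly():
    return [[0] * N_SQUARES for _ in range(N_SQUARES)]

history = new_butterfly()

def record_cutoff(move, depth):
    """Reward a move that caused a beta cutoff; deeper cutoffs count more."""
    frm, to = move
    history[frm][to] += depth * depth

def order_moves(moves):
    """Try historically good moves first, wherever they occur in the tree."""
    return sorted(moves, key=lambda m: history[m[0]][m[1]], reverse=True)

record_cutoff((12, 28), depth=6)  # squares encoded as 0..63
record_cutoff((6, 21), depth=3)
print(order_moves([(6, 21), (12, 28), (0, 1)]))  # → [(12, 28), (6, 21), (0, 1)]
```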

1 year ago

Test Time Training promises to finally unify learning and search. As always, chess is a good place to study such ideas:

AlphaZero generalized and simplified most of the tricks in chess engines like Stockfish, but one category is missing: history heuristics...

1/5

1 year ago

Your o1 supports images?

1 year ago

Making a wiki-style website is a good way to do this, while encouraging others from the community to contribute and keep it updated.

In fact, writing good Wikipedia articles for your field might be the best way to spread this knowledge.

1 year ago

Clever use of the KV-cache: Writing in the Margins (arxiv.org/abs/2408.14906) at NeurIPS next week.

By "taking notes" as you read, you reduce the complexity from N^3 (N tokens at N^2 cost) to N^3/3 (1+4+9+...+N^2).
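The arithmetic behind the factor of 3: summing the per-prefix costs 1² + 2² + ... + N² gives N(N+1)(2N+1)/6, about a third of the naive N³:

```python
# Sanity check: processing the prefix of length i costs about i^2, and
# sum_{i=1}^{N} i^2 = N(N+1)(2N+1)/6, which is roughly N^3 / 3.
N = 1000
total = sum(i * i for i in range(1, N + 1))
assert total == N * (N + 1) * (2 * N + 1) // 6
print(total / N**3)  # ≈ 1/3
```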

1 year ago