Okay yeah that’s actually exactly what I meant by contextual haha. Got it though. Cool stuff!
Posts by Steven Fortney
Oh dang that’s a cool trick. Directly constraining the logprobs makes a lot of sense. I get how it works for a true/false binary (set all other tokens = 0) but any idea how it works for json? Is it a dynamic or contextual constraint?
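The binary case described above can be sketched as a logit mask: zero out (in probability space) every token except the allowed ones. For JSON, the allowed set is rebuilt at every decoding step from a state machine over the grammar, so the constraint is dynamic/contextual. A minimal numpy sketch with a made-up vocab (token ids 7 and 12 standing in for "true"/"false" are an assumption for illustration):

```python
import numpy as np

def constrained_sample(logits, allowed_ids, rng):
    """Mask logits so only allowed token ids can be sampled.

    Illustrative sketch only: real constrained-decoding libraries
    rebuild `allowed_ids` each step from a grammar state machine.
    """
    masked = np.full_like(logits, -np.inf)
    masked[allowed_ids] = logits[allowed_ids]
    # Softmax over the masked logits; -inf entries get probability 0.
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

rng = np.random.default_rng(0)
logits = rng.normal(size=50)          # toy vocab of 50 tokens
tok = constrained_sample(logits, [7, 12], rng)
assert tok in (7, 12)                 # only "true"/"false" ids possible
```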
What do you mean by structured gen? Finetune on text/json input/output pairs?
ReAG - Reasoning Augmented Generation
- No chunking, splitting, or vectorizing bs
- Stateless, no vector DBs etc.
- Supports any model (deepseek, o3-mini et al)
- Reasoning traces
- Metadata filtering
- Typescript, Python support
DeepSeek is part of a quant trading firm, which probably operates out of the fanciest office imaginable, but why am I picturing this?
Apparently, you can run DeepSeek-V3 locally, provided that you have 8 M4 Pro 64GB Mac minis.
~5 tok/sec.
I haven’t seen o3 yet & have been critical of benchmarks for AI but they did test against some of the hardest & best
On GPQA, PhDs with access to the internet got 34% outside their specialty, up to 81% inside. o3 is 87%.
On FrontierMath, the best AI went from 2% to 25%
Some other big ones, too
An Alternative to Test-Time Scaling by @kalomaze.bsky.social
Exploring conditional computation and dynamic depth in language models.
rentry.org/conditional_...
Genesis project
A generative physics engine able to generate 4D dynamical worlds powered by a physics simulation platform designed for general-purpose robotics and physical AI applications.
Introducing MASt3R-SLAM, the first real-time monocular dense SLAM with MASt3R as a foundation.
Easy to use like DUSt3R/MASt3R, from an uncalibrated RGB video it recovers accurate, globally consistent poses & a dense map.
With @ericdexheimer.bsky.social* @ajdavison.bsky.social (*Equal Contribution)
They hypothesize that there exist key "forking tokens," such that re-sampling the system at those specific tokens, but not others, leads to very different outcomes.
An example would be that a simple punctuation mark, or just a single token, can prompt an LLM to produce a different response.
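The forking-token idea can be illustrated with a toy stand-in for the model (purely hypothetical, no real LLM involved): re-sample the continuation many times at a given position and count how many distinct outcomes appear. At a forking token the distribution spreads out; elsewhere it collapses to one continuation.

```python
import random
from collections import Counter

def toy_lm(prefix, rng):
    """Hypothetical stand-in for an LLM. After a '?' token the
    continuation is highly stochastic (a 'forking' point);
    otherwise it is deterministic."""
    if prefix and prefix[-1] == "?":
        return rng.choice(["yes", "no", "maybe"])
    return "next"

def resample_at(tokens, k, n=100, seed=0):
    """Re-sample the token following position k many times and
    count the distinct continuations."""
    rng = random.Random(seed)
    return Counter(toy_lm(tokens[: k + 1], rng) for _ in range(n))

seq = ["Is", "it", "raining", "?"]
forked = resample_at(seq, 3)      # prefix ends in '?': several outcomes
stable = resample_at(seq, 1)      # mid-sentence: a single outcome
```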
Meta's SPDL: Faster AI model training with thread-based data loading. This framework-agnostic data loading solution uses multi-threading to achieve high throughput in a regular Python interpreter.
Blog: ai.meta.com/blog/spdl-fa...
Repo: github.com/facebookrese...
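This is not SPDL's actual API, but the underlying thread-pool idea can be sketched in plain Python: I/O and C-level decode work typically release the GIL, so worker threads overlap instead of serializing, even in a regular (GIL-ful) interpreter.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def decode(path):
    """Stand-in for per-sample load/decode work. File reads (and C
    extension decoders) release the GIL, which is why threads help."""
    with open(path, "rb") as f:
        return f.read()

# Create a few dummy "samples" so the sketch is runnable end to end.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(4):
    p = os.path.join(tmpdir, f"sample_{i}.bin")
    with open(p, "wb") as f:
        f.write(bytes([i]) * 8)
    paths.append(p)

# Thread-based pipeline: samples are loaded concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    batch = list(pool.map(decode, paths))
```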
Jane Street, a quant trading firm, has a very good YouTube channel. For comparison, DeepSeek is also a quant trading firm.
They recently published a video on "Building Machine Learning Systems for a Trillion Trillion Floating Point Operations".
Link: www.youtube.com/watch?v=139U...
How are Kernel Smoothing in statistics, Data-Adaptive Filters in image processing, and Attention in Machine Learning related?
My goal is not to argue who should get credit for what, but to show a progression of closely related ideas over time and across neighboring fields.
1/n
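One way to see the connection: a Nadaraya-Watson kernel smoother and softmax attention share the same normalized-weight structure, differing mainly in the choice of kernel (Gaussian distance vs. exponentiated scaled dot product). A small numpy sketch on random data (not from the original thread):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))   # keys / observed inputs
y = rng.normal(size=(10, 3))   # values / observed targets
q = rng.normal(size=(4,))      # query / test point

def nw_smooth(q, x, y, h=1.0):
    """Nadaraya-Watson: weighted average of y_i with weights
    K(q, x_i) / sum_j K(q, x_j), Gaussian kernel."""
    d2 = ((x - q) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2 * h ** 2))
    w /= w.sum()
    return w @ y

def attention(q, k, v):
    """Softmax attention: same normalized-weight form, but the
    'kernel' is an exponentiated (scaled) dot-product similarity."""
    scores = k @ q / np.sqrt(q.shape[0])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ v

smoothed = nw_smooth(q, x, y)     # shape (3,)
attended = attention(q, x, y)     # shape (3,)
```

Both produce a convex combination of the values; data-adaptive filters in image processing fit the same template with pixel-similarity kernels.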
Real footage of a synthetic control model
Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.
America has three functional high-capacity institutions left:
The Federal Reserve
The Southern District of New York
and The Delaware Court of Chancery
1. The conventional explanation for food deserts—that these places are too poor or too rural to generate enough spending on groceries, or too Black to overcome racist corporate redlining — fails to grapple with a key fact: food deserts didn’t used to exist. My new piece in The Atlantic.
Bump
True! I was more trying to point out that ranking on engagement gets you an (essentially) controversy-weighted “popularity” list.
For outbound clicks, choosing links from a whitelist of reputable sources can help the click bait problem but this is definitely not a complete solution.
Absolutely. One of the best things about Twitter in the old days was that it felt like one of the few places you could see truly breaking news.
More ‘silent’ measures (outbound clicks) might be a better measure of popularity than comment counts or even likes.
No concrete answers, but I encourage you to think about the second-order effects of your algo. E.g. if popularity is a function of engagement and engagement is a function of controversy, then your algo at least partially rewards controversy.
My idea for Econ seminars: speakers can go for as long as they want and talk about whatever they want. But we change the norm so that the audience can leave whenever they want and it’s nbd. Let supply and demand for attention determine seminar length/structure, etc.
Being logged into wandb on your phone is a recipe for misery
🌶️(?) take: Agents are somehow hot right now because people realized that LLM output can be interpreted as a DSL which directs side effects in the world (e.g. tool calls) rather than just returning text in a chat/autocomplete sense. What are the open challenges? A 🧵... [1/11]
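A minimal sketch of the "LLM output as a DSL" idea, with a made-up `TOOL name {json args}` format and a toy registry (real function-calling schemas differ, and the model itself is elided here):

```python
import json
import re

# Hypothetical tool registry; names and signatures are invented
# purely for illustration.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def dispatch(llm_output):
    """Treat model output as a tiny DSL: `TOOL name {json args}`
    triggers a side effect (a tool call); anything else is plain
    chat text and is returned unchanged."""
    m = re.match(r"TOOL (\w+) (.*)", llm_output)
    if not m:
        return llm_output
    name, args = m.group(1), json.loads(m.group(2))
    return TOOLS[name](**args)

print(dispatch('TOOL add {"a": 2, "b": 3}'))   # 5
print(dispatch("just a normal reply"))          # echoed as plain text
```

The open challenges start exactly here: validating malformed calls, sandboxing side effects, and deciding when the model should be allowed to act at all.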