Jack Hessel (@jmhessel) Bsky

"The effect of platform on message propagation..." 🙏

6 months ago 1 0 0 0

"more work"

7 months ago 2 0 0 0

A photograph of mount rainier I took from reflection lake.

life update: a few weeks ago, I made the difficult decision to move on from Samaya AI. Thank you to my collaborators for an exciting 2 years!! ❤️ Starting next month, I'll be joining Anthropic. Excited for a new adventure! 🦾

(I'm still based in Seattle 🏔️🌲🏕️; but in SF regularly)

8 months ago 22 0 4 0

It is a major policy failure that the US cannot accommodate top AI conferences due to visa issues.
buff.ly/DRJOGrB

9 months ago 84 21 4 2

bring back 8 page neurips papers

9 months ago 1 0 0 0

m̶e̶n̶ Americans will literally l̶e̶a̶r̶n̶ ̶e̶v̶e̶r̶y̶t̶h̶i̶n̶g̶ ̶a̶b̶o̶u̶t̶ ̶a̶n̶c̶i̶e̶n̶t̶ ̶R̶o̶m̶e̶ invest billions into self driving cars instead of g̶o̶i̶n̶g̶ ̶t̶o̶ ̶t̶h̶e̶r̶a̶p̶y̶ building transit

10 months ago 7 0 0 0

Wulti wodal wodels

10 months ago 8 0 2 0

bring back length limits for author responses

10 months ago 9 0 1 0

What makes multi-agent LLM systems multi-agent? What makes multi-agent LLM systems multi-agent? GitHub Gist: instantly share code, notes, and snippets.

in llm-land, what is a tool, a function, an agent, and (most elusive of all): a "multi-agent system"? (This had been bothering me recently; are all these the same?)

@yoavgo.bsky.social's blog is a clarifying read on the topic -- I plan to adopt his terminology :-)

gist.github.com/yoavg/9142e5...

10 months ago 3 0 0 0

👍

1 year ago 1 0 0 0

If you're in WA and think imposing new taxes on things we want more of (e.g., bikes, transit) is a bad idea, consider contacting your reps using this simple form! <3

1 year ago 5 0 0 0

I pinged editors@ about it, they are working on it

1 year ago 1 0 0 0

Songlin Yang A simple, whitespace theme for academics. Based on [*folio](https://github.com/bogoli/-folio) design.

Should you delete softmax from your attention layers? check out Songling Yang's (sustcsonglin.github.io) tutorial, moderated by @srushnlp.bsky.social, for a beginner-friendly tutorial of the why/how/beauty of linear attention :-) www.youtube.com/watch?v=d0HJ...

1 year ago 4 1 0 0

Journal of Educational Data Mining Large language models (LLMs) are flexible, personalizable, and available, which makes their use within Intelligent Tutoring Systems (ITSs) appealing. However, their flexibility creates risks: inaccura...

I've spent the last two years trying to understand how LLMs might improve middle-school math education. I just published an article in the Journal of Educational Data Mining describing some of that work: "Designing Safe and Relevant Generative Chats for Math Learning in Intelligent Tutoring Systems"

1 year ago 7 2 0 0

How has DeepSeek improved the Transformer architecture? This Gradient Updates issue goes over the major changes that went into DeepSeek’s most recent model.

Very good (technical) explainer answering "How has DeepSeek improved the Transformer architecture?". Aimed at readers already familiar with Transformers.

epoch.ai/gradient-upd...

1 year ago 279 64 6 5

...... I can't decide if this is better or worse than growing alfalfa in the desert

1 year ago 3 0 1 0

Finding linguistic structure in large language models YouTube video by Chris Potts

I've posted the practice run of my LSA keynote. My core claim is that LLMs can be useful tools for doing close linguistic analysis. I illustrate with a detailed case study, drawing on corpus evidence, targeted syntactic evaluations, and causal intervention-based analyses: youtu.be/DBorepHuKDM

1 year ago 74 20 1 3

The GPT-4 barrier was comprehensively broken Some of those GPT-4 models run on my laptop LLM prices crashed, thanks to competition and increased efficiency Multimodal vision is common, audio and video are starting to emerge Voice and live camera mode are science fiction come to life Prompt driven app generation is a commodity already Universal access to the best models lasted for just a few short months “Agents” still haven’t really happened yet Evals really matter Apple Intelligence is bad, Apple’s MLX library is excellent The rise of inference-scaling “reasoning” models Was the best currently available LLM trained in China for less than $6m? The environmental impact got better The environmental impact got much, much worse The year of slop Synthetic training data works great LLMs somehow got even harder to use Knowledge is incredibly unevenly distributed LLMs need better criticism Everything tagged “llms” on my blog in 2024

Here's my end-of-year review of things we learned out about LLMs in 2024 - we learned a LOT of things simonwillison.net/2024/Dec/31/...

Table of contents:

1 year ago 651 149 28 46

It's ready! 💫

A new blog post in which I list of all the tools and apps I've been using for work, plus all my opinions about them.

maria-antoniak.github.io/2024/12/30/o...

Featuring @kagi.com, @warp.dev, @paperpile.bsky.social, @are.na, Fantastical, @obsidian.md, Claude, and more.

1 year ago 215 25 36 4

Did OpenAI Just Solve Abstract Reasoning? OpenAI’s o3 model aces the "Abstraction and Reasoning Corpus" — but what does it mean?

Some of my thoughts on OpenAI's o3 and the ARC-AGI benchmark

aiguide.substack.com/p/did-openai...

1 year ago 339 99 16 26

Sample and verify go brr

1 year ago 6 0 0 0

Check out our new encoder model, ModernBERT! 🤖

Super grateful to have been part of such an awesome team effort and very excited about the gains for retrieval/RAG! 🚀

1 year ago 17 2 1 0

I'm not an """ AGI """ person or anything, but, I do think process reward model RL/scaling inference compute is quite promising for problems with easily verified solutions like (some) math/coding/ARC problems.

1 year ago 4 0 0 0

Announcement #1: our call for papers is up! 🎉
colmweb.org/cfp.html
And excited to announce the COLM 2025 program chairs @yoavartzi.com @eunsol.bsky.social @ranjaykrishna.bsky.social and @adtraghunathan.bsky.social

1 year ago 66 24 0 1

Imo, the reason you don't see more of this is because 1) it's very hard to set up objective, interesting, fair, non-game-able, meaningful, expert-level evals and 2) the incentive for doing this type of careful dataset/environment curation work is not as high as it should be.

1 year ago 3 0 0 0

A picture of a transit sign with 4 minute frequencies

Meanwhile in my neighborhood in Seattle we've been fighting 5 years for (1) bus lane and 30 years for a (1) mile bike path

1 year ago 15 0 0 0

excited to come to #neurips2024 workshops this weekend --- I'll be around sat/sun to say hi to folks :-)

1 year ago 9 0 0 0

🚨 I’m on the academic job market!
j-min.io

I work on ✨Multimodal AI✨, advancing reasoning in understanding & generation by:
1⃣ Making it scalable
2⃣ Making it faithful
3⃣ Evaluating + refining it

Completing my PhD at UNC (w/ @mohitbansal.bsky.social).
Happy to connect (will be at #NeurIPS2024)!

👇🧵

1 year ago 30 10 2 6

“They said it could not be done”. We’re releasing Pleias 1.0, the first suite of models trained on open data (either permissibly licensed or uncopyrighted): Pleias-3b, Pleias-1b and Pleias-350m, all based on the two trillion tokens set from Common Corpus.

1 year ago 248 85 11 19

1 year ago 31 3 4 0

Posts by Jack Hessel