winnie the pooh meme with "Spending [xx] years doing a PhD to talk to chatbots" in the top panel, and "Spending [xx] years doing a PhD to probe logits" in the bottom panel
putting together an argument for why scientists & engineers should care about base LLMs, not just the post-trained chatbot kind
12 hours ago
21
3
2
0
for those who celebrate the greatest piece of sports journalism ever made
1 day ago
43
10
4
1
sure, but if the creative reframing was authored by mythos then it still counts!
1 day ago
0
0
0
0
If, like me, you wake up every morning wondering what the Voronoï diagram of arbitrary shapes looks like, here is the answer:
2 days ago
31
4
1
0
Statistics for LLM Evals
A research-backed guide to statistical methods for LLM and AI model evaluations. Learn to compare models, prompts, and agents with confidence intervals, bootstrap methods, and hypothesis tests.
Stats for Evals is now live, and we got a site, too: statsforevals.com. We'll be posting regular investigations across the summer. For now, we're starting with the basics: comparing models and prompts. Also has resources, principles, example code, and guidance for others:
2 days ago
19
2
2
0
very cool! can you click on the tiers to jump to the beginning of that level or nesting?
3 days ago
0
0
1
0
text editor showing a C# file where next to the vertical scrollbar there's a visual depiction of the methods and statements inside methods represented as nested colorful blocks. There's also a Document outline tool window showing the tree of types and members in the file, and method bodies are shown under methods using the same structural representation.
I returned to my structured editor roots a little bit and prototyped a scrollbar that shows the statement structure in method bodies.
Also a document outline tree where statement structure is visible for larger methods. This way you can visually distinguish methods by their "shape".
3 days ago
24
6
1
0
i guess based on (mostly anecdotal) evidence that test-driven dev processes produce more robust results (to 1st degree: forces externalization of inductive priors, exposing over/underdetermination). but open to the idea that such practices are cargo culting in the llm case. or in the human case tbh
4 days ago
3
1
1
0
tests can be both (part of) a specification at the external interface and a mechanism for internal specification of subcomponents. in a perfect world at least the latter are none of my business
4 days ago
4
0
1
0
the key to creative coding: numbers you can scrub through?
5 days ago
28
4
4
3
I have immense respect for the Bluesky staff because they have a great vision for what the future of social media and software development should look like and have to put up with so much shit from the userbase for it
5 days ago
185
16
1
0
I wanted to see what this would physically look like on the desk so I made a quick mock-up in Blender
6 days ago
5238
1577
81
46
goin to the sgai workshop by chance?
6 days ago
1
0
1
0
Incredible footage from a United Airlines flight of the Artemis II launch.
6 days ago
4483
1621
35
141
When you imagine the ideal AI keyboard, the UIs in Star Trek start to make a lot more sense
1 week ago
24
1
0
0
this place continues to fascinate me
1 week ago
12
1
2
0
yeah i'm working on AI right now (APL Integration)
1 week ago
20
2
1
1
hope someone responded with the relevant babbage quote
1 week ago
1
0
0
0
be not afraid of this 5-level nested combination
biblically accurate combinatorics
1 week ago
83
14
5
2
Very amusing vowel space playground
editor.p5js.org/mimoi/full/x...
1 week ago
80
26
6
2
all extant intelligences have constrained perception (visual acuity / attention) and short-term memory (chunks / context window). PL design is essential for developing notations and abstractions which maximize the effectiveness of both
1 week ago
12
1
1
0
i eagerly await your abstract interpretation library and have added three dependents on it to the kanban board
1 week ago
3
0
1
0
earring --dangerously-skip-first-suggestion
1 week ago
35
8
2
0
jj looking like he is gonna bicameralize that mind with his own two hands
1 week ago
28
2
2
0
genie model. u get 3 prompts
1 week ago
4
0
0
0
my top generic ones are 'For You' and 'Quiet Posters', which are basically mutually exclusive. then a long tail of domain specific ones
2 weeks ago
2
0
0
0
my bluesky experience is so much better when i frequently change feeds. but i always forget to do so. luckily on mobile it is very easy to do this accidentally
2 weeks ago
22
1
3
0
accidental catgirls as modern sorry i made cube
2 weeks ago
1
0
0
0