for better or worse they are the only ones using hypertext correctly (links links links)
amusingly it's a great way of writing for LLMs specifically
Posts by you-have-just-experienced-things
well some institutional investors are only trading bitcoin via ETFs, but the ones transacting cryptocurrency directly do hire for night/weekend shifts
retail traders, I have no idea, I assume their sleep is not great
instead of repeating what other people said in the thread, an anecdote: a former hedge fund coworker did a short stint as a bitcoin day trader (after failing to raise money for an LNG venture, bad timing). he burned out in 3 weeks because he couldn't adjust to the 24/7 trading hours
oops I missed this but yes the opening auction is the answer here
don't really have room for it atm, we do have an exercise bike but it's not quite the same
and yeah the snow/ice is pretty prohibitive, and that's like November-March up here
I think I would enjoy running more if I could do it year round and didn't have to re-form the habit every April
so close
bsky.app/profile/alic...
not my subfield but soliciting DoD grants for formal methods/type theory work was a well known thing at least into the late 00s
well now that we have the phrase "crashing out" we can really condense the summary
I love the ones that sound like usenet flames
bsky.app/profile/bloc...
"there but for the grace of my brain go I"
Claude + line/branch coverage tool + mutation testing tool is a godsend for legacy codebases
I spent in the ballpark of 900M Opus tokens generating tests in the past week, where do I sign up to get blocked
I remember enjoying it, but that was 15 years ago, jeez
this but it's 5D Chess with Multiverse Time Travel
I am once again asking why we never talk about the useful time horizon for price signals. is anyone willing to bite this bullet and argue that insiders unloading minutes before markets resolve is good actually?
cat caught in the act of using my bag as a scratching post
thank you for this, the "climbing towards NLU" paper always frustrated me because it presents a false consensus and doesn't engage with any of the relevant literature, e.g. it outlines the Chinese room argument uncritically and with zero acknowledgement of the subsequent debate. it's maddening
"would you still love me if I were a gwern"
(discourse from 135,000 years ago)
sure, language is useful for:
1) fraud
2) plagiarism
3) cognitive off-loading
which of those use-cases are you promoting?
I'm not sure how to define what exactly it's measuring, but it feels like there is some level of raw capability it needs to get itself unstuck consistently enough for a task like this
fwiw I have claude writing a qemu userspace emulator for [proprietary os] and it's been implementing syscalls for several days with no intervention, making steady progress. this would have been totally impossible before opus 4.5. benchmark is pointing at ~something real, I guess is my point
these tests do not have to be human-readable (although I'm surprised how often they are, even when the code under test is massive files of decades-old fortran)
strongly agree
another thing I'd add is requiring exhaustive characterization testing of new and existing code, and especially legacy code (incl coverage to ensure every branch is hit). this is the only way to safely enable the refactors surfaced by the "how would you do it from scratch" passes
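a minimal sketch of what a characterization test can look like, in Python. everything here is hypothetical: `legacy_price` stands in for whatever legacy code is under test, and `characterize`, `CASES` and the golden-file layout are names made up for illustration. the idea is just to record what the code does today, one input per branch, and fail if a refactor changes any of it:

```python
import json
from pathlib import Path


def legacy_price(qty, unit_cents):
    # Hypothetical stand-in for legacy code whose behavior we want to pin down.
    total = qty * unit_cents
    if qty >= 10:            # bulk discount branch
        total -= total // 10
    if total > 100_000:      # price cap branch
        total = 100_000
    return total


# One input per branch: no discount, discount only, discount plus cap.
CASES = [(1, 500), (10, 500), (1000, 500)]


def characterize(func, cases, golden_path):
    """Record current outputs on the first run; on later runs, return the
    list of (input, recorded, observed) mismatches against that recording."""
    observed = {json.dumps(list(case)): func(*case) for case in cases}
    golden_path = Path(golden_path)
    if not golden_path.exists():
        golden_path.write_text(json.dumps(observed, indent=2, sort_keys=True))
        return []
    golden = json.loads(golden_path.read_text())
    return [
        (key, golden.get(key), value)
        for key, value in observed.items()
        if golden.get(key) != value
    ]
```

the recorded outputs are not a claim about what the code should do, only about what it currently does. pairing this with a branch coverage tool is what tells you whether `CASES` actually hits every branch.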
I'll usually do a few passes with a fresh agent to check if each doc has anything underspecified before moving on to the next step
on the requirements and design, yes, and then if you do those thoroughly the implementation just pops out
this is distinct from plan mode in that the docs are larger and tightly scoped to their purpose (functional requirements vs design) and there's a lot of back-and-forth on specific line items. the acceptance criteria also need to be things it can run programmatically to iterate
for programming I'm all-in on the spec-driven workflow where you and the AI:
- iterate on a requirements doc which includes detailed acceptance criteria
- clear context, iterate on a design doc
- clear context, implement the thing until the criteria are passing
- clear context, review
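a rough sketch of what "acceptance criteria it can run programmatically" might look like, in Python. this is an assumption about one possible shape, not a description of any particular tool: `Criterion`, `CRITERIA`, `failing` and the toy `slugify` function are all hypothetical names. each line item from the requirements doc gets an id and a check, and the implement step iterates until the list of failing ids is empty:

```python
from dataclasses import dataclass
from typing import Callable, List


def slugify(title: str) -> str:
    # Hypothetical implementation under test.
    return "-".join(title.lower().split())


@dataclass
class Criterion:
    id: str                      # line item id from the requirements doc
    text: str                    # human-readable statement of the criterion
    check: Callable[[], bool]    # executable version of the same statement


CRITERIA = [
    Criterion("AC-1", "lowercases the title",
              lambda: slugify("Hello") == "hello"),
    Criterion("AC-2", "joins words with single hyphens",
              lambda: slugify("a  b c") == "a-b-c"),
]


def failing(criteria: List[Criterion]) -> List[str]:
    """Return the ids of criteria that fail or raise."""
    failed = []
    for criterion in criteria:
        try:
            ok = criterion.check()
        except Exception:
            ok = False   # a crash counts as a failed criterion
        if not ok:
            failed.append(criterion.id)
    return failed
```

keeping the id and text next to the check makes the back-and-forth on specific line items easier to trace through to the implementation and review steps.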
in this house we DISAGREE with Suicide LLM
and sometimes also mandarin