Also, attachments don't work
new blog out with first impressions on codex app. gist is there's potential, but coming from the cli you'll need to relearn some things
jkerschner.com/posts/codex-...
wtaf..
Have you tried bringing it a cup of coffee with your prompt?
the answer was not quite, and >90%...times they are a changing
what does it mean when ALL sota models score 95%+ with different cli's? (codex, claude code, etc)
aider seems to be a drag on their performance now, but we're back to the bigger question of can these be 'plugged in' to software workflows and what percentage of code will they be writing by EOY
easily one of my favorite mcp servers! I have even started creating end to end tests as markdown files and claude code will add notes on the results as it runs the tests.
happy for America though <3 🇺🇸
the awkward part of participating in #nokings when you're from sac..
as a proud American I'm disgusted by the cowardice the republican party in congress has shown, it's up to us to remind them who they represent.
republicans have shown they are worse than those they hate, they have:
NO budget
NO beliefs
NO backbone
saturday we remind them we all have #NOKINGS
lots of places across the country this weekend where you can grab a bite, listen to some music, and use your voice with no kings to be found. hope to see you out there!
peaceful protests are a healthy and important part of democracies, it's a way 'we the people' can communicate our needs to our representatives on urgent issues prior to elections being held.
the great thing about having no kings in this country is we don't need to interrupt our lives for an unnecessary military parade. we can spend time with friends, family, others and enjoy our weekend.
also worth noting, i don't expect this server to be used for anything, but needed 'something' to throw the 'agent' against. code is likely to be trashed after this evening, but it's slop with a purpose
currently using this to vibe code an mcp server. i'm getting a crash course in javascript tdd and packaging. :)
also going to use pre-commit hooks more in my other projects, i knew it was possible but the reward didn't seem to match the effort of implementing. llms lower that bar, just like for tdd
please help fill in some blanks!
openai: really gets the ergonomics of an ai chat interface (ChatGPT)
anthropic: really gets the ergonomics of ai agents (Claude Code, MCP)
google: really gets the ergonomics of __________
deepseek: really gets the ergonomics of ___________
#ai #llms
using claude with artifacts in the web client, I'm not surprised claude code gets this nickname
@anthropic.com what are you guys feeding him?? this model is cracked.. claude.ai/public/artif...
help! i just asked claude opus to build bloxorz and it's begun optimizing it in a dev loop. it hasn't let me test since version 4...
Another fun fact is the new Gemini 2.5 Flash (non-thinking) preview model performs on par with Claude 3.7 Sonnet (non-thinking) on this benchmark for 1/10th the price...
My experience with memory lines up well with yours. I was pretty protective of the list of 'memories' before the update, but had to turn it off once it started pulling in things from previous conversations. chatgpt seems to be moving away from power users to simplify things for the casual users
Gemini 2.5 Pro is a beast for the price and, given 8 tries per problem, scores 93% on the Aider benchmark. o4-mini (high) was not far behind, but cost about 30% more to score 91%. Sonnet 3.7 without thinking is able to average 87% but cost about $45 per run (more than double the cost of gemini)
that feeling when you can close your 25 tabs/windows because you hit post
thankfully, the people and the USA continue to show they are strong enough to rebuff the would-be dictators and totalitarians. building your communities only strengthens that defense, and it helps remind us that we're all in this together.
Mind blown moment: Apple's unified logging system captures EVERY framework and system component your app touches. 🤯
I had no idea! In my new post, I reveal how to tap into this goldmine of debug context: it's all there waiting for you.
www.fline.dev/better-error...
#Swift #iOSDev #SwiftUI
"it's just statistics and orthogonal projection.." i mumble as the text predictor doing my work continues to confuse me