(@jayk56) Bsky - nopzon.com

Also, attachments don't work 😅

2 months ago 0 0 0 0

Codex App - First Impressions | Jay Kerschner

new blog out with first impressions on codex app. gist is there's potential, but coming from the cli you'll need to relearn some things

jkerschner.com/posts/codex-...

2 months ago 0 0 0 1

wtaf..

2 months ago 0 0 0 0

Have you tried bringing it a cup of coffee with your prompt?

3 months ago 0 0 0 0

the answer was not quite, and >90%... times they are a-changin'

3 months ago 0 0 0 0

what does it mean when ALL sota models score 95%+ with different CLIs? (codex, claude code, etc)

aider seems to be a drag on their performance now, but we're back to the bigger question of can these be 'plugged in' to software workflows and what percentage of code will they be writing by EOY

10 months ago 1 1 0 1

easily one of my favorite mcp servers! I have even started creating end to end tests as markdown files and claude code will add notes on the results as it runs the tests.

9 months ago 0 0 0 0

happy for America though <3 🇺🇸

10 months ago 0 0 0 0

the awkward part of participating in #nokings when you're from sac..

10 months ago 0 0 1 0

as a proud American I'm disgusted by the cowardice the republican party in congress has shown, it's up to us to remind them who they represent.

republicans have shown they are worse than those they hate, they have:
NO budget
NO beliefs
NO backbone

saturday we remind them we all have #NOKINGS

10 months ago 0 0 0 0

lots of places across the country this weekend where you can grab a bite, listen to some music, and use your voice with no kings to be found. hope to see you out there!

10 months ago 0 0 0 0

peaceful protests are a healthy and important part of democracies, it's a way 'we the people' can communicate our needs to our representatives on urgent issues prior to elections being held.

10 months ago 0 0 0 0

the great thing about having no kings in this country is we don't need to interrupt our lives for an unnecessary military parade. we can spend time with friends, family, others and enjoy our weekend.

10 months ago 0 0 1 0

also worth noting, i don't expect this server to be used for anything, but needed 'something' to throw the 'agent' against. code is likely to be trashed after this evening, but it's slop with a purpose

10 months ago 0 0 0 0

currently using this to vibe code an mcp server. i'm getting a crash course in javascript tdd and packaging. :)
also going to use pre-commit hooks more in my other projects, i knew it was possible but the reward didn't seem to match the effort of implementing. llms lower that bar, just like for tdd

10 months ago 0 0 1 0

please help fill in some blanks!
openai: really gets the ergonomics of an ai chat interface (ChatGPT)
anthropic: really gets the ergonomics of ai agents (Claude Code, MCP)
google: really gets the ergonomics of __________
deepseek: really gets the ergonomics of ___________
#ai #llms

10 months ago 0 0 0 0

using claude with artifacts in the web client, I'm not surprised claude code gets this nickname

10 months ago 0 0 0 0

@anthropic.com what are you guys feeding him?? this model is cracked.. claude.ai/public/artif...

11 months ago 0 0 0 0

help! i just asked claude opus to build bloxorz and it's begun optimizing it in a dev loop. it hasn't let me test since version 4...

11 months ago 0 0 0 1

Another fun fact is the new Gemini 2.5 Flash (non-thinking) preview model performs on par with Claude 3.7 Sonnet (non-thinking) on this benchmark for 1/10th the price...

11 months ago 0 0 0 0

My experience with memory lines up well with yours. I was pretty protective of the list of 'memories' before the update, but had to turn it off once it started pulling in things from previous conversations. chatgpt seems to be moving away from power users to simplify things for the casual users

11 months ago 2 0 0 0

Gemini 2.5 Pro is a beast for the price and, given 8 tries per problem, scores 93% on the Aider benchmark. o4-mini (high) was not far behind, but cost about 30% more to score 91%. Sonnet 3.7 without thinking is able to average 87% but cost about $45 per run (more than double the cost of gemini)

11 months ago 0 0 0 1

👀

11 months ago 0 0 0 0

Joel Grus – Fizz Buzz in Tensorflow Posts and writings by Joel Grus

joelgrus.com/2016/05/23/f...

11 months ago 0 0 0 0

o4-mini and Gemini 2.5 Pro Can Ace Aider's Polyglot Benchmark | Jay Kerschner

jkerschner.com/posts/aider-...

11 months ago 0 0 0 0

that feeling when you can close your 25 tabs/windows because you hit post 😌

11 months ago 0 0 1 0

thankfully, the people and the USA continue to show they are strong enough to rebuff the would-be dictators and totalitarians. building your communities only strengthens that defense, and it helps remind us that we're all in this together.

11 months ago 0 0 0 0

Mind blown moment: Apple's unified logging system captures EVERY framework and system component your app touches. 🤯

I had no idea! In my new post, I reveal how to tap into this goldmine of debug context – it's all there waiting for you. 👇

www.fline.dev/better-error...
#Swift #iOSDev #SwiftUI

11 months ago 3 1 1 0

"it's just statistics and orthogonal projection.." i mumble as the text predictor doing my work continues to confuse me

11 months ago 0 0 0 0
