That’s my co-founder 😉
Posts by Ian Butler
I'm super excited to announce that Aviso Ventures has backed Bismuth in our mission to transform enterprise software development! Aviso is led by super experienced founders who most recently exited their business, Signal Sciences, to Fastly for $775M.
blog.bismuth.sh/blog/aviso-v...
I genuinely like o3; it slotted right into our system with no issues. It recovers from errors and makes use of our static analysis tools much better. My only issue is that the context window fills up rather rapidly with reasoning tokens.
Hello o3-mini. Interested to see how you play. It's definitely a forced response from OAI, and it clearly demonstrates why competition is good for end users.
DeepSeek R1 is on everyone's minds. Is it the end of GPU demand? Is AI cooked? Spoilers: I don't think so. Actually, it's the best time to be an application builder working with custom models, and because of that, GPU demand is going to skyrocket.
www.linkedin.com/posts/iantbu...
Excited to be in NYC for the next few days.
I’ve been quiet and heads down for a few weeks, and we’ve been super busy.
Yesterday of all days was such a whirlwind for us. I’m super excited for what’s coming next for Bismuth. I can’t say much right now, but: 📈📈📈
It said you had followed me again today and I thought you already were, so yeah, something's up there.
Every time I’m about to post a hot take, I think about the near-zero value prop there is for me, and then I sigh and go back to what I was doing.
The latest discussions about US immigration prove platforms like this don’t breed intellectual discussion, so jumping in would be a waste. Go build.
I lifted a lot of weight this year, volume of 852k
This has been my healthiest year in a long time. Remember to take care of yourself even when you have a lot going on 👍
Screenshot of the HackerOne US leaderboard for Q4 2024, showing XBOW at #11.
While developing XBOW over the past three months, we played around with using it for bug bounties and ended up at #11 in the US on HackerOne:
Landed our first customer this week. Super excited to keep pushing here 😁
Love to see the exploration into MCTS + LLMs. I see this direction as very promising for a whole range of tasks.
It's an interesting quirk [problem to be solved] of developing with LLMs that subtle prompting changes can cause large behavioral changes. For instance, for weeks we had never seen the model attempt to call two tools at once.
Now it does. And that was our bug yesterday :P
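For anyone curious what guarding against that looks like, here's a rough sketch, not Bismuth's actual code. It assumes the model's reply has already been parsed into a list of tool calls; `ToolCall`, `run_tool_calls`, and the toy `add` / `echo` tools are all made-up names for illustration. The point is just to loop over every call in the reply instead of assuming there is exactly one.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToolCall:
    """Hypothetical shape of one parsed tool call from a model reply."""
    call_id: str
    name: str
    arguments: dict[str, Any]


def run_tool_calls(
    calls: list[ToolCall],
    tools: dict[str, Callable[..., Any]],
) -> list[dict[str, Any]]:
    """Run every tool call in the reply, in order, and return one result per call."""
    results = []
    for call in calls:
        handler = tools.get(call.name)
        if handler is None:
            # Unknown tool: report it back rather than crashing the agent loop.
            results.append({"call_id": call.call_id, "ok": False,
                            "output": f"unknown tool: {call.name}"})
            continue
        try:
            output = handler(**call.arguments)
            results.append({"call_id": call.call_id, "ok": True, "output": output})
        except Exception as exc:
            # A failing tool becomes an error result the model can react to.
            results.append({"call_id": call.call_id, "ok": False, "output": str(exc)})
    return results


# Toy tools and a reply that contains two calls at once.
tools = {"add": lambda a, b: a + b, "echo": lambda text: text}
calls = [
    ToolCall("1", "add", {"a": 2, "b": 3}),
    ToolCall("2", "echo", {"text": "hi"}),
]
results = run_tool_calls(calls, tools)
```

Each result keeps its `call_id`, so the outputs can be matched back to the calls that produced them when they go back to the model.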
I think I will probably circle back to these over the weekend after we figure out what is causing Bismuth to explode when we go to run commands a second time.
Nothing immediately jumped out to me in the code, as you'll see if you watch the video, so it got annoying.
Okay, so Day 3 was a partial solve for Bismuth, and Day 4 was also a partial solve. I also introduced a bug into our test running, so we hit a 400 error every time we ran tests after the first, which made this a bit rough.
#AdventOfCode
www.youtube.com/watch?v=czXO...
Will be back tomorrow with more Advent of Code solves with Bismuth! Time got away from me and now it's 11pm
We run with a huge single prompt with like 20 tools and a crap ton of super detailed state information, and it's worked out better than any of the multi-agent interactions we initially built out a few months ago. For us, empirically, a single agent performs much better.
Bismuth made a bunch of changes that include linting, aria attributes, accessibility changes, etc.
The reason I ask is this: I let our tool spin for 20 minutes, but it fixed everything, like everything everything. But I was kind of antsy while it was doing this.
Okay, question: do you think it's better for AI to spend a long time and get everything right when programming, even minutiae like linting, or to leave some stuff messy and let you come back to it yourself quicker?
#programming #ai #devtools
My take is that people are far too focused on building AI coding assistants that take normal people and help them make software. I want to supercharge developers and give professionals ridiculous tools.
Okay, I'm back with Bismuth for #AdventOfCode Day 2 Part 2, another W! Tomorrow I'm going to consolidate to one video, since Bismuth is still crushing these problems with super reasonable code and I think it would be less fatiguing for people to follow along :P
www.youtube.com/watch?v=E8vw...
We're back with Bismuth taking on #AdventOfCode Day 2 Part 1! Watch it handle something unexpected - when the files were in the wrong place, it just... figured out where they should go and fixed it on its own 🤖
Real AI autonomy in action. Full solve below!
www.youtube.com/watch?v=UnMP...
Continuing with Bismuth (an AI coding agent) solving #AdventOfCode problems, this time Day 1 Part 2, another success!
youtu.be/HMXABkHK2YE
This is a critical flaw we have not considered before 😂
We're going to have Bismuth (our coding agent) solve #AdventOfCode problems. If you want to see how coding agents are stacking up against novel puzzles, follow along!
We're going to do two videos a day, part 1 and 2 for each problem. Bismuth crushed part 1 today!
www.youtube.com/watch?v=igbb...
Yes but I am biased ;)
I really can't wait for Starlink to roll out on more airlines. Trying to do anything on a plane continues to be terrible :/
Really excited to spin up Qwen’s reasoning model and see how it does in our agent