Anthropic appear to be experimenting with Claude Code pricing. I expect more of this, not just from Anthropic, as investors seek returns. “Strategically, should I be taking a bet on Claude Code if I know that they might 5x the minimum price of the product?” - a useful question to ask.
Posts by Doug Clow
Their defence team should argue for the case to be summarily dismissed on the longstanding legal principle of 'de minimis non curat lex': the law does not concern itself with trifles.
What economists do with charts is genuinely shocking and offensive. Not just having their axes the wrong way round, there's calling things 'curves' when it's a load-bearing assumption that it's a linear relationship, and using 'elasticity' loosely to mean 'coefficient' when it's the other way round.
I just found my random five-dept hospital again and regenerated it a few times, and to my amazement came up with one with zero incidences. Back of the envelope (0.95^100) says that should only occur <0.6% of the time but I doubt I've done more than a few dozen. Randomness is very, very strange.
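For anyone who wants to check the back-of-envelope figure, here is a minimal Python sketch. It assumes each of the 100 people (five depts of 20) independently has a 1-in-20 chance of being on performance management. The function name and the guess of 36 regenerations for "a few dozen" are illustrative assumptions, not from the original model.

```python
# Chance that nobody out of 100 people is on performance management,
# assuming each person independently has a 1-in-20 (5%) chance.
p_zero = 0.95 ** 100


def chance_of_at_least_one_zero(tries: int) -> float:
    """Chance of seeing at least one all-zero hospital in `tries` regenerations."""
    return 1 - (1 - p_zero) ** tries


print(f"Single draw with zero cases: {p_zero:.2%}")

# "A few dozen" regenerations is taken to be 36 here (an assumption).
print(f"At least one in 36 tries: {chance_of_at_least_one_zero(36):.1%}")
```

On those assumptions a single all-zero draw is rare (under 0.6%), but across a few dozen regenerations the chance of hitting at least one is roughly one in five, so it is less astonishing than it feels.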
This is a fun romp through the utter confusion that is sub-national government in England just to answer a seemingly simple question. In terms of tracking data and trends, recent governments’ piecemeal changes make it worse, because they are adding to the long list of dates things changed.
How have they got to this? The US military used to pride itself on its ability to get plenty of food to its people wherever they were on the planet. Big part of what makes them so threatening! Hegseth focus on “lethality” rather than logistics and sustainment?
I recommend the @3blue1brown.com videos on this: www.youtube.com/watch?v=wjZo... and www.youtube.com/watch?v=eMlx...
I don't generally like videos but these helped me see the point. Like, as lifelong ale drinker, finally appreciating a cold lager when I had one after struggling up a warm mountain.
Also agree usage patterns are changing and people are using more tokens. One big structural shift over the last year or two is the move to spending a boatload of compute on inference, which was a transformative win in terms of output quality but obviously has unattractive scaling properties.
Yeah the numbers are opaque and clearly hard to pin down but it has to be about a tenth of that around now. That's at the moment, of course they have growth plans which are ... ambitious, and they aren't expecting profitability until 2029/2030.
I think I would quibble with your vg ratings there. All three of them were capable of extraordinary mental rigour and thoroughness, and (yet) I disagree with many of their positions. With Wittgenstein you can even get him to do the work for you with early vs late.
At the moment I lean towards it turning out like image generators messing up fingers: it was a really difficult problem early on ... and now it's pretty much solved. But that is not a strong view, and one thing I have learned about LLMs is to be very cautious about my predictions of capabilities.
My worry since the launch of ChatGPT has been that bad things happen when the price of generating custom plausible bullshit falls to near zero. In coding, we have an unprecedented boom in technical debt. The code quality collapse could turn out to be anywhere between a trivial roadbump and a massive tarpit.
Yeah, obviously stonks, but that (imperfect) study finding that coders thought they were faster but were actually slower rhymes with my experiences and haunts me. But the models are ever better, and my vibes (incl coder friends) are that there are indicators of a coding productivity boom more recently.
This is an absolutely fabulous job and I am infuriated that I don’t have the profile they want. @ourworldindata.org is one of the best websites in the world and this is a chance to write for them! Great application process too - not quick but they pay along the way.
I know there are big gaps in healthy life expectancy by wealth but looking at the actual numbers is still shocking. A *20 year* gap in healthy life expectancy between top and bottom deciles!
(Caveat: Not all the causation runs deprivation -> poor health. Poor health also causes deprivation.)
I don’t know why I keep forgetting you were a cox and then being reminded and going “oh of *course*”! Rhythm and predictability is exactly what we were short of. By design - it was an immersive space adventure (Bridge Command) and the staff/cast are excellent at pacing.
I had live experience yesterday of a beautifully steady “On my mark, 3, 2, 1, MARK!” failing to coordinate five people to act within 3s. Desperate “Now! Do it! YES NOW! Go go go!”. In our defence it was all very confusing and exciting. In retrospect we should have spent longer reviewing the plan.
You can sound even more convincingly medical by saying the same thing for toes but in cod Latin. The scientific* names for toes are porcellus fori, porcellus domi, porcellus carnivorus, porcellus non voratus, porcellus plorans domum.
The bullish case is of course that future models will be so much more capable and add so much more value that the eyewatering income growth needed will come easily. Maybe. That would also take us into profound paradigm shift territory, with the possibility of doomy scenarios.
It’s been obvious for a long time that most LLMs currently charge way less than their costs. That is unsustainable so it will not be sustained. To make a return, they will have to make a lot more money. Worth considering before integrating them into long-term plans, and defending against lock-in.
The Right call out for synchronising effort is “Ready … steady … GO!”. Others are ambiguous - including this suggestion (it could also be “Three … two … ONE!”). In practice it’s OK if you tell people the call first (eg “On my mark, 3, 2, 1, MARK!”) but everyone* already understands Ready Steady Go.
Photo of a load of bluebells. One is artily close up in the bottom left thirds point but out of focus, as if the photographer had a better photo in mind but failed. They’re still good bluebells though.
Bluebells! Fantastic stuff. I could smell them before I saw them.
Encouraging to have an MP responding to a consultation on a major infrastructure project, noting that it affects their constituency significantly … and saying we should get on with it! I agree.
No easy answers to what should be monitored centrally, beyond "get the right data to the right people who are empowered to do the right thing at the right time." As a data person I'm always going to love having more data centrally, but that is not the only factor, and it can come with a cost. 5/5
As Goodhart says, "when a measure becomes a target, it ceases to be a good measure". Anything you measure and gather centrally in a corporate context will start to function as a target to people who can affect it. The best middle managers are adept at it. A huge power for good, and for bad. 4/
The senior mgr sees they're on 9%, nearly 2x the 5% benchmark, while their (randomly generated) peer is on 2%. They reward the Outstanding Head of Dept and send them to performance-manage the bad ones. So savvy HoDs make sure never to put more than one person on perf mgt, even if they should be. 3/
Let's say needing to go on perf mgt is random and 1 in 20 people should be. I just modelled five random depts of 20 people. Two of them have 1 employee on performance management, one Outstanding dept has none, and two Needs Improvement depts have 3 and 4 respectively. Purely randomly! 2/
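The kind of model described in that post can be sketched in a few lines of Python. This is a minimal illustration under the stated assumptions (five depts, 20 people each, a 1-in-20 chance per person, all independent), not the code actually used. The function name, parameters and seed are hypothetical.

```python
import random


def simulate_departments(n_depts=5, dept_size=20, p=1 / 20, seed=None):
    """Count people on performance management in each randomly generated dept."""
    rng = random.Random(seed)
    return [
        # Each person independently needs performance management with probability p.
        sum(rng.random() < p for _ in range(dept_size))
        for _ in range(n_depts)
    ]


# Fixing the seed makes a run repeatable; change or drop it to "regenerate".
counts = simulate_departments(seed=1)
print("On perf mgt per dept:", counts)
print("As a share of each dept:", [f"{c / 20:.0%}" for c in counts])
```

Rerunning with different seeds shows how far apart identical departments can land through chance alone.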
This seems very believable to me, and - without knowing much about the inside of the DoH - might even be a wise move. If you make the number of performance management plans a central metric, managers will feel pressure to keep it at the "right" level. That may well not be helpful pressure. 1/
My argument is that <0.5% is within measurement error of 0%, and we have reasons to suspect increasing error, so the claim that people have on average become better off is true on this data, but it's only just better off, and actual could be falling. Also incomes are not growing like they used to.
Last I looked into it (years ago), residual fax use in the NHS was a desperation measure by overstretched, under-resourced staff to get the job done. It's a symptom of systemic underinvestment, not a sign of technical resistance, and if you simply ban it you make the problem worse not better.