Stuart Gray (@sgray) Bsky

Agreed.

It’s also interesting to note the similarity of the underlying thinking behind this appointment with Labour’s approach to handling the electoral threat from Reform - both seem to take the same fundamental approach, and both seem equally wrong for the same reasons.

4 hours ago 6 0 0 0

...significant point from Robbins - he says, in his view, the vetting process is SUPPOSED to throw up potential problems precisely so they can be mitigated and managed, not to be a "pass/fail piety test"...

4 hours ago 13 2 3 0

This is the key contradiction in the current drama.

Until now, DV has been presented as a binary yes/no process by the media and politicians.

OR is claiming it’s a risk management process - which is pretty common in large orgs.

Doesn’t mean someone couldn’t ultimately fail, but it’s nuanced not binary.

4 hours ago 0 0 0 0

The more interesting but less newsworthy aspect (because it doesn’t make for a clickbait headline) seems to be that OR is claiming vetting isn’t a binary yes/no process, but a nuanced risk management one.

This is the contradiction with Starmer’s stance, and what could undermine him.

5 hours ago 1 0 0 0

The most basic fundamentals of cognitive processing.

Humans observe events, narrate stories to explain them, and assign emotions to those stories.

This helps illustrate how influencers can emotionally manipulate outcomes with biased or false narratives around events.

1 day ago 68 15 2 0

A total ban on phones in schools is a completely batshit, over-the-top policy which shows, yet again, how out of touch Labour is with the reality of many people's lives. This will be totally unworkable for many people, including young carers as mentioned below. #r4today
bsky.app/profile/lore...

9 hours ago 183 35 10 1

I don’t think it’s a profits push; Anthropic themselves have admitted that they’re “compute constrained”.

There are half a dozen large corps competing for a fixed, limited production capacity.

With growing AI use, the only alternative is restricting or limiting demand, and pricing is part of that.

7 hours ago 2 0 0 0

There were screenshots being shared last week suggesting Anthropic were A/B testing 25% increases to paid plans.

20 hours ago 2 0 1 0

Welp, can't say I didn't see it coming: they cut everyone's salaries about a month ago due to limited runway, and now the company I was working for has laid me off.

Need full time TypeScript and/or Solidity work ASAP

20 hours ago 132 43 5 0

OMFG I had to go and check that this ACTUALLY happened.

Since when does the BBC ever do chyrons with a political party's branding, rather than their own? Not to mention this is during a pre-election campaign purdah.

(h/t @iainsol.bsky.social)

21 hours ago 2666 1219 340 347

Yeah, as an Introvert, this is one of the impacts of AI I fear the most.

Early signs point to it being a boon for Extroverts - their personal introvert in a box.

Relationships, networking, marketing, and service-based roles all feel like they’re going to be the price of entry to an AI heavy world :/

1 day ago 6 0 1 0

Thanks. I just tried again, but this time in private mode, and the link worked fine (in Opera mobile).

I can only think maybe my cookie store is corrupt somehow.

Had me worried though!

1 day ago 1 0 0 0

Ok, this is really odd.

Either something has recently changed on the BBC news website, or I’ve not been paying attention.

It looks like all sports articles that are older than 24 hours require you to be signed in to a BBC account.

I guess I won’t be browsing old BBC articles anymore.

1 day ago 1 0 1 0

Ok, this is bizarre.

As a UK resident TV licence payer, why do I have to sign in to a BBC account to read this article?

The BBC have been pushing this for a while now, but this is the *first* article I’ve seen where an account is *mandatory*.

WTF!?!

1 day ago 1 0 1 0

Interesting.

I can’t say I’d heard of the term “glass palace” in this context before.

That said, one place I worked at years ago had a setup like this and the room was unofficially referred to as “the fish bowl” by staff.

1 day ago 0 0 0 0

He’s not wrong. This is insane.

At the bare minimum, I expect a control group to exist that follows the same process but with a calculator, and ideally another with an expert human assistant - both of which are withdrawn halfway through.

So much poor quality research grabbing headlines :/

1 day ago 6 0 0 0

Absolutely incredible.

NASA Astronaut Reid Wiseman, who commanded Artemis II, took this footage from the far side of the Moon with his iPhone.

Watch with sound on.

1 day ago 15655 4668 300 548

For comparison, the full 4.6 system prompt is ~270k in size.

The base tooling takes up roughly 50% of that, but the exact size depends on your settings, configuration, and plugins so it can be much bigger.

The behaviour section, which is the part Anthropic publishes, is ~20k.

1 day ago 5 0 0 1

Careful.

Although it’s good & useful that Anthropic publish part of their system prompt, it’s only a small part (less than 10%).

The whole prompt is huge, especially if you include tool definitions.

There are a few repos that track leaked versions of the full prompt, e.g.

github.com/asgeirtj/sys...

2 days ago 6 0 1 0

Wow, Bungle’s really hit hard times, so sad.

2 days ago 0 0 0 0

Amazon is discontinuing Kindle for PC on June 30th - Good e-Reader

Amazon is shuttering Kindle for PC in June and releasing a new standalone app only compatible with Windows 11.

I was kinda on the fence about staying with Amazon Kindle after they dropped support for my e-reader (for DRM reasons), but this kind of seals it.

Also, to prevent DRM workarounds, they’re replacing the PC Kindle app with a new Windows 11-only app.

goodereader.com/blog/kindle/...

3 days ago 0 0 0 0

max slinger reply

How is the “Max Slinger” account still not banned? This is an AI agent account that isn’t properly labeled as a bot AND it’s replying to people unsolicited constantly as well as follow farming. It’s breaking so many community guidelines and needs to be taken down.

4 days ago 102 9 8 3

I read that they set the default thinking for 4.7 to high.

However, benchmarks show that both low & medium thinking use fewer tokens on average than 4.6, while also giving better responses.

4.7 low ~= 4.6 medium, and 4.7 medium ~= 4.6 high in capability, with slightly lower token usage.

4 days ago 2 0 1 0

Monty Python: Oh Lord! You Are So Big! YouTube video by 247adam

m.youtube.com/watch?v=fINh...

4 days ago 0 0 0 0

Monty Python: "Let us praise God. Oh Lord, oooh you are so big. So absolutely huge. Gosh, we’re all really impressed down here I can tell you. Forgive us, O Lord, for this dreadful toadying and barefaced flattery. But you are so strong and, well, just so super. Fantastic. Amen."

4 days ago 5 0 1 0

I wonder how much of this is due to the system prompt?

4 days ago 0 0 0 0

Starmer told the companies that “things can’t go on like this” and warned they were “putting our children at risk”, in remarks at the start of the meeting. Kendall pointed to porn sites such as OnlyFans as an example of how age checks could be used robustly and urged the social media industry to follow their example, according to multiple people familiar with the meeting.

While, on one hand, I love adult being recognized for its best-in-class verification procedures, requiring that everyone have their government ID and biometrics linked to a social media account is terrifying.

That's the end of an anonymous internet, and a threat to political dissent.

4 days ago 68 20 1 3

For those unaware, the current ongoing Bluesky issues appear to be caused by a large scale DDoS attack against key network points.

That’s why it’s taking a while to resolve, and it’s unclear when it’ll be resolved at the moment.

4 days ago 2 0 0 0

Bluesky has activated the Emergency Pervert Filter, which makes the site basically unusable for known perverts. This filter was activated hours ago. Anyone struggling to use the site should notify their local police. Thank you!

4 days ago 4071 518 107 63

This type of change is inevitable. All commercially hosted model versions get discontinued eventually.

This scenario is one of the better ones - the same family, with more capability.

Maybe you should create some evals to help surface any qualia differences between models in areas of importance?

4 days ago 1 0 0 0
