Posts by Nomadic Warriors for Pritzker

🏔️🐴Oh great Tengri, show your favor to the illustrious Brad Underwood and his mighty band of giant Balkan Janissaries as they face down the dogs of puny Connecticut

2 weeks ago 30 8 1 0

lol

2 weeks ago 53 1 2 0

yeah this is a funny phenomenon you come across; counter-intelligence people really dislike the Israeli government! bring up Pollard to them and watch them spit glass

2 weeks ago 249 36 6 3

I got LLM psychosis from Mr. Chatterbox

3 weeks ago 83 2 2 1
Preview
Mr. Chatterbox, or, The Modern Prometheus A few days ago, I posted about a personal project that I've been working on for the last few weeks: Mr. Chatterbox, a chatbot trained from scratch on Victorian-era literature. I have to admit, I was t...

The Mr. Chatterbox creator has a good explanation of what they were trying to achieve with it here: www.estragon.news/mr-chatterbo...

3 weeks ago 5 1 0 0

Interesting to read about building this small-scale LLM project, and about the fine-tuning and the sycophancy that creeps in even when the model is trained entirely on Victorian-era novels

3 weeks ago 32 7 3 1

Thanks! As the adage goes, you can just do things

3 weeks ago 0 0 0 0

Thank you! Demystification was definitely one of my goals in writing that.

3 weeks ago 1 0 1 0

Thank you! I'm glad you liked it!

3 weeks ago 1 0 0 0

Thank you! Glad you like it!

3 weeks ago 0 0 0 0

Mr. Chatterbox is a really fun project. It's not great to talk to, but it's a fun demo of what you can build using entirely out-of-copyright training data

It's a 2GB nanochat model - I released a new llm-mrchatterbox plugin that can run it on my Mac simonwillison.net/2026/Mar/30/...

3 weeks ago 151 10 9 3

I didn't try that - might be interesting to try!

3 weeks ago 0 0 0 0

Latin, Greek, scripture and mathematics! What more do you need

3 weeks ago 1 0 0 0

Yup! I wrote it up in detail here: bsky.app/profile/noma...

3 weeks ago 2 0 1 0
3 weeks ago 10 1 0 0

This is a great account that really teaches you a lot about how modern LLMs get made

3 weeks ago 19 3 0 0

This is in the best traditions of DH. Its success didn't depend on big compute or extensive tech background, but on good design + 19c vibes. And yet, in fact, figuring out how to fine-tune something like this is not a solved problem, and people w/ more compute hours could learn from the experiment.

3 weeks ago 67 7 3 1

Hell yeah

3 weeks ago 2 0 0 0

I’ve been overwhelmed and thrilled by the response to this! I’ve also gotten a lot of questions asking how I did it. So, this weekend I sat down and wrote a detailed narrative documentation outlining how, exactly, I built Mr. Chatterbox: www.estragon.news/mr-chatterbo...

3 weeks ago 103 19 3 8

You never know what data will be used for!

I uploaded a @britishlibrary.bsky.social dataset to Hugging Face in 2022. IIRC one of my first PRs to an HF repo!

4 years later, someone trains a Victorian chatbot on it

More libraries should be sharing their public domain collections for AI to build on!

3 weeks ago 83 7 7 2

Oh wow, thank you so much! Wouldn't have happened without you.

3 weeks ago 2 0 0 0

Thank you!

3 weeks ago 2 0 0 0

Hahah, I have pretty thick skin. I normally post about politics.

3 weeks ago 1 0 0 0
Preview
Historic LLM Built with nanochat! · karpathy nanochat · Discussion #672 Hello all! Using nanochat, I built a small LLM experiment called Mr. Chatterbox, a chatbot trained entirely on books published during the Victorian era (1837–1899). It was trained on a subset of th...

Hey, I built this! It’s both pre-trained and instruction tuned on corpus data. Documentation coming - I wasn’t expecting this to blow up lol. A little more info here: github.com/karpathy/nan...

3 weeks ago 17 0 2 0

I built this! It’s both pre-trained and instruction tuned on corpus data. Documentation coming - I wasn’t expecting this to blow up lol. A little more info here: github.com/karpathy/nan...

3 weeks ago 16 0 3 0

Want to talk to the past? Here's an LLM "trained entirely from scratch on a corpus of over 28,000 Victorian-era British texts published between 1837 & 1899, drawn from a dataset made available by the British Library"

Quite different from an LLM roleplaying a Victorian. huggingface.co/spaces/tvent...

3 weeks ago 513 72 48 72

Guess I found a good use for AI for once. Not enough to make me cease my appeals for Butlerian Jihad, but at least this one's likely to make me laugh a bit.

3 weeks ago 7 2 0 0
Preview
TheBritishLibrary/blbooks · Datasets at Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science.

I used the BL Books dataset, specifically set for material published between 1837 and 1899 with some filtering for messy data. It ended up being something like 28,000 books. Dickens is all in there - no surprise that he sounds quite Dickensian!
huggingface.co/datasets/The...

3 weeks ago 5 0 0 0

Currently attempting to explain the BTS comeback to Mr Chatterbox

3 weeks ago 12 3 1 0

Amazing

3 weeks ago 3 0 0 0