Posts by Post45 Data Collective

Please consider voting for us for Best DH Resource 2025!

And while you're at it, cast a vote for Grant Wythoff and @teddyleane.bsky.social's _Time Horizons of Futuristic Fiction_ in the category of Best Dataset!

So many awesome projects—and anyone can vote.

3 weeks ago 6 3 1 0

We are very grateful that J.D. served as one of our faculty respondents at the inaugural Post45 Data Collective Grad Workshop.

It was wonderful to gather with such bright young scholars and to see all the creative research they're doing with, against, and alongside cultural data.

1 month ago 3 1 0 0
Home The Post45 Data Collective peer reviews and houses literary and cultural data from 1945 to the present.

Just got out of the first ever @post45data.bsky.social grad workshop. Fantastic projects drawing and testing the limits of the literary and cultural data sets. You can find the peer-reviewed sets at the link: use them, critique them, submit new ones.
data.post45.org

1 month ago 5 2 0 1

👋

data.post45.org

1 month ago 4 0 0 0

Halfway done teaching Intro to Python for humanities - been great! Anyone have ideas for public datasets focused on culture for students to work on for final projects? @mellymeldubs.bsky.social @laurenfklein.bsky.social @nolauren.bsky.social @dmimno.bsky.social @tedunderwood.com @mariaa.bsky.social

1 month ago 25 8 13 3
Our Tools – Post45 Data Collective

We presented on our tool for enriching and clustering book data at Code4Lib today. Check it out, and let us know what you think!

data.post45.org/our-tools.html

Huge thanks to @thisismattmiller.com for leading development on this project.

#code4lib #c4l26

1 month ago 9 8 0 0
International Bestsellers Dataset Featured on Lit Hub & Substack – Post45 Data Collective James Folta (@jamesfolta.com) and F. Poretti (@fporetti) dive into our International Bestsellers dataset to investigate what the world has been reading and what great books we might have been sleeping...

James Folta (@jamesfolta.com) and F. Poretti (@fporetti on Substack) dive into our International Bestsellers dataset to investigate what the world has been reading and what great books we might have been sleeping on. data.post45.org/news/intl-be...

1 month ago 6 3 0 1

There's so much great (peer-reviewed!) data hanging out at @post45data.bsky.social for people to explore and play with! 100 years of major prizes (and the judges)! Everyone who went to Iowa Writers' Workshop (and who they studied with)! All NEA lit awardees! The Canon of Asian Am Lit! So much more!

1 month ago 22 11 2 3

TFW you discover someone went and did the thing you were struggling to do like 5 years ago. 💖 And it's open data, which is especially remarkable given how much the data broker from France (the only one I was even able to track down) wanted to charge for this kind of data. 😵‍💫

1 month ago 16 7 1 0
Find your next read in this dataset of international bestsellers. In most situations when I say “I need the data,” I’m referring to gossip, and it’s less of a “need” than what some would call a “messy curiosity.” But recently, I came across a Substack post analyz…

Come “poke around” our data “to find new books and authors [you’ve] never heard of!” @jamesfolta.com makes great use of the Int'l Bestsellers dataset built by @sdileonardi.bsky.social, @beccacohen.bsky.social, & @dan-sinnamon.bsky.social on @literaryhub.bsky.social. lithub.com/find-your-ne...

1 month ago 24 10 0 3

it makes me so happy that a column I once managed at Publishing Trends could be a source for such cool data.

2 months ago 10 4 0 0

Stoked to see @jamesfolta.com in @literaryhub.bsky.social talking up a dataset on int'l bestsellers built by @sdileonardi.bsky.social + @beccacohen.bsky.social (I helped). See who the world reads!

One nit: Folta sez "Sinykin is a great follow on Bluesky"—LIES lithub.com/find-your-ne...

2 months ago 18 5 3 3

"The world doesn’t have terrible taste, it seems."
— @jamesfolta.com writing about the @post45data.bsky.social's International Bestsellers data for
@literaryhub.bsky.social!

lithub.com/find-your-ne...

2 months ago 11 7 1 0
Publishing flows (no domestic) A Flourish data visualization by cody

When we published international bestseller data with @post45data.bsky.social all I wanted was to make a Sankey diagram but never managed, now someone has. And it looks beautiful. Check it out!

Source: substack.com/home/post/p-...
public.flourish.studio/visualisatio...

2 months ago 9 4 0 0

Excited to see the dataset being used!!

2 months ago 7 3 0 0
Literary Nationalism Why don't Americans read more European fiction? Why don't Europeans?

Cool new essay that analyzes data from @post45data.bsky.social to argue for the rise of literary nationalism in parallel with political nationalism substack.com/home/post/p-...

2 months ago 23 11 3 1

Used Claude Code to analyze three datasets from @post45data.bsky.social together: NEA literature winners; NYT bestsellers; major prize winners. Made a power list of the 51 writers (out of ~7000) who hit all three open.substack.com/pub/sinykin/...

2 months ago 14 2 0 1
The NEA is a Zombie What Was the NEA?

Started mucking around with data this morning. Made a couple interesting charts from data about all the writers who ever got an NEA grant, with data from Xander Manshel + team. Decided I’d write it up for my first post.
substack.com/home/post/p-...

2 months ago 15 7 2 0
BookReconciler: An Open-Source Tool for Metadata Enrichment and Work-Level Clustering We present BookReconciler, an open-source tool for enhancing and clustering book data. BookReconciler allows users to take spreadsheets with minimal metadata, such as book title and author, and automa...

A hard problem with literary data is navigating btwn editions of books and the "work," or the theoretical text that unites all editions. I've been lucky to work with @thisismattmiller.com and @mellymeldubs.bsky.social, who built a tool to address this + do much more

arxiv.org/abs/2512.10165

4 months ago 64 22 4 1

This project has been supported by the @post45data.bsky.social.

It was initially funded by an NEH grant led by @dan-sinnamon.bsky.social and me. Then our NEH got cancelled. But we persisted!

Matt, Dan, and I have been working on this project for years at this point.

4 months ago 10 2 1 0
Diagram illustrating the BookReconciler workflow. On the left, a book cover of The Book of Salt by Monique Truong appears alongside “Minimal Metadata,” listing Author: Truong, Monique and Title: The Book of Salt. An arrow points to a box labeled “BookReconciler” with book and diamond icons. A downward arrow leads to “Enriched + Clustered Metadata,” showing multiple editions of the book cover and expanded metadata, including several ISBNs, subject headings (e.g., Vietnamese–France fiction, women authors, household employees, gay men, cooking), and an author VIAF identifier.

Very happy to introduce a new tool, BookReconciler!

You can take spreadsheets with book data and add subject headings, descriptions, ISBNs, HathiTrust IDs, & more. You can also cluster editions & variations of the same "Work."

Led by @thisismattmiller.com and supported by @post45data.bsky.social.

4 months ago 123 56 7 1

Are you a grad student working on post-1945 culture? Could your research benefit from incorporating some data, even minimally? Want feedback from journal editors?

This Post45 Data Collective virtual workshop may be for you!

Applications are due DECEMBER 1: data.post45.org/news/grad-wo...

4 months ago 15 25 1 1

The Post45 Data Collective invites graduate students in the humanities or adjacent fields to explore cultural data reflexively and collaboratively in a mini-workshop hosted virtually on Friday, March 13. Details here: data.post45.org/news/grad-wo...

5 months ago 8 10 0 1

Back again with the Selected British Literary Prizes dataset for #TidyTuesday. It was really nice to see LGBTQ+ data included🏳️‍🌈. I also had a look at ethnic diversity across prize institutions and winner vs. shortlist % for nominees from the top 15 unis.
lewis-ward.github.io/tidytuesday/...

5 months ago 6 2 0 0
A heatmap showing gender and ethnic representation in UK literary prizes, split into two panels for shortlisted authors and winners. The visualization displays counts across ethnicity categories (African, Asian, Black British, Caribbean, Irish, Jewish, Non-UK White, Non-White American, and White British) on the y-axis and gender categories (Man, Non-binary, Transman, and Woman) on the x-axis. Each cell shows the count with white text on a colored background, where darker colors indicate higher counts. White British men show the highest counts in both panels (120 shortlisted, 149 winners), followed by White British women (123 shortlisted, 105 winners). Non-UK White authors also show substantial representation (39 men and 87 women shortlisted; 40 men and 32 women winners). Most other ethnic groups show single-digit counts, with several cells having zero representation. The data reveals that White British authors increase from 46% of shortlisted to 61% of winners, while women decrease from 54% of shortlisted to 43% of winners.

I created a heatmap for this week's #TidyTuesday showing UK literary prize demographics with data from @post45data.bsky.social. White British authors account for 61% of winners vs. 46% shortlisted. Men win 56% vs. 46% shortlisted.

Code: github.com/gkaramanis/t...

#RStats #dataviz

5 months ago 12 2 0 0

This week's #TidyTuesday data was interesting. I decided to look into the Costa Book Awards and the gender distribution of their winners.

Data: github.com/stats33100/t...

5 months ago 8 2 0 0

Looked at the percentage of Oxbridge-educated people who have won various British Literary Prizes for #TidyTuesday. I used ggbrick and, as the bricks look like books, tried to make it look like they were in bookshelves with meh results.

Code here: tinyurl.com/bddsuuc3

#rstats | #dataviz

5 months ago 4 1 0 0

TWO great opportunities for graduate students in post45 literary studies in 2026! The Post45 Graduate Symposium @ Duke and @post45data.bsky.social’s virtual workshop for research with their data sets.

5 months ago 5 4 0 0

Excited to get to sit in on this workshop! Hopefully, @post45.bsky.social will get the opportunity to publish some articles that emerge from this P45DC event 🤩

5 months ago 6 4 1 0

We're excited to host a virtual workshop where graduate students can present work-in-progress that engages with data from the Post45 Data Collective in some way.

Editors from the P45DC, Public Books, Post45, and the 19th Century DC will be there to offer feedback.

Proposals are due December 1!

5 months ago 14 8 0 2