Thanks to @dallascard.bsky.social and @davidjurgens.bsky.social for their help on this project! We also received great feedback from members of the Blablablab and the CLC lab.
Posts by Ben Litterer
Interested in working with SPoRC? Our data, paper, and code for creating data and doing the analysis are freely available!
data: huggingface.co/datasets/bli...
paper: arxiv.org/abs/2411.07892
processing code: github.com/blitt2018/SP...
analysis code: github.com/blitt2018/SP...
We're excited for people to use this data to explore the dynamics of long-form conversation, linguistic style matching, the diffusion of information, power and prestige within the podcast ecosystem, and more!
What about the audio aspect of podcasts? We provide speaker-turn information along with audio features such as pitch, so future research can consider elements like emotion, humor, or sarcasm.
...Discussion of George Floyd was widespread across categories, with 21% of podcasts saying his name in at least one of their episodes in our time period. Furthermore, discussion of racial justice peaked quickly around George Floyd but transitioned to a longer-lasting focus on Black Lives Matter.
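For anyone who wants to try a statistic like the 21% figure on their own slice of the data, here is a minimal sketch of a podcast-level share: a podcast counts once if any of its episodes mentions the phrase. The `(podcast_id, transcript)` input shape, the function name, and the toy episodes are all assumptions made for illustration, not the actual SPoRC schema or the matching method used in the paper.

```python
import re


def share_of_podcasts_mentioning(episodes, phrase):
    """Fraction of podcasts with at least one episode containing `phrase`.

    `episodes` is an iterable of (podcast_id, transcript) pairs.
    This input shape is hypothetical, not the SPoRC schema.
    """
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    all_podcasts = set()
    mentioning = set()
    for podcast_id, transcript in episodes:
        all_podcasts.add(podcast_id)
        if pattern.search(transcript):
            mentioning.add(podcast_id)
    # Guard against an empty input so we never divide by zero.
    return len(mentioning) / len(all_podcasts) if all_podcasts else 0.0


# Toy, made-up episodes (not real data).
toy_episodes = [
    ("pod_a", "Today we talk about George Floyd and the protests."),
    ("pod_a", "An unrelated episode."),
    ("pod_b", "Sports recap."),
    ("pod_c", "Cooking tips."),
    ("pod_d", "A discussion of george floyd's legacy."),
]
share = share_of_podcasts_mentioning(toy_episodes, "George Floyd")
print(share)
```

Counting podcasts rather than episodes keeps shows that publish many episodes from dominating the share.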
How does the podcast ecosystem react to major events? As a case study, we consider collective attention in the podcast ecosystem following the murder of George Floyd in 2020...
A network figure where podcasts are connected by edges if they have hosted the same guest. Color is assigned based on self-ascribed podcast category labels. Layout is determined with the force-directed Yifan-Hu algorithm. Podcasts in the same category appear closer.
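The edge rule in this figure (two podcasts are linked if they have hosted the same guest) can be sketched in a few lines. This is a rough illustration under assumptions: the `guests_by_podcast` mapping, the function name, and the toy names are hypothetical, and it covers only the edge construction, not the Yifan-Hu layout or the host/guest inference.

```python
from collections import defaultdict
from itertools import combinations


def shared_guest_edges(guests_by_podcast):
    """Return {(podcast_a, podcast_b): number_of_shared_guests}.

    `guests_by_podcast` maps a podcast id to the set of guests it hosted.
    This input shape is hypothetical, not the SPoRC schema.
    """
    # Invert the mapping: guest -> set of podcasts that hosted them.
    podcasts_by_guest = defaultdict(set)
    for podcast, guests in guests_by_podcast.items():
        for guest in guests:
            podcasts_by_guest[guest].add(podcast)

    # Every pair of podcasts that shares a guest gets an edge;
    # the weight counts how many distinct guests they share.
    edges = defaultdict(int)
    for podcasts in podcasts_by_guest.values():
        for a, b in combinations(sorted(podcasts), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Toy, made-up data (not real podcasts or guests).
toy = {
    "pod_business": {"guest_1", "guest_2"},
    "pod_news": {"guest_2", "guest_3"},
    "pod_sports": {"guest_4"},
}
edges = shared_guest_edges(toy)
print(edges)
```

Inverting to guest-keyed sets first avoids comparing every pair of podcasts directly, which matters when most pairs share no guest.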
How do the creators of podcast content exchange ideas and form communities? We find that the Business, Sports, and News categories form communities through shared guests, whereas other large categories such as Religion and Society do not
A figure where podcast episodes are projected such that distance indicates topical similarity. Color is assigned based on the self-ascribed podcast category label.
Podcasts have categories, but how similar are podcasts within categories in terms of what they talk about? In our content analysis, we find it's mixed! Some topics belong to distinct categories, but other topics like "racial justice" or "spirituality" cut across many categories!
SPoRC covers nearly all English episodes during May-June 2020, with transcripts + host/guest inferences for over 1M episodes, and audio features + speaker turns for over 370K episodes. Using this data, we study the content, structure, and responsiveness of the podcast ecosystem