Guessing that running BookNLP is part of the fun, but if you want to start w/ output files, your friends at HTRC have already run ~200k English-lang fiction vols through the pipeline and released all non-expressive data: htrc.atlassian.net/wiki/spaces/.... Unsure if any BSC vols are included though!
1 year ago