#CommonPile
Paper page - The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Really happy to see a new #copyleft-based #LLM, and this one seems to be more general-purpose than earlier attempts such as #PleIAs. The #Comma model is trained on #CommonPile, a new training corpus with 8 TB of public domain and copyleft data. huggingface.co/papers/2506.052…