[Image: Hubble Suite logo, a cloth patch bearing the names of the key organizations involved: USC, MPI, NVIDIA]
Announcing 🔭Hubble, a suite of open-source LLMs to advance the study of memorization!
Pretrained 1B- and 8B-parameter models, with controlled insertion of texts designed to emulate key memorization risks: copyright (e.g., book passages), privacy (e.g., synthetic biographies), and test set contamination.