LiteLong Enables Efficient Long-Context Data Synthesis for LLMs
LiteLong generates training samples of up to 128K tokens using hierarchical BISAC categories and BM25-based retrieval, achieving competitive results on the HELMET and RULER benchmarks. getnews.me/litelong-enables-efficie... #litelong #longcontext #bisc