
Posts by Yining Lu

GitHub - yining610/Reliable-dRAG: Official repo for the paper "A Decentralized Retrieval Augmented Generation System with Source Reliabilities Secured on Blockchain"

🚀 One-line command for easy deployment: github.com/yining610/Re...

5 months ago

📣 My first system paper 📣 We built a decentralized RAG system that solves data reliability challenges in real-world settings. The sources provided by each data owner will be securely managed and scored on the blockchain.

🔗 Paper link: arxiv.org/abs/2511.07577

5 months ago

Work done during an internship at @amazon. Huge thanks to my mentor, @zlwang_cs, and advisor, @Meng_CS, for their support in making this work possible, and to collaborators @ShiyangLi5, Xin Liu, Changlong Yu, @YinQingyu, Zhan Shi, and @zhangzxUIUC for their valuable feedback!

7 months ago

8/8 [Convergence rate]
The gradient-based method consistently has a higher convergence rate, reducing the required steps by 6.1 on average across RL algorithms.

7 months ago

7/8 [Generalizability]
We further extend experiments to different math datasets and model families. Our two methods yield superior Pareto fronts compared to the baseline, with the gradient-based weighting showing the best overall performance.

7 months ago

6/8 [Gradient-based weight optimization]
Our method generates superior Pareto fronts that dominate all baseline approaches under both GRPO and REINFORCE training.
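(For readers new to the term: a Pareto front is the set of solutions not dominated by any other, i.e. no other solution is at least as good on every objective and strictly better on one. A minimal illustrative check, with hypothetical points and a maximization convention, not code from the paper:)

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization):
    a >= b on every objective and a > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Toy 2-objective scores; (0.5, 0.5) is dominated by (0.6, 0.6).
pts = [(0.8, 0.2), (0.6, 0.6), (0.5, 0.5), (0.9, 0.1)]
print(pareto_front(pts))  # [(0.8, 0.2), (0.6, 0.6), (0.9, 0.1)]
```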

7 months ago

5/8 [Hypervolume-guided weight adaptation]
Across all three online RL algorithms, there is consistently at least one weight configuration where our method outperforms the baselines on all objectives.
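(Side note on the metric: hypervolume measures the region, an area in 2-D, that a front dominates relative to a reference point, so a larger value means a better front. A toy sketch under maximization; values are made up, not from the paper:)

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Area dominated by a 2-D maximization front, relative to `ref`.

    Sorts points by the first objective (descending) and sums the
    staircase strips; dominated points contribute nothing.
    """
    hv, prev_y = 0.0, ref[1]
    for x, y in sorted(points, key=lambda p: p[0], reverse=True):
        if y > prev_y:
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv

# Toy front of three non-dominated points.
print(hypervolume_2d([(3, 1), (2, 2), (1, 3)]))  # 6.0
```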

7 months ago

Dynamic reward weights show that objectives learn differently. For example, accuracy is a more challenging objective that requires continual learning, while the conciseness weight quickly converges to 0.2.

4/8

7 months ago

3/8 [Preliminary finding]
Different objectives vary in learning difficulty. Each objective reaches saturation at different training stages.

7 months ago

Question: How do we redirect learning effort toward the objectives with the greatest potential for improvement?

Answer:
- If the user preference over objectives is given, use our hypervolume-based method.
- If the user preference is unknown, use our gradient-based method.
2/8
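(Either way, both methods maintain a weight vector over objectives that scalarizes the per-objective rewards at each training step. A minimal stdlib sketch of that scalarization; function names and values are illustrative, not from the paper:)

```python
def renormalize(weights, eps=1e-8):
    """Clip raw weights to be positive and rescale so they sum to 1."""
    clipped = [max(w, eps) for w in weights]
    total = sum(clipped)
    return [w / total for w in clipped]

def weighted_reward(rewards, weights):
    """Scalarize per-objective rewards with the current weight vector."""
    return sum(w * r for w, r in zip(weights, rewards))

# Toy step with two objectives [accuracy, conciseness]; values are made up.
weights = renormalize([0.7, 0.3])
print(round(weighted_reward([0.6, 0.9], weights), 2))  # 0.69
```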

7 months ago

โœด๏ธ Pleased to introduce our new paper yining610.github.io/dynamic-rew...

- Rebalances multiple objectives during training through dynamic reward weighting
- Builds Pareto-dominant fronts over static baselines across online RL algorithms, datasets, and model families
- Achieves a faster convergence rate

1/8

7 months ago
ACL2025: Optimizing Decomposition for Optimal Claim Verification

This is our teaser video 😀
youtu.be/TgloG4Oefeg

8 months ago

Can't make it to #ACL2025 this year, but for people interested in RL for factuality and textual decomposition, please check out our paper!

TL;DR: We found a mismatch between the decomposition policy and LLM verifier, and propose a dynamic training paradigm to bridge the gap.

8 months ago
Optimizing Decomposition for Optimal Claim Verification: Current research on the Decompose-Then-Verify paradigm for evaluating the factuality of long-form text typically treats decomposition and verification in isolation, overlooking their interact...

Pleased to share that two papers were accepted to #ACL2025 main! Huge congratulations to all collaborators for the hard work and time we put in together!

1. Dynamic Decomposition: arxiv.org/abs/2503.15354
2. RATIONALYST: arxiv.org/abs/2410.01044

Both works study multi-model collaboration!

11 months ago

Quick reminder that our paper, Benchmarking Language Model Creativity: A Case Study on Code Generation, will be presented today!

📅 11AM-12:30PM, Fri, May 2
📍 Hall 3
📝 arxiv.org/abs/2407.09007
🎥 www.youtube.com/watch?v=v1c...

11 months ago

Highlighting our #NAACL2025 papers 🧵🧵🧵

11 months ago

I will be at #NAACL2025 to present our LLM creativity benchmark. Drop by if interested (Poster Session 8, Fri, May 2)!

I'd love to chat about RL and its interpretability, data influence for post-training, and CogSci for LLMs. Feel free to reach out and let's have some coffee together ☕!

11 months ago
Benchmarking Language Model Creativity: A Case Study on Code Generation --- NAACL 2025 (Yining Lu)
Yining Lu: https://yining610.github.io/ Based on the following paper: https://arxiv.org/abs/2407.09007 As LLMs become increasingly prevalent, it is interesti...

A video teaser of @Yining__Lu 's paper:
www.youtube.com/watch?v=v1c...

11 months ago
Midwest Speech and Language Days 2025

Midwest Speech and Language Days will be held Apr 15-16 at
@NotreDame! Abstract submissions are due Mar 20, and the registration deadline is Mar 27. Financial assistance for students (lodging, poster printing) is available. nlp.nd.edu/msld25

1 year ago

A starter pack for #NLP #NLProc researchers! 🎉

go.bsky.app/SngwGeS

1 year ago