@qualcomm.bsky.social @machinelearning.bsky.social
Posts by Ramchalam K R
Excited to announce that our paper from #Qualcomm Canada has been accepted at #NeurIPS2025: "OmniDraft: A Cross-vocabulary, Online Adaptive Drafter for On-device Speculative Decoding".
Looking forward to sharing our work at NeurIPS 2025. @neuripsconf.bsky.social
Preprint - arxiv.org/abs/2507.02659
Thanks, everyone, for the wonderful and engaging discussions during our poster sessions and the live demo of our work - Stepping Forward On The Last Mile. Thank you.
arxiv.org/html/2411.04...
@neuripsconf.bsky.social
@qualcomm.bsky.social
Gosh, thanks. I got confused about whether this was another pack. Apologies :)
I'd like to be added. Thanks
Alright, thank you for the clarification.
I did work on structured pruning of weights a few years ago, and since we were focused on deployment to edge devices, it was critical. This approach on the activations/attention heads is interesting, although the inference graph of the base model wouldn't really change. Would love to discuss further.
I do have a few points of clarification. Maybe I will drop by at the poster session.
But stage 1 looks to me like structured pruning on the activations. What I am curious about is whether this approach helps improve inference. So I presume we won't need to do the compute for certain heads.
Very interesting work on LoFiT!
I'd like to be added to the pack, as I will be at NeurIPS 2024 as well. Thanks
@neuripsconf.bsky.social @qualcomm.bsky.social
Qualcomm at NeurIPS 2024: Our groundbreaking innovations and cutting-edge advancements in AI.
www.qualcomm.com/news/onq/202...
If you're attending NeurIPS 2024 in Vancouver, be sure to visit us at Qualcomm's booth #533 and our poster (#6102) on Wednesday - Stepping Forward on the Last Mile.
@neuripsconf.bsky.social
Our work at Qualcomm AI Research, Qualcomm Canada, was accepted at NeurIPS 2024. This work presents on-device training for any network (transformer, convolutional, or RNN architecture) using fixed-point (quantized) forward gradients (no backprop).
Paper link: arxiv.org/html/2411.04...
#Neurips2024
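For readers new to the idea of training with fixed-point forward gradients and no backprop, here is a rough sketch of one common forward-only scheme: perturb the weights along a random direction, run two forward passes, and use the difference in loss as a directional slope. The post does not spell out the estimator, so this is an assumption for illustration and not the paper's exact method. The toy loss, the helper names (`quantize`, `forward_gradient`, `train`), and the 8 fractional bits are all hypothetical.

```python
import random


def quantize(x, frac_bits=8):
    """Round x to a fixed-point grid with `frac_bits` fractional bits (assumed format)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def loss(w):
    """Toy quadratic loss with its minimum at w = [1.0, -2.0] (illustrative only)."""
    target = [1.0, -2.0]
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def forward_gradient(loss_fn, w, eps=0.05, rng=random):
    """Estimate the gradient from two forward passes along a random sign direction."""
    v = [rng.choice((-1.0, 1.0)) for _ in w]
    # Perturbed weights are kept on the fixed-point grid, as on a quantized device.
    w_plus = [quantize(wi + eps * vi) for wi, vi in zip(w, v)]
    w_minus = [quantize(wi - eps * vi) for wi, vi in zip(w, v)]
    # Central difference gives the slope of the loss along v; no backward pass is used.
    slope = (loss_fn(w_plus) - loss_fn(w_minus)) / (2 * eps)
    return [slope * vi for vi in v]


def train(steps=500, lr=0.05, seed=0):
    """Run plain gradient descent using only the forward-pass estimate."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(steps):
        g = forward_gradient(loss, w, rng=rng)
        w = [quantize(wi - lr * gi) for wi, gi in zip(w, g)]
    return w


if __name__ == "__main__":
    print(train())
```

Because every update is rounded to the fixed-point grid, the weights stop moving once the step is smaller than half a grid unit, so the result lands near the minimum but not exactly on it.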