We built Copilot Arena this fall to evaluate coding models in realistic, interactive environments.
Check out our recent writeup describing the results, as well as details of the system itself.
Work led by @waynechi.bsky.social and Valerie Chen.
Posts by Ameet Talwalkar
great to see more specialized ML conferences! Mega conferences are fun, but at least in my experience with MLSys, I've had much better scientific conversations at smaller ones.
Excited about L2G, led by Wenduo Cheng. We leverage LLMs to beat genomic FMs and strong supervised baselines on a wide range of benchmarks. L2G uses cross-modal transfer (rather than vanilla fine-tuning), and neural architecture search to learn a genomic-specific embedder model.
Can we bypass the resource bottleneck of pretraining genomic Foundation Models? Our work L2G repurposes language LLMs for genomics via cross-modal transfer, matching fine-tuned genomic FMs. Kudos to Wenduo & fantastic collab w/ @atalwalkar.bsky.social. L2G, language to genome; L2G, life’s too good!
Great writeup on the Chatbot Arena team, including a nice photo of @waynechi.bsky.social's back (in the purple shirt). It's been fun collaborating with this team via Copilot Arena (blog.lmarena.ai/blog/2024/co...), and I'm super impressed with their hustle!
www.wsj.com/tech/ai/the-...
Check out @junhongshen1.bsky.social's blog post describing this project in more detail:
blog.ml.cmu.edu/2024/12/06/s...
Excited to share this work! This was a fun project in collaboration with Scribe, and a great example of the power of open-source FMs when coupled with rich domain-specific data!
Hi Willie! Could you add me?
Could I be added?
if you're a PhD student at CMU doing AI/ML, lmk if you want to be added to this starter pack.
(I don't belong in this list, but I don't know how to remove myself from this pack 😂)
go.bsky.app/9APVxQQ