Thrilled to share our new preprint on Reinforcement Learning for Reverse Engineering (RLRE)!
We demonstrate that human preferences can be reverse engineered effectively by pipelining LLMs to optimise upstream preambles via reinforcement learning.
Posts by Max Bartolo
Massive shoutout to all our fantastic contributors, collaborators and partners who made this possible!
Model weights are available for research purposes at:
Command A: huggingface.co/CohereForAI/...
Command R7B: huggingface.co/CohereForAI/...
You can find the full tech report at cohere.com/research/pap...
I'm excited to share the tech report for our @cohere.com @cohereforai.bsky.social Command A and Command R7B models. We highlight our novel approach to model training, including self-refinement algorithms and model merging techniques at scale. Read more below!
I really enjoyed my MLST chat with Tim @neuripsconf.bsky.social about the research we've been doing on reasoning, robustness and human feedback. If you have an hour to spare and are interested in AI robustness, it may be worth a listen.
Check it out at youtu.be/DL7qwmWWk88?...
That's very cool! There's definitely a lot happening in the space and most people are doing some version of this, but I haven't come across a well-organised collection of tools like this yet -- could be quite impactful!
Check out @lisaalaz.bsky.social's internship work with us @cohere.com questioning the rationale behind rationales.
Super excited to see PRISM recognised as a #NeurIPS2024 best paper. This was an incredible large-scale effort by @hannahrosekirk.bsky.social and fantastic collaborators. If you're interested in human feedback, check it out, there are 100+ pages of detailed insights!
Our paper PRISM alignment won a best paper award at #neurips2024!
All credits to @hannahrosekirk.bsky.social A. Whitefield, P. Röttger, A. M. Bean, K. Margatina, R. Mosquera-Gomez, J. Ciro, @maxbartolo.bsky.social H. He, B. Vidgen, S. Hale
Catch Hannah tomorrow at neurips.cc/virtual/2024/poster/97804
Excited to reveal Genie 2, our most capable foundation world model that, given a single prompt image, can generate an endless variety of action-controllable, playable 3D worlds. Fantastic cross-team effort by the Open-Endedness Team and many other teams at Google DeepMind!
Looking forward to @neuripsconf.bsky.social #NeurIPS #NeurIPS2024 in Vancouver next week!
Reach out (or pop by the @cohere.com booth) if you want to chat about human feedback, robustness and reasoning, prompt optimisation, adversarial data, glitch tokens, evaluation, or anything else!
Couldn't agree with you more, Laura is incredible!
Sparks of multi-hop reasoning
Fun to see Douwe's Dynabench plot continue to inspire new groundbreaking benchmarking work!
Awesome, thanks!
@mariaa.bsky.social I'm new here so apologies if this is a noob question, but is there a way I can recommend folks to be added to starter packs?
LLMs can learn to reason from procedural knowledge in pretraining data! I particularly enjoy research where the evidence contradicts our initial hypothesis. If you're interested in LLM reasoning, check out the 60+ pages of in-depth work at arxiv.org/abs/2411.12580
We launched Judge Arena with @huggingface.bsky.social
@clefourrier.bsky.social - a platform that lets you easily compare models as judges side-by-side and vote for the best evaluation
Check out the live leaderboard and start voting now!