The BrowserGym Ecosystem for Web Agent Research
Thibault Le Sellier de Chezelles, Maxime Gasse, Alexandre Lacoste et al.
Action editor: Lingpeng Kong
https://openreview.net/forum?id=5298fKGmv3
#agentlab #agent #agents
New #Expert Certification: The BrowserGym Ecosystem for Web Agent Research
Just found this cool blog post discussing #AgentLab, #BrowserGym and #TapeAgent:
medium.com/@carolynduby...
We’re really excited to release this large collaborative work, which unifies web agent benchmarks under one roof.
In this TMLR paper, we take an in-depth look at #BrowserGym and #AgentLab. We also present some unexpected results from Claude 3.5 Sonnet.
Very excited to see this work coming out from @servicenowresearch.bsky.social. Can't wait to test a trained model in #AgentLab
AgentLab diagram. The image describes AgentLab, a framework for efficient parallel experiments with agents. It highlights: Core Agent Features: Dynamic Prompting and a Unified LLM API for interacting with large language models. BrowserGym Platform: A tool for testing agents on benchmarks like WebArena, WorkArena, MiniWoB, and others. Key Features: Reproducibility, a Unified Leaderboard, an analysis tool called Xray, and a Dataset for sharing agent traces. Blue elements represent AgentLab components.
🧵-1
We are thrilled to release #AgentLab, a new open-source package for developing and evaluating web agents. This builds on the new #BrowserGym package which supports 10 different benchmarks, including #WebArena.