Plan active. Agents deployed. 🐟
Thanks #TinyFish for the boost! We’re using the Web Agent API to power VeriPura with human-like navigation and anti-bot protection.
Next stop: Recording our demo and shipping for the #TinyFishAccelerator. Let’s build. 🛠️
#VeriPura #WebAgents
AIMindUpdate News!
Memory is the wall for web agents. AgentFold may have the solution. Learn more! #AgentFold #WebAgents #AI
Click here↓↓↓
aimindupdate.com/2025/11/16/a...
PolicyGuardBench Introduces Guardrails for Web Agent Policy Compliance
PolicyGuardBench releases a dataset of about 60,000 labeled web-agent trajectories and a lightweight guardrail model, PolicyGuard-4B, that detects policy violations with fast inference. getnews.me/policyguardbench-introdu... #policyguard #webagents
Evaluating Web Agent Reliability: Introducing the WAREX Benchmark
The new WAREX benchmark adds network jitter, TLS errors and security threats to suites like WebArena, WebVoyager and REAL. Under moderate jitter, agents’ success rates fell below 50%. Read more: getnews.me/evaluating-web-agent-rel... #warex #webagents
FocusAgent Boosts Web Agent Efficiency by Trimming Large Page Contexts
FocusAgent trims web-agent observations by over 50% while keeping task success comparable to full-page baselines; its security variant reduces the success of prompt-injection attacks. Read more: getnews.me/focusagent-boosts-web-ag... #focusagent #webagents
WAInjectBench: Benchmark for Detecting Prompt Injection in Web Agents
WAInjectBench, a benchmark for detecting prompt injection in web agents, provides a dataset with malicious text snippets and images, and benign controls. Code and data are on GitHub. getnews.me/wainjectbench-benchmark-... #promptinj #webagents
Fine-Grained Evaluation Framework Improves Reliability of AI Web Agents
An evaluation framework splits AI web agents into perception, decision‑making, execution, and verification stages to pinpoint where errors occur. Tested on SeeAct and Mind2Web; paper posted 17 September 2025. Read more: getnews.me/fine-grained-evaluation-... #ai #webagents
ReSum: Enhancing Long‑Horizon Web Search with Context Summarization
ReSum condenses LLM web‑agent dialogues into compact reasoning states, letting agents keep searching past context limits. It yields a 4.5% boost over ReAct, rising to 8.2% with ReSum‑GRPO fine-tuning. Read more: getnews.me/resum-enhancing-long-hor... #webagents #summarization
#MagenticUI by #Microsoft
Human-centered web automation with a multi-agent system 🤖
#AI #automation #webagents #opensource #research #Python #Docker #AutoGen
🤝 Co-planning
Collaboratively create step-by-step plans using chat and a plan editor for transparent task execution.
🧵 👇
🔎 The new DeepLearningAI course is like a "School for AI Agents", only instead of martinis you get Monte Carlo Tree Search, self-critique, and DPO.
AgentQ teaches bots to handle a browser better than I handle my bookmarks.
➡️ deeplearning.ai/short-courses/building-ai-browser-agents
#AI #WebAgents
This is neat 🔥 I added my web agent to my bluesky profile and just passed it a copy of my private key, so the key is stored within the app and the app can trust it’s me.
I can load my P2P apps or I can get auto logged-in to my website’s WP admin.
Zero integration needed 🤓
#Agents #WebAgents