VisualAgentBench (VAB) is the first benchmark designed to systematically evaluate and develop large multimodal models (LMMs) as visual foundation agents. It comprises 5 distinct environments across 3 ...