Abstract: With their prominent scene understanding and reasoning capabilities, pre-trained visual-language models (VLMs) such as GPT-4V have attracted increasing attention in robotic task planning.