
VGBench: Evaluating Vision-Language Models in Real-Time Gaming Environments
Introduction
Vision-Language Models (VLMs) have achieved strong results on tasks such as coding and mathematical reasoning, in some cases matching or surpassing human performance. However, their ability to perform tasks that require human-like perception, spatial navigation, and memory management remains underexplored. To address this gap, the paper titled "VideoGameBench: Can Vision-Language Models complete