Gaze Attention Estimation for Medical Environments

Natchapol Shinno

Yuki Furuya

Takeshi Saitoh

Haibo Zhang

Keiko Tsuchiya

Hitoshi Sato

Frank Coffey

Gaze attention estimation aims to understand where each person in a scene is looking. In this study, we introduce a new annotated dataset derived from medical simulation training videos, capturing diverse and authentic clinical scenarios from a practical medical environment, with ground-truth annotations obtained from eye-tracking devices (iMotions [1]) worn by medical professionals during the procedures. Most existing approaches to gaze prediction rely on object detection to guide the model. This becomes problematic for the specialised tools and equipment found in medical environments, a domain largely absent from standard large-scale object detection and segmentation datasets. To address this problem, we propose a gaze prediction framework that integrates head pose information, consisting of pitch, yaw, and roll, enabling the model to rely on gaze direction even when objects are not detected. The approach builds on the self-attention mechanism of vision transformers, which we expect to improve how the model captures the relationship between gaze direction and the scene. We hope to offer a more reliable framework for real-world medical applications.
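To make the fusion idea concrete, the sketch below shows one plausible way to inject head pose (pitch, yaw, roll) into a vision transformer: the three angles are embedded as an extra token so that self-attention can relate gaze direction to every scene patch. This is a minimal illustration, not the authors' implementation; the class name, dimensions, and the 2D gaze-point regression head are all assumptions.

```python
# Minimal sketch (assumed architecture, not the paper's code): a head-pose
# token is concatenated with ViT patch tokens and fused via self-attention.
import torch
import torch.nn as nn


class HeadPoseGazeViT(nn.Module):
    def __init__(self, img_size=224, patch_size=16, dim=256, depth=4, heads=8):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        # Patch embedding: split the image into non-overlapping patch tokens.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        # Head pose (pitch, yaw, roll) becomes one extra token, so attention
        # can relate gaze direction to the scene even when no object is detected.
        self.pose_embed = nn.Linear(3, dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        # Hypothetical output head: regress a normalized 2D gaze point.
        self.gaze_head = nn.Linear(dim, 2)

    def forward(self, image, head_pose):
        # image: (B, 3, H, W); head_pose: (B, 3) with pitch, yaw, roll in radians.
        tokens = self.patch_embed(image).flatten(2).transpose(1, 2)  # (B, N, dim)
        pose_token = self.pose_embed(head_pose).unsqueeze(1)         # (B, 1, dim)
        x = torch.cat([pose_token, tokens], dim=1) + self.pos_embed
        x = self.encoder(x)
        # Read the gaze prediction off the pose token's output embedding.
        return self.gaze_head(x[:, 0]).sigmoid()  # gaze point in [0, 1]^2


model = HeadPoseGazeViT()
image = torch.randn(2, 3, 224, 224)
pose = torch.randn(2, 3)
print(model(image, pose).shape)  # torch.Size([2, 2])
```

In this design choice, the pose token participates in every attention layer alongside the patch tokens, so gaze direction can modulate which scene regions are attended to without requiring any object detector in the loop.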
