Gaze target estimation aims to determine where each person in a scene is looking. In this study, we introduce a new annotated dataset derived from medical simulation training videos, capturing diverse and authentic clinical scenarios from a practical medical environment. Ground-truth annotations are obtained from eye-tracking devices (iMotions [1]) worn by medical professionals during the recorded procedures. Most existing gaze prediction approaches rely on object detection to guide the model. This becomes problematic for the specialised tools and equipment found in medical environments, since this domain is largely absent from standard large-scale object detection and segmentation datasets. To address this problem, we propose a gaze prediction framework that integrates head pose information, consisting of pitch, yaw, and roll, enabling the model to infer gaze direction even when objects are not detected. The approach builds on the self-attention mechanism of vision transformers, which we expect to strengthen the model's ability to relate gaze direction to the scene. We hope this offers a more reliable framework for real-world medical applications.
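As a rough illustration of the idea described above, the following is a minimal sketch (not the authors' implementation) of fusing head-pose angles with vision-transformer patch tokens through self-attention. All names, dimensions, and the heatmap output head here are hypothetical assumptions made for the example; only the use of pitch, yaw, and roll as inputs and a ViT-style self-attention backbone come from the abstract.

```python
import torch
import torch.nn as nn

class HeadPoseGazeFusion(nn.Module):
    """Hypothetical sketch: fuse head-pose angles with ViT patch tokens via self-attention."""
    def __init__(self, embed_dim=256, num_heads=4, num_patches=196):
        super().__init__()
        # Project the 3 head-pose angles (pitch, yaw, roll) into a single token.
        self.pose_proj = nn.Linear(3, embed_dim)
        # Learnable positional embeddings for the pose token plus the scene patch tokens.
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Predict a gaze score per patch from the refined patch tokens (assumed output form).
        self.heatmap_head = nn.Linear(embed_dim, 1)

    def forward(self, patch_tokens, head_pose):
        # patch_tokens: (B, num_patches, embed_dim) scene features from a ViT backbone
        # head_pose:    (B, 3) pitch, yaw, roll
        pose_token = self.pose_proj(head_pose).unsqueeze(1)     # (B, 1, D)
        tokens = torch.cat([pose_token, patch_tokens], dim=1)   # (B, 1+N, D)
        tokens = tokens + self.pos_embed
        tokens = self.encoder(tokens)                           # joint self-attention
        heatmap = self.heatmap_head(tokens[:, 1:]).squeeze(-1)  # (B, N) gaze scores per patch
        return heatmap

# Usage with random tensors standing in for real ViT features and head-pose estimates.
model = HeadPoseGazeFusion()
patches = torch.randn(2, 196, 256)
pose = torch.randn(2, 3)
print(model(patches, pose).shape)  # torch.Size([2, 196])
```

The key design point this sketch illustrates is that the pose token and the scene patch tokens attend to each other in the same encoder, so gaze scores can be driven by head orientation even when no salient object is detected in the scene.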