How accurate is Webcam Eye Tracking? This question becomes more topical as software that uses webcams for face and eye tracking becomes more sophisticated. Last year, iMotions launched its new cloud-based “Online Data Collection” module, featuring several new tools that enable researchers to conduct studies in respondents’ homes through their personal computers. One of the most anticipated tools in that module is the option to conduct eye tracking studies through respondents’ personal webcams. This is a break from the traditional way of conducting eye tracking research, because biosensor research usually requires a lot of hardware. That is both one of the benefits and one of the drawbacks of working in the field of human behavior research. On the one hand, you work on the cutting edge of technological development, but on the other hand, cutting-edge tech often comes with a price tag to match. If you want to conduct the most accurate research, you have to make an investment that complements your research aspirations. That will most likely be true for many years to come. But that is not to say that a significant investment is a requirement for conducting any research, far from it.
Webcam eye tracking is integrated into our online data collection solution, and whereas traditional eye tracking research is quite bound to the lab, online data collection was developed to be the exact opposite. The central feature of the solution, which we will get into later in this article, is scalability. Favoring scalability requires a few sacrifices in terms of accuracy and overall data quality, but for many research applications, quantitative data collection can be more valuable than a few very precise studies.
While nominally covering the same ground in terms of data collection, screen-based and webcam eye tracking are very different beyond that. Firstly, it is important to note that dedicated eye tracking hardware is expensive for a reason, and in this case, price does follow quality. So, before you purchase your eye tracking hardware and software, it is important to understand the different strengths and weaknesses of both screen-based and webcam-based eye tracking.
Screen-Based Eye Tracking: Pros & Cons
- + Data Quality: Accuracy. Eye tracking accuracy means the spatial offset of the data, i.e. how good the eye tracker is at measuring exactly where you were looking. This is often expressed in degrees of visual angle, and the smaller this value, the better your eye tracker.
- + Data Quality: Sampling rate. The sampling rate indicates how fast your eye tracker is, i.e. its temporal resolution when capturing eye movements. Dedicated eye trackers can have a sampling rate from 30 Hz up to 1200 Hz, which means that they record a respondent’s eye position anywhere from once every 33 ms to once every 0.8 ms. As a rule of thumb, your sampling rate needs to be at least twice as high as the fastest component of the signal you want to capture. If you are only interested in fixations, i.e. “where people are looking”, then a low sampling rate is totally fine. For basic eye movement research or psychophysics, one typically uses rather high sampling rates.
- + More than only gaze data: Dedicated eye trackers do not only measure the gaze, but also give data on pupil dilation, distance to the screen, and (for some systems) even head position or blinks.
- + Data collection in changing conditions. Because of their infrared cameras, dedicated eye trackers can capture data even when the environment changes somewhat, and they even work in the dark. Recordings with a certain amount of head movement and light change can still produce usable data.
- – Rigid setup. Dedicated eye tracking hardware only works in a “one eye tracker/one station” setup, which means that you will need one computer with the iMotions Software Suite installed to run one eye tracker.
- – Data Collection Scalability. Scaling a data collection effort, i.e. a research lab, will be costly in one of two ways. You can either scale up the number of invited respondents or invest in more lab stations. The first will be costly in resources and time, and the second will be costly in money.
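To make the sampling rate figures in the list above concrete, here is a minimal Python sketch of the arithmetic behind them. The function names are our own illustration and not part of any iMotions software; the rule of thumb is applied exactly as stated above (sampling rate at least twice the fastest signal component).

```python
def sampling_interval_ms(rate_hz: float) -> float:
    """Time between two consecutive samples, in milliseconds."""
    if rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return 1000.0 / rate_hz


def min_sampling_rate_hz(fastest_signal_hz: float) -> float:
    """Rule-of-thumb minimum rate: twice the fastest signal component."""
    return 2.0 * fastest_signal_hz


# 30 Hz and 1200 Hz are the two ends of the range mentioned above;
# 60 Hz is added purely as an illustrative middle value.
for rate in (30, 60, 1200):
    interval = sampling_interval_ms(rate)
    print(f"{rate:>5} Hz -> one sample every {interval:.1f} ms")
```

A 30 Hz webcam therefore delivers one gaze sample roughly every 33 ms, which is enough for fixation-level questions but too coarse for studying fast eye movements in detail.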
Webcam Eye Tracking: Pros & Cons
- + Data Collection Scalability. Everyone with a functioning webcam can potentially join your study from anywhere in the world, which makes quantitative data collection much more convenient.
- + Price. By purchasing the iMotions core software and Online Data Collection modules you can conduct studies using only a computer capable of running the software. No additional hardware is required – aside from your respondents’ webcams.
- – Overall data quality. It stands to reason that the data quality will reflect the hardware used. It is important not to expect a standard consumer-grade webcam to produce data of the quality and accuracy that a piece of dedicated hardware can deliver.
- – Rigid environment. Webcam eye tracking needs much more consistent surroundings to produce data of sufficient quality than a dedicated eye tracker. Head movement should be minimal to none, light sources should be ample and consistent, and the internet connection stable.
Webcam eye tracking accuracy in practice.
To ascertain how webcam-based eye tracking works in practice, we ran an in-house accuracy test comparing our webcam algorithm against the industry gold standard of screen-based eye tracking. We wanted to see how the webcam algorithm would fare against the data quality of the most precise screen-based eye tracker on the market. However unfair that comparison might be, we would never send something to the market that we could not vouch for.
As stated above, the study set out to measure the accuracy of, and subsequently the ideal conditions of use for, the webcam eye tracking algorithm. Below we go through the various conditions of the study setup, but if you want to study the findings of the study in detail you can download the whitepaper here.
Stimuli were presented on a 22” computer screen in a dimly lit room. Respondents were sitting in front of a neutral grey wall at a distance of 65 cm to the web camera, and a reading lamp illuminated the respondents’ faces from the front. Web camera data was collected with a Logitech Brio camera sampling at 30 Hz with a resolution of 1920×1080 px. Simultaneously, screen-based eye tracking data was collected with a top-of-the-line screen-based eye tracker without a chinrest. Respondents were instructed to sit perfectly still and not to talk.
Aside from the ideal conditions described above, four extra conditions were tested – participants wearing glasses, a low web camera resolution, suboptimal face illumination, or having the respondent move and talk.
Under ideal conditions, without any manipulations (glasses, low-res webcam, low facial illumination, and head movement), the webET had an average accuracy offset of 4.3 dva (for the screen-based eye tracker data, average accuracy was 0.6 dva). WebET data from a fourth of all trials had average offsets larger than 5 dva.
Note: “dva” is short for degrees of visual angle, the unit in which eye tracking accuracy is expressed. It describes the angle, measured at the eye, between the point a respondent is actually looking at and the point the eye tracker reports.
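To get a feel for what an offset in degrees of visual angle means on an actual screen, it can be converted into a distance. The sketch below is our own illustration, not part of the study. It assumes the gaze target sits straight ahead of the respondent and that the eye-to-screen distance equals the 65 cm camera distance reported above; the function name is hypothetical.

```python
import math


def offset_on_screen_cm(offset_dva: float, viewing_distance_cm: float) -> float:
    """Convert an angular gaze offset into a distance on the screen.

    Assumes the true gaze target lies straight ahead of the eye, so the
    offset is simply distance * tan(angle).
    """
    return viewing_distance_cm * math.tan(math.radians(offset_dva))


# Assumption: eye-to-screen distance equals the 65 cm camera distance.
VIEWING_DISTANCE_CM = 65.0

for label, dva in (("webcam, ideal conditions", 4.3), ("screen-based", 0.6)):
    cm = offset_on_screen_cm(dva, VIEWING_DISTANCE_CM)
    print(f"{label}: {dva} dva is about {cm:.1f} cm on screen")
```

Under these assumptions, the 4.3 dva average of the webcam algorithm corresponds to an error of roughly 5 cm on the screen, while the 0.6 dva of the screen-based eye tracker stays well below 1 cm. This is worth keeping in mind when deciding how large your areas of interest need to be.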
Of all the extra conditions, head and body movement had the largest impact on webET accuracy. Whilst data from the screen-based eye tracker confirmed that respondents correctly maintained their gaze on the targets (and average accuracy was 0.7 dva), the webET algorithm succeeded in calculating gaze data for 98% of the trials with moving respondents, with an average offset of 7.1 dva. 61% of the trials had an offset larger than 5 dva.
This means that it is crucial that respondents using the webcam eye tracker sit absolutely still during studies in order to provide usable data.
Strong sidelight (light from a window, backlight, etc.) also caused larger data offsets, with an average accuracy of 5.6 dva for webET (for the screen-based eye tracker, average accuracy was 0.6 dva), and around 57% of trials had average offsets of more than 5 dva for webET data.
Lower camera resolutions also caused some, but not as pronounced, increase in data offsets with an average accuracy of 5.0 dva (for the screen-based eye tracker, average accuracy was 0.6 dva) and a third of trials showing an average offset of webET data above 5 dva.
For the 5 respondents who were re-recorded wearing glasses, the largest average offset of 7.5 dva was observed for webET (for the screen-based eye tracker, average accuracy was 0.9 dva) and 75% of the trials had an average offset of webET data higher than 5 dva.
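Each condition above is summarized by two numbers: the average offset and the share of trials with an offset above 5 dva. As a minimal sketch of how such a summary could be computed from per-trial offsets, consider the Python snippet below. The function name and the trial values are made up for illustration; they are not data from the study, and this is not necessarily how the whitepaper analysis was implemented.

```python
from statistics import mean

THRESHOLD_DVA = 5.0  # the cut-off used in the text above


def summarize_offsets(trial_offsets_dva, threshold_dva=THRESHOLD_DVA):
    """Return the mean offset and the share of trials above the threshold."""
    if not trial_offsets_dva:
        raise ValueError("need at least one trial")
    above = sum(1 for offset in trial_offsets_dva if offset > threshold_dva)
    return {
        "mean_offset_dva": mean(trial_offsets_dva),
        "share_above_threshold": above / len(trial_offsets_dva),
    }


# Made-up per-trial offsets in dva, NOT data from the study.
example_trials = [3.1, 4.0, 5.6, 2.8, 6.2, 3.9, 4.4, 4.7]

summary = summarize_offsets(example_trials)
print(f"mean offset: {summary['mean_offset_dva']:.2f} dva")
print(f"trials above {THRESHOLD_DVA} dva: {summary['share_above_threshold']:.0%}")
```

Reporting the share of trials above a threshold alongside the mean is useful because a handful of very poor trials can hide behind an acceptable average.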
What this study shows is that the accuracy of webcam eye tracking is negatively affected by any combination of the aforementioned distorting factors (moving, talking, bad lighting, and wearing glasses). When ideal conditions are met, however, webcam eye tracking showed good data consistency and viable data quality. To get the best data quality out of webcam eye tracking, respondents must therefore follow all best-practice instructions from the study organizer.
Even though the data quality of webcam eye tracking is not really comparable to dedicated eye tracking hardware, it is the perfect option for scaling your research. We like to think of it as making our clients amongst the first to be able to conduct quantitative human behavior research.
If you plan on conducting continuous eye tracking research where accuracy and precision are key, then investing in proper hardware is well worth it. But if you are setting out to conduct bulk UX testing, A/B testing, or image/video studies with eye tracking, we are certain that this new feature will be a valuable tool for you.
If you are interested in learning more about how webcam eye tracking can help you scale your research and reach a global audience through the online data collection platform, please visit our iMotions Online page.