A few months ago, iMotions launched its new cloud-based Online Data Collection Module, with several new tools that let researchers conduct studies in respondents’ homes through their personal computers. One of the most anticipated tools in that module is the option to conduct eye tracking studies through respondents’ personal webcams. This is a break from the traditional way of conducting eye tracking research, because biosensor research usually requires a lot of hardware. That is both one of the benefits and one of the drawbacks of working in the field of human behavior research. On the one hand, you work on the cutting edge of technological development; on the other hand, cutting-edge tech often comes with a price tag to match. If you want to conduct the most accurate research, you have to make an investment that complements your research aspirations. That will most likely be true for many years to come. But that is not to say that a significant investment is a requirement for conducting any research at all. Far from it.

Webcam eye tracking is integrated into our Online Data Collection (ODC) module. Whereas traditional eye tracking research is largely bound to the lab, the ODC was developed to be the exact opposite. The central feature of the ODC module, which we will get into later in this article, is scalability. Favoring scalability requires a few sacrifices in terms of accuracy and overall data quality, but for many research applications, collecting data from many respondents can be more valuable than running a few very precise studies.

While nominally covering the same ground in terms of data collection, screen-based and webcam eye tracking are very different beyond that. Firstly, it is important to note that dedicated eye tracking hardware is expensive for a reason, and in this case, price does follow quality. So, before you purchase your eye tracking hardware and software, it is important to understand the different strengths and weaknesses of both screen-based and webcam-based eye tracking.

Screen-Based Eye Tracking: Pros & Cons

  • + Data Quality: Accuracy. Eye tracking accuracy means the spatial offset of the data, i.e. how good the eye tracker is at measuring where exactly you were looking. This is often expressed in degrees of visual angle, and the smaller this value, the better your eye tracker.
  • + Data Quality: Sampling rate. The sampling rate indicates how fast your eye tracker is, that is, its temporal resolution at capturing eye movements. Dedicated eye trackers can have a sampling rate from 30 Hz up to 1200 Hz, which means that they record a respondent’s eye position once every 33 ms down to once every 0.8 ms. As a rule of thumb, your sampling rate needs to be at least twice as high as the highest frequency in the signal you want to capture. If you are only interested in fixations, i.e. “where people are looking”, then a low sampling rate is totally fine. Basic eye movement research and psychophysics typically use rather high sampling rates.
  • + More than only gaze data: Dedicated eye trackers measure not only gaze but also pupil dilation, distance to the screen, and (for some systems) head position or blinks.
  • + Data collection in changing conditions. Because of their infrared cameras, dedicated eye trackers can capture data while the environment changes to some degree, and they even work in the dark. Recordings with a certain amount of head movement and light change can still produce usable data.
  • – Rigid setup. Dedicated eye tracking hardware only works in a “one eye tracker/one station” setup, which means that you will need one computer with the iMotions Software Suite installed to run one eye tracker.
  • – Data Collection Scalability. Scaling a data collection effort, i.e. a research lab, will be costly in one of two ways. You can either scale up the number of invited respondents or invest in more lab stations. The former is costly in resources and time, and the latter is costly in money.
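
The sampling-rate figures in the list above follow from simple arithmetic. As a minimal sketch (the function names are illustrative and not part of any iMotions API), the interval between samples is 1000 divided by the rate in Hz, and the rule of thumb means a given rate can capture signal frequencies up to half that rate:

```python
def sample_interval_ms(rate_hz: float) -> float:
    """Time between two consecutive samples, in milliseconds."""
    if rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return 1000.0 / rate_hz


def max_capturable_frequency_hz(rate_hz: float) -> float:
    """Highest signal frequency a sampling rate can capture (half the rate)."""
    return rate_hz / 2.0


# 30 Hz is a typical webcam rate; 300 Hz and 1200 Hz are dedicated-tracker rates.
for rate in (30, 300, 1200):
    print(
        f"{rate:>4} Hz -> one sample every {sample_interval_ms(rate):.2f} ms, "
        f"captures signals up to {max_capturable_frequency_hz(rate):.0f} Hz"
    )
```

At 30 Hz this gives roughly 33 ms between samples, and at 1200 Hz roughly 0.8 ms, matching the range quoted above.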

Webcam Eye Tracking: Pros & Cons

  • + Data Collection Scalability. Everyone with a functioning webcam can potentially join your study from anywhere in the world, which makes quantitative data collection much more convenient.
  • + Price. By purchasing the iMotions core software and Online Data Collection modules you can conduct studies using only a computer capable of running the software. No additional hardware is required – aside from your respondents’ webcams.
  • – Overall data quality. It stands to reason that the data quality will reflect the hardware used. It is important not to expect a standard consumer-grade webcam to produce data of the quality and accuracy that a piece of dedicated hardware can deliver.
  • – Rigid environment. Webcam eye tracking needs much more consistent surroundings than a dedicated eye tracker to produce data of sufficient quality. Head movement should be minimal to none, light sources should be ample and consistent, and the internet connection should be stable.

Webcam eye tracking accuracy in practice.

To find out how webcam-based eye tracking works in practice, we ran an in-house accuracy test comparing our webcam algorithm against the industry gold standard, the Tobii Pro Spectrum. We wanted to see how the webcam algorithm would fare against the data quality of the most precise screen-based eye tracker on the market. However unfair that comparison might be, we would never send something to market that we could not vouch for.

Study findings.

As stated above, the study set out to measure the accuracy of the webcam eye tracking algorithm and, from that, its ideal conditions of use. Below we go through the various conditions of the study setup, but if you want to examine the findings in detail you can download the whitepaper here.

Methodology

Stimuli were presented on a 22” computer screen in a dimly lit room. Respondents sat in front of a neutral grey wall at a distance of 65 cm from the web camera, and a reading lamp illuminated their faces from the front. Web camera data was collected with a Logitech Brio camera sampling at 30 Hz with a resolution of 1920×1080 px. Simultaneously, screen-based eye tracking data was collected with a Tobii Pro Spectrum sampling at 300 Hz without a chinrest. Respondents were instructed to sit perfectly still and not to talk.

Aside from the ideal conditions described above, four extra conditions were tested: respondents wearing glasses, a low web camera resolution, suboptimal face illumination, and respondents moving and talking.

Ideal Conditions

Under ideal conditions, without any of the manipulations (glasses, low-res webcam, low facial illumination, and head movement), the webcam eye tracking (webET) algorithm had an average accuracy offset of 4.3 dva (for Tobii Pro Spectrum data, the average accuracy was 0.6 dva). WebET data from a fourth of all trials had average offsets larger than 5 dva.

Note: “dva” is short for degrees of visual angle, the unit in which eye tracking accuracy is expressed. It describes the angular distance, as seen from the respondent’s eye, between the measured gaze point and the point the respondent was actually looking at.
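
To get a feel for what an offset in degrees of visual angle means on the screen, it can be converted to a physical distance with basic trigonometry. The sketch below is illustrative only: the helper name is made up, and it assumes the eye-to-screen distance is roughly the 65 cm camera distance from the methodology, which the study does not state explicitly.

```python
import math


def dva_to_cm(offset_dva: float, viewing_distance_cm: float) -> float:
    """On-screen size (cm) subtended by a visual angle at a given distance."""
    half_angle_rad = math.radians(offset_dva) / 2.0
    return 2.0 * viewing_distance_cm * math.tan(half_angle_rad)


# Assumption: eye-to-screen distance is about the same as the camera distance.
VIEWING_DISTANCE_CM = 65.0

for label, offset in (("webET", 4.3), ("Tobii Pro Spectrum", 0.6)):
    size_cm = dva_to_cm(offset, VIEWING_DISTANCE_CM)
    print(f"{label}: {offset} dva is about {size_cm:.1f} cm on screen")
```

Under that assumption, 4.3 dva corresponds to roughly 4.9 cm on the screen, while 0.6 dva corresponds to roughly 0.7 cm.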

[Figure: Webcam ET accuracy 2]

Movement

Of all the extra conditions, head and body movement had the largest impact on webET accuracy. While data from the Tobii Pro Spectrum confirmed that respondents correctly maintained their gaze on the targets (with an average accuracy of 0.7 dva), the webET algorithm failed to calculate gaze data for half of the trials. The remaining webET data had an offset of 19 dva, and close to 90% of all trials had average offsets larger than 5 dva.

This means that respondents using the webcam eye tracker must sit absolutely still during studies in order to provide usable data.

Sidelight

Strong sidelight (light from a window, backlight, etc.) also caused larger data offsets, with an average accuracy of 7.5 dva for webET (for the Tobii Pro Spectrum, the average accuracy was 0.6 dva), and around 70% of trials had average offsets of more than 5 dva for webET data.

Low resolution

Lower camera resolution also caused an increase in data offsets, though a less pronounced one, with an average accuracy of 5.4 dva (for the Tobii Pro Spectrum, the average accuracy was 0.6 dva) and a third of trials showing an average webET offset above 5 dva.

Glasses

For the 5 respondents who were re-recorded wearing glasses, a slightly larger average offset of 4.7 dva was observed for webET (for the Tobii Pro Spectrum, the average accuracy was 0.9 dva), and a quarter of the trials had an average webET offset higher than 5 dva.
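
Each condition above is reported with the same summary figures: an average offset, the share of trials above 5 dva, and, for the movement condition, the share of trials without gaze data. As a rough sketch of how such a summary could be computed from per-trial offsets, consider the following. The function name and the numbers are made up for illustration and are not data from the study. It also assumes the share above the threshold is taken over trials that produced gaze data, which the article does not specify.

```python
from statistics import mean
from typing import Optional, Sequence

THRESHOLD_DVA = 5.0  # cut-off used in the results above


def summarize_offsets(offsets_dva: Sequence[Optional[float]]) -> dict:
    """Summarize per-trial offsets; None marks a trial with no gaze data."""
    valid = [o for o in offsets_dva if o is not None]
    if not valid:
        raise ValueError("no trials with gaze data")
    return {
        "mean_offset_dva": mean(valid),
        # Assumption: share is computed over trials that produced gaze data.
        "share_above_threshold": sum(o > THRESHOLD_DVA for o in valid) / len(valid),
        "share_missing": (len(offsets_dva) - len(valid)) / len(offsets_dva),
    }


# Made-up per-trial offsets for illustration only, not data from the study.
example_trials = [3.1, 4.0, None, 6.2, 2.7, None, 5.5, 4.5]
print(summarize_offsets(example_trials))
```

Keeping missing trials separate from the average matters here: in the movement condition, half of the trials produced no gaze data at all, which an average offset alone would hide.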

[Figure: WebET accuracy 1]

Conclusion

This study shows that the accuracy of webcam eye tracking is negatively affected by any combination of the aforementioned distorting factors (moving, talking, bad lighting, and wearing glasses). When ideal conditions are met, however, webcam eye tracking showed good data consistency and viable data quality. To get the best data quality out of webcam eye tracking, respondents must therefore follow all of the study organizer’s best-practice instructions.

Even though the data quality of webcam eye tracking is not really comparable to dedicated eye tracking hardware, it is the perfect option for scaling your research. We like to think of it as making our clients amongst the first to be able to conduct quantitative human behavior research.

If you plan on conducting continuous eye tracking research where accuracy and precision are key, then investing in proper hardware is well worth it. But if you are setting out to conduct bulk UX testing, A/B testing, or image/video studies with eye tracking, we are certain that this new feature will be a valuable tool for you.

If you would like to know more about how webcam eye tracking can help you scale your research and reach a global audience through the online data collection platform, please go to our Online Data Collection page here:

iMotions Online Data Collection Module