In the last blog post, we discussed the various facets of human behavior and its occurrence at multiple scales, ranging from subtle facial expressions to eye, limb or full-body movements.
In order to identify the underlying processes as well as the ultimate “driving forces” of human behavior, researchers have developed intricate techniques for collecting qualitative and quantitative measures that are indicative of an underlying personality trait, an emotional or cognitive state, or a specific problem-solving strategy. Linking a construct to its measures in this way is referred to as operationalization: abstract or latent constructs such as intelligence, personality or performance can only be measured and quantified numerically once they are made visible, i.e., broken down into tangible, observable units.
For example, what are indicators of a person being a shopaholic? Measurable indicators might be the average time spent in department stores during a week, the cumulative amount of money laid out for certain lifestyle products, or the number of shoe boxes filling up the closet under the stairs.
While some measures might be more suitable to capture an underlying latent construct, others might fail. So the question is, what actually constitutes an appropriate measure? The three requirements that you might have encountered in academic or market research are objectivity, reliability, and validity. But what precisely do these criteria reflect?
Objectivity is the most general requirement and reflects the fact that measures should come to the same result no matter who is using them. They should also generate the same outcomes independent of outside influences. For example, a multiple-choice personality questionnaire or survey is objective if it returns the same score irrespective of whether the person completing the test responds verbally or in written form. Further, the result should be independent of the knowledge or attitude of the tester, so that the results are purely driven by the performance of the participant.
A measure is said to have high reliability if it returns the same value under consistent conditions. There are several sub-categories of reliability. For example, “retest reliability” describes the stability of a measure over time, “inter-rater reliability” reflects the extent to which different raters give consistent estimates of the same behavior, while “split-half reliability” splits a test into two halves and examines to what extent the two halves generate consistent results.
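To make the three kinds of reliability more concrete, here is a minimal Python sketch of how each one could be quantified. The choice of statistics is our assumption and not the only option: a Pearson correlation for retest reliability, an odd/even item split with the Spearman-Brown correction for split-half reliability, and Cohen's kappa for inter-rater agreement between two raters. All function names are hypothetical and purely illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def retest_reliability(first_session, second_session):
    """Stability over time: correlate scores from two sessions (assumed metric)."""
    return pearson(first_session, second_session)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per participant.

    Items are split by position (1st, 3rd, ... vs. 2nd, 4th, ...); the
    correlation of the two half-test totals is then stepped up with the
    Spearman-Brown formula to estimate full-test reliability.
    """
    half_a = [sum(row[0::2]) for row in item_scores]
    half_b = [sum(row[1::2]) for row in item_scores]
    r = pearson(half_a, half_b)
    return 2 * r / (1 + r)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters on categorical labels, corrected for chance."""
    n = len(rater_a)
    if n != len(rater_b) or n == 0:
        raise ValueError("need two non-empty label lists of equal length")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n) for label in labels
    )
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up illustration data, not real measurements.
    print(retest_reliability([10, 12, 15, 20], [11, 12, 16, 19]))
    print(split_half_reliability([[1, 2, 1, 2], [3, 3, 4, 3], [5, 4, 5, 5]]))
    print(cohens_kappa(["smile", "smile", "frown", "frown"],
                       ["smile", "frown", "frown", "frown"]))
```

Values close to 1 indicate high reliability in all three cases. Which coefficient is appropriate depends on the scale level of the data and on the number of raters, so treat this as a starting point rather than a recommendation.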
Validity is the final and most crucial criterion. It reflects the extent to which a measure captures what it is supposed to capture. Imagine a personality test in which people’s body size is measured. The measure is both objective and reliable, since it generates consistent results irrespective of the person taking the measurement, but it is a rather poor measure with respect to its construct validity (i.e., its capability to truly capture the desired underlying construct). Similarly, a driving test in which people are only asked to imagine driving hardly reflects the real driving situation in a vehicle (i.e., it has rather poor ecological validity).
If you have identified measures that fulfill objectivity, reliability and validity criteria at the same time, you are on the right track to generate outcomes that will push beyond the frontiers of our existing knowledge. Being passionate about human behavior research, the iMotions team is looking forward to assisting you in accomplishing synchronized experimental stimulation, multimodal recording and online/offline annotation of diverse physiological and behavioral data streams (eye tracking, facial expressions, EEG, EMG, GSR etc.).
See high-quality behavior research at the University of Nebraska-Omaha and Stanford University based on our technology.