Call the right play

Charles Young, Ameritest – This article introduces a moment-by-moment picture-sorting technique for measuring the emotional content of TV commercials. This new dynamic approach to analyzing the dramatic structure of commercials is shown to be a valid predictor of purchase intent.

Socrates tells us in the Phaedrus that “Oratory is the art of Enchanting the Soul.” He describes the soul as a chariot pulled by two horses: the white horse of reason and the dark horse of human emotion. This first description of what, in its modern incarnation, is advertising has framed a debate that occupies us today—the dual roles of reason and emotion in the art of advertising. Since Socrates also counsels us that the “eye is the most penetrating of all the senses,” we focus our attention on the oratory of our age, television commercials.

At issue is not whether emotion in advertising matters; as far as we’re concerned, we’ll take Socrates’ word for it. The research problem we would like to address is one of measurement. As advertising research practitioners, we want to know the answer to a simple question: what is a valid and practical way to measure the emotional content of a television commercial that would provide useful insights for our clients into the differences between effective and ineffective advertising?

Advertising researchers, of course, purport to measure how enchanting advertising is with a variety of measurement constructs: related recall, attention-getting power, liking, purchase intent, and persuasion. The role the dark horse plays in determining how a television commercial scores on each of these research report card measures has been explored many times before. Therefore, before suggesting a new approach to how the emotion in an ad might be measured, we should first briefly review what others have said on the subject.

Emotion and Report Card Measures

Using an interesting scheme for classifying 168 television commercials into emotional, mood, humorous, or emotionally neutral types, Walker (1988) found no positive correlation between emotional or mood commercials and related recall as measured by the ASI system. He acknowledged, however, that his classification scheme was not based on actual measurement of the emotion in the ads in his sample. Indeed, on the subject of recall, Zielski (1982) reported that day-after recall scores may actually penalize emotional advertising.

In one of the most extensive reviews of in-market sales effects conducted to date, Lodish et al. (1995) examined 389 split cable test markets and found no evidence of a relationship between related recall scores and sales effects for either new products or established brands. It appears that the widely used report card measure of day after recall does not capture the potential sales impact of emotion in advertising.

On the positive side, the famous ARF Copy Validity Study (Haley and Baldinger, 1991) found the strongest overall predictor of sales among the major copytesting measures to be liking of the advertising. More recent research has attempted to understand the relationship between liking and other pre-testing measures. For example, in their analysis of ASI copytesting measures, Walker and Dubitsky (1994) found small positive correlations between liking and recall and liking and pre/post persuasion, but stronger correlations between liking and attention and liking and purchase intent. They also reported high correlations between liking and diagnostic measures of entertainment value and message relevance.

Youn et al. (2001) found that the degree of correlation between recall and liking varied significantly as a function of the product category being advertised. Finally, in a sample of 60 commercials taken from the Unilever home and personal care database, Kastenholz and Young (2003) found a strong negative correlation between liking and related recall, but also a strong positive correlation with attention and purchase intent.

Ad liking has also been measured on a continuous basis. For example, using a self-response meter with a five-point liking scale, Spaeth et al. (1990) found a significant relationship between moment-by-moment ad liking and sales for five direct response television commercials. Using the same system, Polfuss and Hess (1991) found that “hot spots” in the liking trace curve, or the few seconds of an ad when key selling points are delivered, were the most predictive of sales for ten direct response commercials.

Taken together, these findings demonstrate that emotion – narrowly defined here either as a static measure of general affect or a moment-by-moment affect measure – has a role to play both in terms of short-term sales effects and long-term brand-building effects.

From Simple Ad Liking to Complex Dynamic Emotionality

The theory behind our new approach to emotional measurement builds on a great body of work by experimental psychologists (see, for example, Hoffman, 1998; Anderson, 2000; Leahy and Harris, 2001) who have demonstrated that cognitive and perceptual processes are active, selective, and reconstructive, and are affected as much by the consumer’s knowledge of the world as by her past and current experience. In short, this relatively new paradigm contrasts dramatically with the received view of the consumer as a passive device, such as a camera, that records everything put in front of it. Accordingly, our focus is on the journey consumers take in creating their understanding of a commercial. On a macro level, our new approach helps us to identify the perceived dramatic structure of the commercial. On a micro level, this approach pinpoints the sometimes fleeting moments of the commercial that either resonate emotionally with the consumer or, alternatively, alienate her.

More practically, this new assessment of emotionality (i.e., Ameritest’s Flow of Emotion®) is an extension of a picture- or card-sorting technique developed to assess attention (i.e., Ameritest’s Flow of Attention®). Specifically, this general sorting technique – the basis for both types of measures – works by creating a visual vocabulary that allows us to probe respondent reactions to a commercial without resorting to words or verbal language.

Importantly, from the standpoint of creating a fair test of the visual advertising experience, this sorting deck comprises the most natural and culture-free language we can think of for describing a viewer’s experience of an ad, because the images are derived from the symbol system of the commercial itself (see, for example, Young, 2003). Please also note that this symbol system is a more direct representation of the commercial experience and complements the linguistic system (e.g., verbal descriptions of what was said and shown) that would be typically used in advertising research. Finally, this technique is minimally invasive because it occurs after forced exposure to the commercial.

Theoretical Differences in Moment-by-Moment Approaches

There are a number of important differences between our picture sort approach to measuring a consumer’s moment-by-moment experience of a television commercial and other widely-used response meter approaches.

First of all, the picture sort approach, as used in our standard pre-testing system, is multi-dimensional. As we will see below, a minimum of two dimensions is needed to understand the performance of commercials as a whole on criteria such as attention-getting power, memory, and persuasion. Response meter measures tend to be used in a one-dimensional way, collecting moment-by-moment ratings on liking, for example, which may limit their ability to explain the sometimes conflicting signals given by multi-dimensional report card measures. When interview time permits, we have found that collecting picture sort information on a third or even a fourth dimension can be quite useful diagnostically.

Continuing in this deconstructionist vein, the most obvious characteristic of our picture sort approach is that it separates the visual channel from the verbal. It should be noted that, in parallel with picture sort data, key copy point sorts are also collected in the pretest in order to gain a more complete picture of the advertising experience. The ability to separate the audience response into two sensory channels allows us to diagnostically understand interaction effects between copy and visuals, or music and visuals, in creating the total advertising experience.

Another difference worth pointing out is that picture sort data are collected after the advertisement has been experienced and not while the respondent is actually watching the commercial, which is the case with most metered moment-by-moment approaches. As a result, our technique is unobtrusive and does not require a respondent to be continuously introspective and self-aware while watching a commercial. Not only does this phenomenon of watching yourself watching the commercial contaminate the original advertising experience, it also leads to calibration errors generated by reaction time effects.

Another disadvantage of metered moment-by-moment approaches is response momentum. This is the tendency of a respondent to maintain a given rating until the respondent becomes aware of what they judge to be a significant change in their attitude toward the execution. This type of response artifact seems to smooth out or simplify the moment-by-moment profile of a commercial. In contrast, when the two approaches are compared side by side for the same ad, picture sort data appear to be “finer grained” than metered moment-by-moment responses. As a result, picture sort data, which are collected in discrete units as a function of the rate of visual information flow in the ad, and not as a mechanical function of time, can produce highly detailed insights into commercials with a complex, high-speed editing structure. In a sense, therefore, you might say that metered moment-by-moment approaches produce “analog” data on consumer response to commercials over time, while picture sort data can be thought of as “digital”.

What follows is the first published investigation of this new emotional measurement approach using a substantial sample of television commercials. Where appropriate, relationships between this measure of emotion and other constructs (e.g., attention, branding, recall, and motivation) are also highlighted.

General Methodology

The sample used in this study consists of 120 commercials taken from the Ameritest® database. Each commercial was tested among a mall-recruited sample of 125 to 150 respondents, for which demographics were balanced to census. Consequently, these findings are based on at least 15,000 consumer interviews (120 commercials at 125 or more respondents each).

All of the commercials were tested in the past two years and represent a wide range of package goods categories and brands, from beauty to food to household products, for both established brands and new products. Nearly two-thirds of the commercials were in finished form; the remainder were animatics. The commercials are the work of several major advertisers and more than two dozen large agencies. The sample also includes commercials that were tested for competitive benchmarking purposes as well as our clients’ own work. Therefore, we feel this sample reflects a gamut of creative styles and philosophies.

At the beginning of a 25-minute computer-assisted personal interview, the respondent first sees a clutter reel comprising the test commercial and several control commercials from non-competing categories, after which the breakthrough measure, the attention score, is collected. Next, the test commercial is shown again by itself and measures of motivation, communication, execution perception, and brand perception are collected. Finally, as detailed below, a minimum of two picture sorts, one each for the flow measures of attention and emotion, are collected at the end of the interview.

We use the same stimuli for both flow measures of attention and emotion: a deck of photographic images created by grabbing key frames from the commercial that represent the visual content of the ad. The number of visuals in a sorting deck is a function of the visual complexity of the commercial, and is not a mechanical function of time: basically, two photographic images are both included in a deck if we think a normal person could tell them apart. The typical sorting deck contains twenty to thirty images for a 30-second ad.

To obtain flow of attention information, the respondent first sorts each randomly presented image from the deck into two piles – those she remembers or those she does not remember seeing in the ad. We then calculate the percent of respondents who remember each image over the course of the commercial. This graphic representation of memory over time is what we call the Flow of Attention®. Its shape is usually a highly rhythmic pattern, suggestive of how the audience is actively searching through, virtually consuming, the visual information in the commercial. As we will see below, the peak moments defined by this wave-like structure provide important diagnostic insights into why a commercial is working well or not well on certain key report card measures.
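The calculation just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the Ameritest® implementation: the frame labels, the respondent data, and the function names are hypothetical, and the rule used to flag a peak moment (a frame recalled by more respondents than either of its neighbors) is our own simplifying assumption, since no formal definition of a peak is given here.

```python
from typing import Dict, List


def flow_of_attention(sorts: List[Dict[str, bool]], frame_order: List[str]) -> List[float]:
    """Percent of respondents who remember each frame, in commercial order.

    Images are shown to respondents in random order, so the results are
    re-ordered here into the sequence in which frames appear in the ad.
    """
    n = len(sorts)
    return [100.0 * sum(1 for r in sorts if r[frame]) / n for frame in frame_order]


def peak_moments(curve: List[float]) -> List[int]:
    """Indices of interior local maxima (an assumed definition of a peak)."""
    return [
        i for i in range(1, len(curve) - 1)
        if curve[i] > curve[i - 1] and curve[i] > curve[i + 1]
    ]


# Hypothetical sort results: one dict per respondent, True = "remembered".
frame_order = ["f1", "f2", "f3", "f4", "f5"]
sorts = [
    {"f1": True, "f2": True, "f3": False, "f4": True, "f5": True},
    {"f1": False, "f2": True, "f3": False, "f4": True, "f5": False},
    {"f1": True, "f2": True, "f3": True, "f4": True, "f5": False},
    {"f1": False, "f2": True, "f3": False, "f4": False, "f5": True},
]

curve = flow_of_attention(sorts, frame_order)
peaks = peak_moments(curve)
print(curve)  # [50.0, 100.0, 25.0, 75.0, 50.0]
print(peaks)  # [1, 3]
```

In this toy example the second and fourth frames stand out as peaks. With real samples of 125 to 150 respondents, a stricter peak rule that guards against sampling noise would probably be needed.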

Next, to obtain flow of emotion information, the respondent rates each randomly presented image from the deck on how she was feeling when she first watched it. The construct employed here is to model “emotion” in dynamic terms, as a “fluid” which is pumped through an ad—that is, the more emotionally engaging a commercial is visually, the more emotion is pumped in. To slightly expand the metaphor, emotion is thought of as coming in two types, positive and negative, so that dynamic tension between the two can be analyzed to understand the dramatic structure of a particular commercial. Specifically, the respondent uses a five-point scale ranging from very strong positive feelings (5) to very strong negative feelings (1). We create a graphic representation – what we call the Flow of Emotion® – by superimposing two curves: one represents the percent of respondents who choose the top two boxes (i.e., positive scale points) for each image over the course of the commercial and the other represents the percent choosing the bottom two boxes (i.e., negative scale points) for each image.
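As a minimal sketch of how the two emotion curves could be tabulated, assuming each respondent's ratings are stored as a mapping from frame label to a score on the five-point scale, consider the following Python. The data and names are hypothetical; only the top-two-box and bottom-two-box rule comes from the description above.

```python
from typing import Dict, List, Tuple


def flow_of_emotion(
    ratings: List[Dict[str, int]], frame_order: List[str]
) -> Tuple[List[float], List[float]]:
    """Return (positive, negative) curves in commercial order.

    positive[i]: percent of respondents rating frame i as 4 or 5 (top two boxes)
    negative[i]: percent of respondents rating frame i as 1 or 2 (bottom two boxes)
    """
    positive: List[float] = []
    negative: List[float] = []
    for frame in frame_order:
        scores = [r[frame] for r in ratings]
        n = len(scores)
        positive.append(100.0 * sum(1 for s in scores if s >= 4) / n)
        negative.append(100.0 * sum(1 for s in scores if s <= 2) / n)
    return positive, negative


# Hypothetical ratings: one dict per respondent, 5 = very strong positive,
# 1 = very strong negative.
frame_order = ["f1", "f2", "f3"]
ratings = [
    {"f1": 5, "f2": 3, "f3": 1},
    {"f1": 4, "f2": 2, "f3": 2},
    {"f1": 3, "f2": 4, "f3": 5},
    {"f1": 5, "f2": 3, "f3": 1},
]

positive, negative = flow_of_emotion(ratings, frame_order)
print(positive)  # [75.0, 25.0, 25.0]
print(negative)  # [0.0, 25.0, 75.0]
```

Note that the two curves need not sum to 100 percent for a given frame, because respondents who choose the middle scale point are counted in neither.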

Key Findings

Independence of Attention and Motivation

First of all, it is important to establish that the report card measures provided by the Ameritest® system are essentially unrelated variables. Specifically, we found that the correlation between Attention and Motivation was .14. In other words, according to these measurement constructs, knowing that a commercial has the power to break through clutter and get noticed tells us nothing about whether or not the commercial has the power to actually sell something. Conversely, knowing that the commercial has the power to motivate purchase intent tells us nothing about whether or not the commercial will get noticed when you put it on air. Both measures are needed to make a statement about the potential effectiveness of a particular commercial execution. In other words, the measures complement each other.
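For readers who want to run the same kind of check on their own data, the sketch below computes a correlation across commercials between two report card scores. We assume an ordinary Pearson product-moment coefficient, which the text does not specify, and the scores shown are invented for illustration rather than drawn from the Ameritest® database.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores, one pair per commercial.
attention = [52.0, 61.0, 47.0, 70.0, 58.0, 44.0]
motivation = [33.0, 29.0, 41.0, 35.0, 30.0, 38.0]

r = pearson_r(attention, motivation)
print(round(r, 2))
```

A value near zero, like the .14 reported above, indicates that ranking commercials on one measure says little about how they rank on the other.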

[Table 1]

Similarly, we notice in table 1 that the two attention and emotion flow measures are also essentially unrelated variables. The correlations of the average level of recognition in the attention flow to the levels of positive and negative emotion flows are close to zero. There is a small, but statistically significant negative correlation between the number of peak moments in attention flow and the average level of positive emotional flow — though the correlation to negative emotional flow is again negligible. Since peak moments in the attention flow can contain two types of information (either rational, product-related information or emotional, aesthetic content), the overall balance of the rational and emotional information contained in the commercials in our sample may be a factor causing this small negative effect. But, in general, just like report card measures designed to capture overall attention-getting power and overall response to the commercial, the two flow measures are tapping into different dimensions of viewer response to the ads on a moment-by-moment basis.

Interestingly, there is a relationship between the visual complexity of commercials and the two flow measures. The visual complexity of commercials can be determined by simply looking at the number of frames in the picture sorting deck that are needed to describe the ad. With this technique, frames are grabbed as a function of the rate of information flow and not as a function of time. The negative correlation between the average level of recognition in the attention flow with the number of frames in the deck suggests that commercials that are more complex generate a lower average recognition level, as viewers are challenged to process more visual information in the same amount of time. Moreover, the negative correlation with the average level of positive flow of emotion suggests that, on average, viewers respond more strongly on an emotional level to simpler rather than visually complex commercials.

Factors Explaining Attention

[Table 2]

In general, the attention-getting power of a commercial is a function of two factors: the content and the form of the execution. Attention-getting content provides the viewer with a reward for the thirty seconds of time that the advertiser is asking the consumer to spend with the advertising. This reward can be content that is fun or entertaining or that is unusual and different. This can be seen in table 2 in the execution ratings that are most highly correlated with Attention. Note that from the standpoint of attention-getting power the message does not have to be important, nor does a certain level of confusion form a barrier to the attention-getting power of the ad. These findings probably do not contradict the mental model that many advertising creatives have of how advertising works.

The “form” of the execution, which we think of in terms of cognitive processing or how viewer attention is structured and focused by the film syntax of the ad, is captured by our non-verbal attention flow measure. The number of peak moments, which is indicative of clear narrative structure or simply good storytelling, is significantly correlated with the attention score. Note, however, that it is aesthetic content, not product information contained in a peak moment, which drives overall viewer attention.

Factors Explaining Purchase Intent

The factors that explain a good purchase intent score are, in general, quite different from those that explain attention. Purchase intent is primarily a function of communicating a message which is important to the consumer, without confusion, and particularly in a situation to which the consumer can relate. The uniqueness of the execution is not a factor in driving purchase intent. Significantly, purchase intent is also strongly correlated to the amount of emotion being generated by the execution, as shown by the emotion flow measure.

Interestingly, the entertainment value of the execution is correlated with purchase intent, though not as strongly as it is to attention-getting power. Music, which contributes a strong emotional component to commercials, is at least as strongly correlated with purchase intent as it is with attention. Emotion and the entertainment value of an execution may be related, but as we will see below, they are not the same thing.

As noted above, visual simplicity — i.e., fewer frames in the commercial description — is linked to the amount of emotion flowing through an ad, and it is also linked, possibly for that reason, to purchase intent.

The total amount of information processed by the viewer, as shown by the attention flow measures, is not correlated with purchase intent. In other words, communicating the right message — even if it’s only a single message — is more important to making a sale than delivering a lot of information.

Putting these rational and emotional components together, therefore, we conclude that purchase intent is a function of communicating a simple, relevant message in a dramatic way.

Factors Related to Liking

Given the importance of liking in the literature, it should be noted that liking is significantly correlated with both the attention-getting power of an execution and its power to drive purchase intent. This is certainly consistent with the ARF finding linking liking and sales effectiveness. Interestingly, liking is more strongly correlated with purchase intent than with attention. As we will see next, this is because commercial liking is related more to whether an ad generates relevant thoughts and emotions in the consumer than it is to its entertainment value.

Factors Related to Emotion

[Table 3]

Liking is, of course, linked to the amount of emotion flowing through a commercial: the higher the average level of positive emotion, or the lower the level of negative emotions, the better liked an execution is (see table 3).

Interestingly, while there is a statistically significant association, the correlation between entertainment value and the average flow of positive emotion is surprisingly low. The correlation with commercial uniqueness is related more to the negative emotions generated, which may be a result of “edgy”, creative executions pushing the envelope and crossing the line for some members of the target audience.

Stronger correlations to the emotional flows can be found in the items pertaining to the relevance of the message and the situation being depicted in the ad.

A significant barrier to the flow of emotion through an ad is confusion. In particular, executions in which the role of the product in the narrative is unclear, or in which the audience does not understand the purpose of the visuals, are strongly associated with negative emotions.

The flow of emotion through a television commercial, therefore, appears to be strongly linked to the flow of meaning through the ad. This suggests another analogy. In physics, the distinction is made between the total energy contained in a system and the amount of energy available to do useful work, or working energy. Similarly, we can think of the total emotion being generated by the system of symbolic content in a television commercial as being different from the “working emotion” being generated by an ad, which is that portion that attaches to the brand and helps drive the sale. The left-over part of emotion is simply the “borrowed interest” of an execution that entertains or dazzles but does not sell. With this construct in mind, it appears that the measure of the flow of emotion reported here taps more into the working emotion side of the advertising.

Consistent with the findings reported above, the attention-getting power of a commercial can be thought of in terms of the form and content of the execution. Entertaining or unusual content, which rewards the viewer for the time they spend watching the commercial, is one success factor. The other success factor is whether or not you have a well-formed execution in terms of film grammar or syntax, which makes it easy to cognitively process the information in the commercial. The first factor can be measured verbally, with traditional pre-testing measures, while the second factor necessarily requires a non-verbal form of measurement.

On the other hand, the motivational impact or persuasiveness of a commercial is again a function of two factors. Communicating a relevant idea on a rational or semantic level is one driver of purchase intent. But, as Hollywood has taught us, film has the power to make an idea seem larger than life. The working emotion generated dynamically by moving visual images can magnify the impact of the rational communication on sales. Again, the first factor can be assessed with verbal measures, whereas the second requires a non-verbal approach.

Conclusion

In this paper we have described a new but simple approach to measuring the emotional content of television commercials and have related it to more traditional pre-testing measures. The role of emotion in driving commercial liking, attention-getting power, and purchase intent is a complex subject, and much work remains to be done in this area. Ultimately, the effectiveness of advertising must be thought of in terms of the experience it creates for the viewer, and emotion has an inescapable role to play in that experience. We must remind ourselves that human experience is larger than language — hence the need for developing non-verbal measures for describing that experience.

Moreover, by describing emotion as something that flows through an ad, rather than in static terms as a property that is or is not present in an ad, we are proposing a construct that we hope may lead to new insights into the problem of measurement and modeling the role of emotion in advertising effectiveness. As others before us have pointed out, emotion should be thought of in terms of dynamic imagery. Socrates gave us the image of the wild horse. Measuring its movement and harnessing its power is left to us.

References

Alwitt, L.F., “Components of the Likeability of Advertising”, Presented to Stellner Symposium on Uses of Cognitive Psychology in Advertising and Marketing, University of Illinois, May 1987.

Anderson, J. R., Cognitive Psychology and its Implications. Worth Publishers, New York, 2000.

Haley, R. and A.L. Baldinger, “The ARF Copy Research Validity Project”, Journal of Advertising Research, 31.2 (1991): 11-32.

Hoffman, D., Visual Intelligence: How We Create What We See, Norton & Co., New York, 1998.

Kastenholz, J. and C. Young, “How Recall Misses the Emotion in Advertising that Builds Brands,” Transcript Proceedings: Advertising Research Foundation, New York, 2003.

Leahy, T. H., and R. J. Harris, Learning and Cognition, Prentice Hall, New Jersey, 2001.

Lodish, L., M. Abraham, S. Kalmenson, J. Livelsberger, B. Lubetkin, B. Richardson, and M. Stevens, “How TV Advertising Works: A Meta-Analysis of 389 Real World Split Cable TV Advertising Experiments,” Journal of Marketing Research, Vol. XXXII (May 1995), 125-139.

Madden, T.J., C.T. Allen, and J.L. Twibble, “Attitude Toward the Ad: an assessment of different measurement indices under different processing sets”, Journal of Marketing Research, 25.3 (1988): 242-52.

Plato, Phaedrus, The Great Books, Encyclopedia Britannica, Vol. 7.

Polfuss, M. and M. Hess, “Liking Through Moment to Moment Evaluation: Identifying Key Selling Segments in Advertising,” Advances in Consumer Research, Vol. 18, 1991: 540-544.

Spaeth, J., M. Hess, and S. Tang, “The Anatomy of Liking”, in Transcript Proceedings: Seventh Annual Advertising Research Foundation Copy Research Workshop. New York: Advertising Research Foundation, 1990.

Walker, D., “Mood and Emotion in Television Advertising”, Presented to Fifth Annual Advertising Research Foundation Copy Research Workshop. New York: Advertising Research Foundation, 1988.

Walker, D., and T. Dubitsky, “Why Liking Matters”, Journal of Advertising Research, May/June 1994.

Youn, Seounmi, T. Soun, W. Wells, and X. Zhao, “Commercial Liking and Memory: Moderating Effects of Product Categories”, Journal of Advertising Research, May/June 2001, Vol. 41, No. 3.

Young, Charles E., “Brain Waves, Picture Sorts® and Branding Moments,” Journal of Advertising Research, July/August 2002, Vol. 42, No. 4, 42-55.

_______, “Researcher as Teacher”, Quirks Marketing Research Review, March 2001, 21-27.

_______, “The Visual Language of Global Advertising”, Admap, April 2003, 37-39.

_______ and M. Robinson, “Video Rhythms and Recall”, Journal of Advertising Research, June/July 1989, Vol. 29, No. 3.

____________________, “Visual Connectedness and Persuasion,” Journal of Advertising Research, March/April 1992, Vol. 32, No. 2, 51-59.

____________________, “The Visual Experience of New and Established Product Commercials,” Advances in Consumer Research, Vol. 18, 1991: 541-549.

Zielski, Hubert, “Does Day After Recall Penalize Emotional Advertising”, Journal of Advertising Research, Feb/Mar 1982, Vol. 22, No. 1.