Determining Price: The van Westendorp Price Sensitivity Meter

Determining the best price for a product or service is a common marketing research question.  I usually start my conversation with a client asking whether their product has all of its features set or if they also need to test a range of features other than price.  If they are testing variable features in addition to price, we start to talk about conjoint (see here for a video on the current state of affairs in conjoint).  However, if they tell me that their product features are set and they just want to look at price, one of the things we’ll likely discuss is the van Westendorp Price Sensitivity Meter (let’s just call it VW).

I was recently corresponding with a colleague (Dave Lyon of Aurora Market Modeling) and the discussions led me to look back at the original VW paper (Peter H. van Westendorp (1976), “NSS – Price Sensitivity Meter (PSM) – A New Approach to Consumer Perception of Prices,” in Venice Congress Main Sessions. Amsterdam: European Marketing Research Society (ESOMAR), 139-167.)  In my conversations with Dave, one of the issues that arose was the way many modern researchers calculate the point of marginal cheapness.  Are most researchers incorrectly calculating VW’s outputs?  What might van Westendorp himself say about this?  How about a little background before going into this point?

Highlights from the Sawtooth Software Conference 2010 – Day 4

Day 4 is a half-day, but it had some very serious papers and great discussion – the marketing science community is alive and well here!

Modeling Demand Using Simple Methods: Joint Discrete/Continuous Modeling (Tom Eagle, Eagle Analytics of California)

  • Discussed 4 approaches to volumetric modeling (finding out not only “what”, but “how many” of something respondents prefer).
  • Examined Regression Models, Choice Models, Economic Models, and Joint Discrete/Continuous (D/C) Models, with a focus on the latter.
  • D/C Models estimate in two stages: first fit an allocation Multinomial Logit model, then fit a general linear volume model using predictions from stage 1 as independent variables.
  • Compared all methods on 4 relatively simple datasets.
  • Summary of comparisons: Choice Modeling performs as well as or better than the joint D/C models.
  • Conclusions: Joint D/C volume modeling is a valid approach to modeling complex volume models.  Because all these techniques occur on the back-end, you can design your study the same way and try out different approaches once data is collected.

Recent Developments in PLS Modeling: An Application for Customer Loyalty and Retention (Stuart Drucker, Drucker Analytics)

  • This talk was about predictive analytics, specifically key driver analysis
  • The way NOT to solve this problem is using stepwise regression (TAKE NOTE: lots of people still do this!).
  • Discussed Partial Least Squares (PLS) Regression and PLS Structural Equation Modeling (PLS-SEM).  In these approaches, factor analysis is woven together with regression analysis into a unified (more confirmatory in the case of PLS-SEM) framework.  Better manages common issues including multicollinearity.
  • Conclusion: The decision of which model to accept is a philosophical one, depending on whether the ultimate focus is estimating a system of effects (including Customer Satisfaction and the Key Drivers) – use PLS-SEM, or if the focus is maximizing the explained variation of Customer Retention – use PLS.

A Head-to-Head Comparison of the Traditional (Top Down) Approach to Choice Modeling with a Proposed Bottom Up Approach (Don Marshall, TVG, Siu-Shing Chan, Univ. of Pennsylvania, and Joseph Curry, Sawtooth Technologies)

  • Huge effort involving lots of people to set up this experiment.
  • Based on Jordan Louviere’s recent assertion (2009) that when using HB to measure preferences we are just capturing respondent inconsistency, and that we need to stop modeling the way most people currently do.
  • Compared current gold standard (Hierarchical Bayes (HB)) where individual preferences are influenced by group averages (called the “Top-Down” approach), to the “Bottom-Up” approach that examines individual preferences independent of group averages.
  • In Top-Down, respondents see different choice sets and choose the best, with a dual-response none follow-up, and are shown fewer screens.
  • In Bottom-Up, respondents all see the same choice sets, they choose their top choice, last choice, and whether all/some/none are acceptable, and are shown more screens.
  • Conclusions: while the design and analysis criteria for bottom-up continue to evolve and improve, this analysis provides no compelling reason to recommend bottom-up over top-down at this point.  Interview length and completion rates favor top-down.

HB-CBC, HB-Best-Worst or No HB at All? (Ralph Wirth, GfK Group) – Note: this paper won “best paper” for the conference.

  • Concerns have been raised regarding CBC-HB, and this paper used Monte Carlo Simulations and 4 real-world datasets to find out if the concerns were justified.
  • The Best-Worst idea in CBC is gaining interest – the idea is to show profiles and have respondents choose not only their most, but also their least preferred option.  Compared this approach to standard CBC-HB and also a Louviere approach which asked for most preferred, least preferred, and an in-between choice, and did NOT use HB for analysis.
  • Conclusions: no model was consistently superior based on fit.
    • The Louviere approach is worth considering when data conditions are good and/or the focus is on share prediction rather than prediction of individual choices.  Purely individual estimation makes it much simpler than HB approaches but seems detrimental when data conditions are sparse.
    • The HB approach has good overall performance also under sparse data conditions, there is no negative influence of individual-specific error variances, and the results suggest that the use of additional preference information from worst-choice leads to better estimations, as Best-Worst CBC-HB is consistently superior to standard CBC-HB.
    • Discussion: lots of opinions here, but some main takeaways are that:
      • HB will not fall out of use anytime soon, as it appears to perform well under a number of situations.
      • We need to model (simulate) against real-world outcomes whenever possible.
      • Best-Worst CBC is emerging as something to keep exploring; however, there is potential to get into a lot of trouble during the analysis, and there is no commercial tool that provides a solution (other than doing completely custom-programmed analytics).

Conference Conclusions

  • If you are involved in doing conjoint analysis, or other varieties of research that seek to understand preferences and choice behavior, this conference is a must attend.  The top minds in the field are all here and they are pushing the boundaries to achieve better measurement of choice behavior.  The conference occurs every 18 months, and information can be found at www.sawtoothsoftware.com.
  • Note that Jordan Louviere, a key discussant at the conference, was the recipient of the AMA 2010 Parlin Marketing Research Award.  See his interview in the 9/30/10 edition of Marketing News.
  • Plus, the food was really good :)

Highlights from the Sawtooth Software Conference 2010 – Day 3

For the morning sessions, the main conference was joined by the attendees of the conjoint analysis in healthcare conference that is running here in parallel.

The Value of Conjoint Analysis in Healthcare for the Individual Patient (Liana Fraenkel, Yale University of Medicine, VA Connecticut Healthcare System)

  • Looked at conjoint as a way to elicit patient preferences in low certainty situations.
  • Research shows that eliciting patient preferences can have some positive effects.
  • Conjoint analysis is a natural fit, given the trade-off approach.
  • Used Adaptive Conjoint Analysis to provide individual-level interviews.
  • Also looked at using MaxDiff, although it appeared that some respondents had difficulty with the best-worst trade-off, therefore tried a best-only approach.
  • Pros = works at the individual patient level, can handle lots of info, can provide immediate feedback, trade-offs are like real life, discourages rating all features equally
  • Challenges = hard to get independent attributes, hard to specify levels of attributes, have ranges of levels, have dominant attributes, can be difficult for respondents.
  • Reality = patients so rarely are given choices that they don’t know how to react, MD buy-in, discordance between patient preferences and what the MD thinks should happen.

Tailoring Treatment Based on Preference Values (Marsha Wittink, Univ. of Rochester School of Med, Univ. Of Pennsylvania School of Med)

  • Goal was to use conjoint to identify which attributes of treatments are most important to help design better interventions.  Focus here was on patients with depression.
  • Used Hierarchical Bayes to calculate individual level utilities, and also used latent profile analysis to look for unique groups.
  • Did find unique groups preferring different treatment modalities.
  • Plan to use data to assess whether these preferences are better predictors of treatment uptake than other demographics.  Could also use to tailor treatments.

Conjoint Design Effect on Respondent Engagement (Paul Johnson, Western Wats)

  • Looked at CBC tasks with 20 vs 30 cards, and also at Adaptive CBC – looked at placement of conjoint task before or after other survey questions.
  • No differences in the way respondents answered other questions based on type or order of conjoint.
  • Purchase intent higher with ACBC task – might put respondent more in the mood to buy.
  • Respondents did NOT speed through the rest of the survey after doing the conjoint first.
  • Time spent on conjoint task longer for ACBC.
  • Few to no differences in measures of consistency, hit rates, or error levels in model.
  • ACBC did best job of predicting a winning holdout concept.
  • Other benefits of ACBC: get the explicit non-compensatory rules used by respondents, get more stable model estimates with smaller N’s.

Sales Promotions in Conjoint Analysis (Marco Hoogerbrugge & Eline van der Gaast, SKIM Analytical)

  • Looked at the best ways to present price promotions for testing in conjoint analysis.
  • Ideas of ways to display:
    • Original (gross) price + % or $ off.
      • When modeling, need to assign levels to final price.
    • Original (gross) price NOT shown; promotion (net promoted price) is the only thing shown.
    • Original (gross) price + promotion price shown and highlighted.
      • Could model just net price, or keep original price as main effect, or include interactions between original and promotion price in model.
      • Modeling would benefit from data about purchase behavior – more external data.
      • Note that you’ll need to model differently based on which approach you take.

How Many Questions Should You Ask in CBC Studies? – Revisited Again (Jane Tang & Andrew Greenville, Vision Critical)

  • Past (and more recent) research
    • Johnson & Orme (1996): ask up to 20 tasks; in later tasks brand becomes less important, price more important, and more likely “none” choice.
    • Hoogerbrugge & van der Wagt (2006): increase in hit rate after 10-15 tasks is small; complexity of study more influences hit rate.
    • Markowitz & Cohen (2001): HB hit rates not greatly enhanced by increasing sample size; more choice sets better than more sample.
    • Suresh & Conklin (2010): complex survey design leads to lower respondent engagement; more complex attributes leads to choosing “none” more often and more price reversals.
    • Hauser, Gaskin & Ding (2009): Non-compensatory rules used when more time pressure, more products, and more familiarity with the category.
    • Current study looked at 3 conjoint tasks: 6 cards with 3 options, 15 cards with 3 options, and 15 cards with 5 options.
    • Conclusions:
      • Increasing the # of tasks gave limited improvement in prediction ability, but at cost of slight deterioration in sensitivity and consistency.
      • Simplifying behavior occurs in later tasks – more likely so when respondent familiar with the category.
      • Increasing complexity of task (showing 5 vs 3 options) doesn’t help anything.
      • The balance is this: sometimes individual-level models don’t converge with a small # of tasks, so you need to ask more tasks to get precision; but as you do so, reliability goes down.

The Strategic Importance of Accuracy in Conjoint Design (Matthew Selove, USC, John Hauser, MIT Sloan)

  • Looked at what happens when we have “noise” in a sample versus “heterogeneity”
  • Noise leads to less differentiation in product decisions, whereas heterogeneity encourages differentiation.
  • Compared a “well” and a “poorly” designed conjoint study.
  • Poor designs lead to more noise, which leads to inconsistency in the conjoint task and less ability to validate to hold-outs.
  • Need to accurately estimate randomness.  If you have no hold-outs to validate against, Louviere recommends tuning model to an exponent of 0.4.
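
The tuning step in the last bullet can be illustrated with a share-of-preference rule in which utilities are multiplied by an exponent (scale factor) before the logit transform. This is only a minimal sketch: the utilities are made up, and `logit_shares` is a hypothetical helper rather than a function from any conjoint package. Only the 0.4 value comes from the talk.

```python
import math

def logit_shares(utilities, exponent=1.0):
    """Share of preference: exp(exponent * u), normalised across products."""
    scaled = [math.exp(exponent * u) for u in utilities]
    total = sum(scaled)
    return [s / total for s in scaled]

# Hypothetical total utilities for three products (illustrative values only).
utilities = [1.5, 0.5, -0.2]

sharp = logit_shares(utilities, exponent=1.0)
flat = logit_shares(utilities, exponent=0.4)   # the exponent cited in the talk

for name, shares in (("exponent 1.0", sharp), ("exponent 0.4", flat)):
    print(name, [round(s, 3) for s in shares])
```

A smaller exponent treats the utilities as noisier, so the predicted shares move closer together while the rank order of products stays the same.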

Product Portfolio Evaluation Using Choice Modeling and Genetic Algorithms (Chris Chapman, PhD & James Alford, PhD, Microsoft)

  • With conjoint data, we know how to optimize a product, but what about a product line?
  • Took CBC and ACBC data, derived individual-level partworths using HB model, iterated using a genetic algorithm to fit many portfolio preference models, and inspected.
  • Took 1080 possible products (based on 9 attributes with 2-7 levels) and found that after 6-8 products in the portfolio there was no more increase in share of preference.
  • Also asked which products appeared in a large number of winning portfolios – found a couple “new” products this way.
  • Other findings: CBC had much noisier price data than did ACBC; ACBC has more stakeholder face validity, smaller sample sizes needed, and better respondent engagement.
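
The portfolio search described above might look roughly like the toy below. It is a sketch under loud assumptions: the partworths are random stand-ins for HB estimates, the "none" threshold, attribute structure, population size, and mutation rate are all invented, and the first-choice share rule is one of several that could be used. It is not the authors' implementation.

```python
import random

rng = random.Random(7)
LEVELS = [3, 2, 4]             # hypothetical: 3 attributes with 3, 2 and 4 levels
N_RESP, PORTFOLIO_SIZE = 200, 3
NONE_UTILITY = 1.0             # hypothetical utility of buying nothing

# Stand-in for HB partworths: one random value per respondent, attribute, level.
partworths = [[[rng.gauss(0, 1) for _ in range(n)] for n in LEVELS]
              for _ in range(N_RESP)]

def utility(resp, product):
    return sum(resp[a][lvl] for a, lvl in enumerate(product))

def share(portfolio):
    """First-choice share: respondents whose best portfolio product beats 'none'."""
    buyers = sum(1 for r in partworths
                 if max(utility(r, p) for p in portfolio) > NONE_UTILITY)
    return buyers / N_RESP

def random_product():
    return tuple(rng.randrange(n) for n in LEVELS)

def crossover(a, b):
    # Child takes each product slot from one parent or the other.
    return [rng.choice(pair) for pair in zip(a, b)]

def mutate(portfolio, rate=0.2):
    # Occasionally swap a product for a fresh random one (returns a new list).
    return [random_product() if rng.random() < rate else p for p in portfolio]

population = [[random_product() for _ in range(PORTFOLIO_SIZE)] for _ in range(30)]
initial_best = max(share(p) for p in population)

for _ in range(40):
    population.sort(key=share, reverse=True)
    elite = population[:10]    # elitism: the top portfolios survive unchanged
    children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                for _ in range(len(population) - len(elite))]
    population = elite + children

best = max(population, key=share)
print("initial best share:", initial_best)
print("final best share:", share(best), "portfolio:", best)
```

Because the elite portfolios are carried over unchanged, the best share can never get worse from one generation to the next. Rerunning with larger values of `PORTFOLIO_SIZE` is one way to look for the kind of plateau the talk reported.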

The Impact of Covariates on HB Estimates (Keith Sentis & Valerie Geller, Pathfinder Strategies)

  • Note: this presentation and the next were probably the most controversial and shocking to most of the conference participants.
  • Method: estimate HB partworths with and without a covariate, compare the quality of the two sets of partworths; do this for several different covariates, one at a time.  Looked at 3 classes of covariates: demographic variables, category behavior variables, and attitudinal variables.  Measures of fit and predictive accuracy included: RLH, Hit Rate, Holdout Likelihood, and MAE (error).  Measures of partworth variability included: importance spread and standard deviation ratio.  Ran across 5 datasets.
  • Conclusion: NO lift in predictive accuracy by using covariates; did see some increase in partworth variability.
  • Discussion: should covariates be in our tool kits?  This paper says NO, but why might we want to include them?  Carefully chosen covariates can provide insights by subgroup.

Added Value through Covariates in HB Modeling (Peter Kurz, TNS Infratest Forschung GmbH, and Stefan Binner, bms marketing research + strategy)

  • Since HB assumes one single multivariate normal population, can get “shrinkage” of respondents toward the population mean, and therefore segment differences could be reduced.  So, how about adding covariates into the upper level model?
  • Method: looked at 10 commercial studies, 30,000+ conjoint interviews, B2B, B2C, worldwide; looked at natural (demographic), segmentation membership, and intention or past behavior data as covariates.
  • Conclusion: they ran all kinds of models, from standard HB to HB with covariates to Latent Class, etc., and HB with covariates did better only on 2 studies.
  • Recommendations: ensure sufficient sample size, use standard HB, use hold out tasks.
  • Discussion: If you have clearly defined clusters, covariates could help; more heterogeneity can help with market share projections, line extension and optimization decisions, and estimates of willingness to pay, and it helps with IIA (red bus-blue bus problem).

Regarding these last two papers, everyone really wanted to believe that covariates could be helpful, but the evidence argued against improved predictive accuracy by using covariates.

Highlights from the Sawtooth Software Conference 2010 – Day 2

The sun came out in Newport Beach on Wednesday, but with 10 presentations there was no time to play. Here are some highlights on each of the talks.

What Drives Me? Developing a Conjoint-Based Recommendation Engine for Individual Vehicle Consideration (Ely Dahan, UCLA, Claremont, Princeton)

  • Individual-level conjoint engages the respondent in an adaptive, engaging exercise with the goal of helping them gain personal insight and get a recommendation.
  • Used the example of a car recommendation engine in development for Edmunds.com.
  • Uses pre-existing market data, incorporated with individual choices.
  • Has good promise for the area of consumer search.

Analyzing Consumers’ Screening Rules by Means of Virtual Online Shops (Soren Scholz & Reinhold Decker, Bielefeld University, and Beate Sarnowski & Marie Schuir, Interrogare)

  • A virtual online shop was developed to help overcome some of the limitations of traditional conjoint approaches when studying complex purchase decisions.
  • Their tool picks up on non-compensatory decision making behavior (note that Adaptive Choice-based Conjoint does as well).
  • Good predictive validity (hits on hold-out tasks).
  • Can be combined with conjoint techniques to improve measurement.

The Success of Choice-Based Conjoint Design Among Respondents Making Lexicographic Choices (Keith Chrzan, John Zepp, & Joseph White, Maritz Research)

  • A lexicographic choice model involves a sequential choice process – like non-compensatory decision making (setting rules or cut-offs, must-haves and unacceptables).
  • Research shows between 20% and 66% of respondents use a lexicographic choice process.
  • Be careful with minimal overlap designs – designs with overlap can be more informative.
  • Best-Worst Discrete Choice Analysis (where not only a best is chosen from a consideration set, but a worst is, as well) helps deal with lexicographic responders.
  • The idea is that if someone is going to make a cut-off type of choice, we’re going to lose information about other attributes, so we need to do things to get them to answer about secondary attributes. Note that Adaptive Choice-Based Conjoint does this well.

Menu-Based Choice Modeling Using Traditional Tools (Bryan Orme, Sawtooth Software)

  • The idea of selecting from menus in a choice exercise makes sense given all the times in the real world that respondents do so (restaurants, configure a computer, cars, telecom/internet/phone bundles, single or multiple drug therapies).
  • These designs are custom built, and currently, are custom analyzed.
  • Counts analysis won’t cut it; volumetric allocation, serial cross-effects, and exhaustive alternatives models are being used.
  • Exhaustive alternatives shows promise – has the benefit of being a single model, but can have a large number of parameters to estimate. Be careful of overfitting.
  • Aggregate Logit interestingly does as well as HB in some tests.

Analyzing Pick n’ Mix Menus via Choice Analysis to Optimize the Client Portfolio (Chris Moore & Corrine Moy, GfK NOP)

  • Moving from showing respondents a portfolio of “fixed” products for the consumer to choose from to a portfolio of features that consumers can pick and choose from to design their own product.
  • Focus on identifying a set of features that should be offered, how consumers want to customize their product, the premium (if any) consumers are willing to pay for a customized product, and how to price each feature to simultaneously increase customer value/revenue/profit.
  • Serial cross-effects model appears to be very robust with excellent holdout validation.
  • It’s important to include LOTS of sample (think N=1000).
  • Be careful about overloading respondents with too long/difficult a task.

An Empirical Test of Bundling Techniques for Choice Modeling (Jack Horne, Silvo Lenart, Paul Donagher, & Bob Rayner, Market Strategies International)

  • Some are using the term “Build Your Own” for Menu-Based Choice.
  • The “Reservation Price” is David Bakken’s idea of the highest price an individual consumer is willing to pay for a given product.
  • Key assumptions for bundling research:
    • When offered individually, consumers will buy products that are less than or equal to their reservation price and not buy products that are greater than their reservation price.
    • When responding to fixed bundles, the reservation price for the entire bundle (not the individual component products) will determine purchase.
    • This bundle reservation price may or may not be the sum of the reservation prices for the individual items making up the bundle.
    • Single Brand BYO asks respondent to build product for a single brand; Market BYO does the same for multiple brands; Fixed Bundling is like discrete choice (show products – which one prefer).
    • These methods force respondents to react to different marketplaces, and they produce different results.
      • Single Brand BYO: have fewest products chosen, often times only one.
      • Market BYO: larger predicted take rates of individual products.
      • Fixed Bundles: price sensitivity curves are flatter; larger predicted take rates of individual products.
      • Revenue and product penetration are maximized using Market BYO or Fixed Bundles.
      • Single Brand BYO and Market BYO also reach those individuals interested in single products who may not be interested in purchasing Fixed Bundles. In this way, the BYO methods can complement Fixed Bundles.
      • When there is no brand, Single Brand BYO may be a good alternative for measuring generic willingness-to-pay and take rates.

Anchoring Maximum Difference Scaling Against a Threshold: Dual Response and Direct Binary Responses (Kevin Lattery, Maritz Research)

  • Following up MaxDiff questions with additional questions to tease apart respondents who have the same item importance rankings but could have gotten there by different means.
  • Louviere’s Indirect Method asks a question after every MaxDiff screen: “Considering just the 4 features above, which of the following best describes your views about which features are very important for your ideal [product]?” Response set = All 4 of these features are very important, None of these 4 features are very important, Some are very important, some are not.
    • Awkward and additional respondent work; appears to be reintroducing some scale/disposition bias, and if results used in segmentation, could dominate the segmentation. The more items shown per task, the more likely the outcome will appear indeterminate.
    • Direct Method is asked only once at the end of the MaxDiff exercise: “Please tell us which of the features below are very important for your ideal [product]?” Response set = all MaxDiff items (or a couple pages with randomized sets if item list is very long).
      • Much quicker; respondents more critical (fewer items very important); some context dependency for which items are chosen.

Directing Product Improvements from Consumer Sensory Evaluations (Karen Buros, Radius Global Market Research)

  • Discussed the issue that consumer product evaluations on taste, smell, texture, color and other sensory perceptions lack specificity.
  • Penalty analysis is the traditional approach; however, it can yield conflicting directions for product improvement (e.g., overall flavor too strong, and overall flavor too weak).
  • Tried to understand product perceptions using minimally verbal scales (bi-polar), have respondents rate product on multiple dimensions and use Latent Class regression to derive respondent-level coefficients measuring the impact of attributes on purchase intent.
  • Issues = multicollinearity, sensory attribute interactions, lacking chemists on staff (had to opt for perceptions).
  • Jury still out on effectiveness.

Policy Implications on the Diffusion of Alternative Fuel Vehicles: An Agent-Based Modeling Approach (Rosanna Garcia, Northeastern University, and Ting Zhang, Xi’an Jiaotong University)

  • Used agent-based modeling to model the interactions between technology push, regulatory push and market pull on eco-innovation.
  • Used conjoint data (discrete choice) to gauge market pull.
  • Used NetLogo to do agent-based modeling (ABM).
  • Found the use of conjoint data a great way to instantiate and validate ABM.

The Impact of Respondents’ Physical Interaction with the Product on Adaptive Choice Results (Bob Goodwin, Lifetime Products)

  • Wanted to determine the potential impact of respondents’ physical interaction with the product on the precision of adaptive choice-based conjoint (ACBC) results.
  • Split-sample ACBC studies of folding banquet tables and chairs were conducted using online and mall-intercept field methods. Market simulation results were then validated using actual product sales and market share distributions.
  • For tables, “touch and feel” less important than previously thought. In-person concentrated more on leg style, online more on size/shape.
  • For chairs, mall “butt test” didn’t look that important; however, the “mesh” chair was impossible to validate as it had no sales data. Only very minor differences in attribute importances.
  • Concluded that cost-benefit for in-person tests not justified, as online method had less prediction error for both tables and chairs.
  • Note, however, that new, innovative, or unfamiliar products may need to be demonstrated in person.

Using Eye Tracking and Mouselab to Examine How Respondents Process Information in CBC (Martin Meissner, Soren Scholz, & Reinhold Decker, Bielefeld University)

  • Asks whether adding eye-tracking or mouselab data to choice-based conjoint (CBC) could improve modeling.
  • It was somewhat challenging to match up eye tracking data directly to attributes. And mouselab introduces a different cognitive process, as the image of the attribute level only appeared when a respondent moused over it, and disappeared when not moused over.
  • Found that added eye tracking data did improve models, especially when the intensity of the tracking to each attribute was quantified and incorporated into the model.
  • Concluded that eye tracking data can be used to qualitatively analyze how respondents approach purchase decisions, cross-check and validate CBC utilities and importances, check whether the relevance of attributes and way of information processing changes during interviews, and significantly improve choice models.
  • Also learned that only a minority of respondents consistently apply simple decision heuristics, like focusing on price only, repeated measurements do not seriously harm information processing, warm-up tasks facilitate a more holistic evaluation of alternatives, and the design of the task affects respondent information acquisition behavior (more complexity changes behavior).
  • Issues = adding person-specific data can lead to overfitting the model. The cognitive processes are still somewhat unknown – need fMRIs to really tell. Eye tracking is currently expensive to implement and adds a lot of time to the task.

Some Highlights from the Sawtooth Software Conference 2010 – Day 1

[Editor's Note: Research Access contributor Nico Peruzzi is on-site at the Sawtooth Software conference this week, and sends us this dispatch.]

The weather is pretty grey in Newport Beach, CA, but that’s good because we can focus on the talks.  Day 1 is workshop day, and here are some highlights of the two that I attended.

Adventures in Advanced CBC Applications

Presented by Bryan Orme, President of Sawtooth Software and David Lyon, Principal at Aurora Market Modeling, LLC, this workshop covered lots of different advanced applications of Choice-Based Conjoint (Discrete Choice Modeling).

Here are just a few highlights of this four-hour workshop:

  • Alternative-Specific Designs give the flexibility to show unique sets of attributes based on the context (e.g., if asking about choices for short-range travel, when “drive my car” is shown, “parking fee $8.00” is also shown.  However, when “ride the bus” is shown, “picks up every 20 min.” and “75 cents per one-way trip” are shown).  These designs make the exercise more realistic.
  • Prohibitions (restricting attribute levels that can be shown in combination) can be a red flag for the way you have set up your attributes and levels.  General rule: if the respondent will be confused by a combination, use a prohibition; however, if it’s simply “we don’t offer this combination”, then you don’t have to use a prohibition.  Remember that prohibitions reduce the efficiency of your design and you’d need more sample to overcome this issue.
  • Conditional Pricing (varying the price points shown conditioned on the presence of a level of a certain attribute) only makes sense if levels of price are in some sense consistent across conditions (typically the same $ difference or same % difference).
  • Hierarchical Bayes is generally the best analysis method for these scenarios – certainly better than Aggregate Logit – and this is true of even basic CBC analyses.
  • Summed Pricing extends the idea of conditional pricing and attaches incremental price changes to each level of each attribute.  It’s important to add +/- X% to summed prices so that price is not confounded with levels.  Doing so helps partial out the effects of each level; so that the utilities are not “conditional” on the price.  +/- 30% was suggested as the random variation to be applied.
  • Bundling was discussed, and it can get complicated, but the idea is to create a custom design that mimics the purchase process as much as possible.  Modeling and simulation in this case is custom.
  • Volumetric CBC asks respondents to declare a “quantity” of each choice that they would buy, versus making a “discrete” choice as to which one they would buy.  Volumetric CBC appears better at revealing near-term spikes in market share, whereas, discrete choice is better at showing a long-term equilibrium for a product (it is more stable over time).
  • “Evoked Set” CBC takes a reduced set of attribute levels forward into the task (e.g., start with 14 levels, but respondents will only answer about 7).  Takes some fancy data-processing work.
  • Build-Your-Own (BYO) is a task where respondents see all attributes and choose which level of each that they prefer.  BYO can be a good “training task” for respondents, especially if they are going into a complex choice task.  BYO is built into Sawtooth Software’s Adaptive Choice-Based Conjoint.
  • Menu-Based Choice (MBC) appears to be what will become the newest addition to the conjoint family.  Design is very customized and respondents have an opportunity to choose various items from a “menu” to build their preferred product.  Counts analysis can work for now, but Hierarchical Bayes (HB) analyses will likely form the basis of models being developed to manage this type of data.  However, early research suggests that HB doesn’t give as much lift over Aggregate Logit in menu-based choice as it does in regular CBC.
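
The Summed Pricing idea above can be sketched in a few lines. Everything specific here is an assumption for illustration: the base price, the attribute names, the level increments, and the choice of a continuous uniform draw for the random variation (a design could equally use a handful of discrete multipliers). Only the +/- 30% range comes from the workshop.

```python
import random

BASE_PRICE = 200.0
# Hypothetical incremental prices attached to each level of each attribute.
INCREMENTS = {
    "storage": {"64GB": 0.0, "128GB": 50.0, "256GB": 120.0},
    "warranty": {"1 year": 0.0, "3 years": 40.0},
}

def summed_price(profile, rng, variation=0.30):
    """Return (price shown, unperturbed sum) for one product profile."""
    total = BASE_PRICE + sum(INCREMENTS[attr][level]
                             for attr, level in profile.items())
    # Random +/- variation so that price is not fully determined by the levels.
    multiplier = 1.0 + rng.uniform(-variation, variation)
    return round(total * multiplier, 2), total

rng = random.Random(2010)      # fixed seed so the example is repeatable
profile = {"storage": "128GB", "warranty": "3 years"}
shown, unperturbed = summed_price(profile, rng)
print("sum of increments:", unperturbed, "price shown:", shown)
```

Without the random multiplier, the shown price would be an exact function of the attribute levels, and the price effect could not be separated from the level effects.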

Research for Solid Pricing Decisions

Presented by David Lyon, Principal at Aurora Market Modeling, LLC, this workshop presented an overview of commonly used pricing research.  The workshop was organized in terms of direct questioning techniques and trade-off methods.

Direct Questioning Techniques:

  • Willingness to Pay: “How much are you willing to pay for this?”.  At best, for a radically new product, gets you near the ballpark.  Don’t pre-list answer choices – wipes out upside possibility.  Plot % willing to pay at a certain price.  Overall, pretty weak technique.
  • Monadic Designs: split sample into groups and present a different price to each.  Best to add a buy-response question (“Would you buy it?”) or, better yet, a less-variable measure of purchase intent such as an intent scale or allocation/likelihood.  Use large samples and match cells carefully.
  • Sequential Monadic: ask initial, fully-disguised monadic question, then follow up with “What about this price?”, “What about that price?”.  Sometimes done low to high price, or high to low, or “Gabor-Granger” (random order).  Problem = no way to disguise focus on price in follow-ups, and results in consistent over-estimation of price sensitivity = not realistic.  Huge biases here.
  • Van Westendorp Price Sensitivity Meter: four questions: At what price would you find the product… “too cheap”, “cheap”, “expensive”, “too expensive”?  Curve crossing analysis shown to be unrealistic; try plotting % of respondents who fall in the “normal” range (between cheap and expensive) against prices.  Ok for early exploration; view with skepticism.
  • Newton-Miller-Smith variation of Van West: add 2 questions: at [cheap price], likelihood to buy?, at [expensive price], ditto.  Translate likelihood scale into purchase probability, average probability curves over all respondents.  Problem = most of us can’t do translation to probabilities.
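The “normal range” plot suggested for van Westendorp above can be sketched in a few lines.  This is only a minimal Python sketch, assuming each respondent’s “cheap” and “expensive” answers are stored as a pair; the data and the function name `share_in_normal_range` are made up for illustration.

```python
def share_in_normal_range(answers, price):
    """Fraction of respondents for whom `price` falls between their
    'cheap' and 'expensive' answers (inclusive)."""
    in_range = sum(1 for cheap, expensive in answers if cheap <= price <= expensive)
    return in_range / len(answers)


# Hypothetical (cheap, expensive) answers from five respondents.
answers = [(10, 20), (12, 25), (8, 15), (15, 30), (10, 18)]

# One value per candidate price; these are the points you would plot.
candidate_prices = (5, 10, 15, 20, 25, 30)
curve = {price: share_in_normal_range(answers, price) for price in candidate_prices}
```

Plotting `curve` against price shows where the largest share of respondents considers the price neither cheap nor expensive.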

Trade-Off Techniques:

  • Ratings-Based Full Profile Conjoint: present profiles, have respondent rate or rank them.  Problem = systematically underestimates price sensitivity.  More concrete attributes like price are underpredicted, whereas more emotionally laden attributes are overestimated.
  • Price-Only Choice-Models: allows showing different prices for different brands, allows different price utilities for different brands, and avoids systematic underestimation of price effects.  Think of each product’s price as a separate “attribute” in the conjoint sense.  Can fractionalize the design so each respondent does not need to do the whole design.  Usually modeled at aggregate level, but using HB would be better.  Issue – cannot simulate with products deleted from or added to the basic set we designed around.  Pretty face valid that we are testing price, but experience shows that price sensitivity is realistic.
  • Discrete-Choice Modeling: (Choice-Based Conjoint): a whole talk in and of itself, but here are a few highlights.  Add other attributes to price-only to provide more realism to the task and decrease bias toward oversensitivity.  Using Multinomial Logit introduces the “red bus – blue bus” problem (independence of irrelevant alternatives: IIA) – means that if we have 4 products with shares: 40%, 30%, 20%, and 10%, and say we cut the price for the first product so that its share increases to 50%, IIA takes the loss of 10% and distributes it proportionally away from the other 3 products (i.e., we now have 50%, 25%, 16.7%, and 8.3%).  Logit does this without “thinking”, no matter what the data might say.  Use HB instead of Aggregate Logit to solve this problem.
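The IIA arithmetic in the last bullet can be checked with a short Python sketch.  It does not fit a logit model; it only reproduces the proportional redistribution described above, and the function name `iia_redistribute` is my own.

```python
def iia_redistribute(shares, index, new_share):
    """Set shares[index] to new_share and scale every other share by the same
    factor, so the other products keep their ratios to one another (IIA)."""
    total = sum(shares)
    others_before = total - shares[index]
    others_after = total - new_share
    factor = others_after / others_before
    return [new_share if i == index else s * factor for i, s in enumerate(shares)]


# Shares from the example: product 1 moves from 40% to 50%.
new_shares = iia_redistribute([40.0, 30.0, 20.0, 10.0], 0, 50.0)
rounded = [round(s, 1) for s in new_shares]
```

Here `rounded` comes out as 50.0, 25.0, 16.7, and 8.3: the other three products keep their 3:2:1 ratio no matter which of them the first product actually competes with most closely.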

Reverse Segmentation – The Promised Land for Target Marketers?

This information is useful for people who are tired of segmentation projects that give only unique attitudinal/behavioral segment descriptions or unique demographic/targeting profiles, but not both.

In my last blog post, I gave a general overview of customer segmentation, and I left off with a teaser about reverse segmentation.  Well, here goes.

A Common Criticism of Segmentation

To repeat what I said in my last post, “I got these great-sounding segment names, but they don’t have distinct demographic targeting profiles, so I can’t reach them.”   Traditional attitudinal/behavioral segmentations do a good job of identifying meaningful groups, but targeting the market can be difficult.  Alternatively, demographically-defined segments can be targeted but are often indistinct attitudinally or behaviorally, so they are difficult to “message to”.

Reverse Segmentation Solves These Problems

Reverse segmentation helps the market researcher identify market segments with highly differentiated attitudes and behaviors, while at the same time considering the demographics/firmographics, media usage, or channel usage information that is needed to reach people and deliver a targeted message.

Before You Begin

I’m not going to spend much time discussing data collection for your segmentation analysis, but of course, you want to have attitudinal and/or behavioral data (probably from a survey, and if you’re lucky, tied to actual behavioral data) and targeting variables (demographics, media habits, etc.) that you hopefully already have in your CRM database.  Ask yourself what attitudinal/behavioral variables would help you describe your customers or prospects – what messages resonate with them, what do they currently own/buy, what are their interests, etc.?  Then, ask yourself how you target, or could target, possible customers – what “selects” do you use when doing media buys, are there specific websites you have in mind?  How would you define the variables used to “reach” your potential customers?

How it Works

Find Your Targeting Variables

Reverse segmentation starts by taking each of your demographic/targeting variables and finding out whether the categories (levels) of these targeting variables show differences on your attitudinal/behavioral variables.  For example, say that gender is one of your targeting variables.  Do men and women show differences on any/some/all of your attitudinal/behavioral variables?  If differences exist on a variety of attitudinal/behavioral variables, then save that targeting variable.

Once you’ve identified some targeting variables that show differences, you build a multi-way table that creates cells with every possible combination of differentiating targeting characteristics. For example, you might have cells made up of gender (2 levels) x age (5 levels) x having children in the home or not (2 levels) x frequency of visiting a certain website (4 levels), etc.  Each of these 2 x 5 x 2 x 4 = 80 targeting units forms the basis for further analysis.  Now group individual cases into the targeting unit where they fit.   Each unit represents people with the same targeting characteristics.  For example, one group might be: women, aged 25-34, with children in the home, who very frequently visit a certain website.
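To make the multi-way table concrete, here is a minimal Python sketch that builds the 2 x 5 x 2 x 4 targeting units and drops a case into its unit.  The variable names and level labels are placeholders I made up, not values from a real study.

```python
from itertools import product

# Hypothetical targeting variables and their levels.
targeting_levels = {
    "gender": ["female", "male"],
    "age": ["18-24", "25-34", "35-44", "45-54", "55+"],
    "children_in_home": ["yes", "no"],
    "site_visits": ["never", "rarely", "often", "very often"],
}

# Every possible combination of levels is one targeting unit.
targeting_units = list(product(*targeting_levels.values()))


def unit_for(case):
    """Return the targeting unit (a tuple of levels) that a case belongs to."""
    return tuple(case[name] for name in targeting_levels)


case = {"gender": "female", "age": "25-34", "children_in_home": "yes", "site_visits": "very often"}
case_unit = unit_for(case)
```

`len(targeting_units)` is 80, and `case_unit` is the unit for women aged 25-34 with children in the home who visit the site very often.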

Score Targeting Units on Attitudinal/Behavioral Variables

Next, we find the average score for each of these targeting units on the measures of attitudes and behaviors, across all the cases that fall into the unit.  “Average” is a generalized word here, as it could be the mean or a percentage who give a particular response.
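As a rough illustration of the scoring step, the Python sketch below averages each attitudinal/behavioral variable within a targeting unit.  The cases, the two-variable units, and the variable names are invented; note that the mean of a 0/1 variable is simply the share of cases giving that response.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases: (targeting unit, answers on attitudinal/behavioral variables).
cases = [
    (("female", "25-34"), {"price_matters": 4, "owns_product": 1}),
    (("female", "25-34"), {"price_matters": 2, "owns_product": 0}),
    (("male", "25-34"), {"price_matters": 5, "owns_product": 1}),
]


def score_units(cases):
    """Average every variable across the cases that fall into each unit."""
    grouped = defaultdict(lambda: defaultdict(list))
    for unit, answers in cases:
        for variable, value in answers.items():
            grouped[unit][variable].append(value)
    return {
        unit: {variable: mean(values) for variable, values in by_variable.items()}
        for unit, by_variable in grouped.items()
    }


unit_scores = score_units(cases)
```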

Cluster

Finally, we take these average scores and use clustering techniques to combine targeting units based on similarity in attitudes and behaviors.  The resulting segments have distinctive attitudinal/behavioral profiles, which is necessary for constructing a targeted message, while at the same time having clear-cut demographic attributes, which is necessary for reaching the people with the message.  Remember, the things we just clustered were not individuals but were targeting units made up of groups of people with distinct combinations of your targeting variables.
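To show that the things being clustered are targeting units rather than people, here is a bare-bones Python sketch.  I use a tiny hand-written k-means purely as a stand-in; the post does not prescribe a particular clustering method, and in practice you would use a stats package, standardize the variables, and compare several solutions.  The unit scores are made up.

```python
def kmeans(points, k, iterations=20):
    """Tiny k-means. `points` maps a unit label to a tuple of average scores.
    Returns a dict mapping each unit label to a cluster index."""
    labels = list(points)
    centers = [points[name] for name in labels[:k]]  # naive start: the first k units
    assignment = {}
    for _ in range(iterations):
        # Assign every unit to its nearest center (squared Euclidean distance).
        for name in labels:
            distances = [
                sum((a - b) ** 2 for a, b in zip(points[name], center))
                for center in centers
            ]
            assignment[name] = distances.index(min(distances))
        # Move each center to the mean of the units assigned to it.
        for j in range(k):
            members = [points[name] for name in labels if assignment[name] == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


# Hypothetical average scores per targeting unit: (attitude score, share who own).
unit_scores = {
    ("female", "25-34"): (4.5, 0.8),
    ("male", "25-34"): (1.8, 0.2),
    ("female", "35-44"): (4.2, 0.7),
    ("male", "35-44"): (2.0, 0.1),
}

segments = kmeans(unit_scores, k=2)
```

Each key in `segments` is a whole targeting unit, so every person who matches that unit inherits the unit’s segment.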

Iterate and Describe Your Segments

Look at a variety of clustering solutions, and consider using ensembling methods to come to a best solution.  Look for reasonable cluster sizes and differentiation.  Consider using Discriminant Function Analysis (DISCRIM) to help pull together themes of attitudinal/behavioral variables that differentiate clusters.

Classifying Future or Other Cases – with NO Misclassification

One of the common goals of a segmentation project is to come up with an algorithm that can be used to classify cases into segments based on a limited amount of information (so you don’t have to give a whole survey and do the whole segmentation analysis every time you want to classify people).

Recall that the segments in reverse segmentation are built on the “targeting units”, which are simply multi-way combinations of targeting variables.  Therefore, our classification “algorithm” simply finds the targeting unit that a case matches, and we then know which segment the case is in.  Using this method, there is NO misclassification into segments.  Let me repeat that point – there is NO misclassification.  If you have a respondent’s data on the variables used to create the targeting units, you can put that respondent into its correct segment.
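In code, this classification step is nothing more than a table lookup.  Here is a minimal Python sketch, assuming the unit-to-segment table has already been saved from the clustering step; the segment names, variables, and records are invented.

```python
# Hypothetical lookup table: targeting unit -> segment.
unit_to_segment = {
    ("female", "25-34", "yes"): "Segment A",
    ("female", "25-34", "no"): "Segment B",
    ("male", "25-34", "yes"): "Segment A",
    ("male", "25-34", "no"): "Segment C",
}


def classify(record):
    """Look up the segment for a database record using only targeting variables.
    Returns None when the combination was not part of the original analysis."""
    unit = (record["gender"], record["age"], record["children_in_home"])
    return unit_to_segment.get(unit)


known = classify({"gender": "male", "age": "25-34", "children_in_home": "no"})
unseen = classify({"gender": "male", "age": "65+", "children_in_home": "no"})
```

The only case the lookup cannot handle is a combination of levels that never appeared in the analysis, which is why `classify` returns `None` rather than guessing.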

Maximizing the Utility of Reverse Segmentation

To maximize action that can be taken from the results of reverse segmentation, it can be helpful to focus on targeting variables that are currently in your database.  This approach allows you to flag people in your database as matching a particular behavioral or attitudinal type (segment).  Just because you can’t get survey data from everyone in your database, doesn’t mean you can’t classify them into a segment.

Don’t Forget the Basics

As with any segmentation output, the results from reverse segmentation should be evaluated as follows:

  • Are the segments meaningful?  Do they make sense?
  • Are the segments large enough to justify targeting them?
  • Are the segments reachable (e.g., via ads or direct sales)?
  • Are the segments uniquely responsive to marketing efforts? (This characteristic is evaluated over time).
  • Is the overall segmentation plan actionable?
  • Note that another quality sought in segments is that they are identifiable – reverse segmentation segments are always identifiable, given the way they are formed

Conclusion

The reality with any newer technique that includes a number of steps and is stats-heavy is that it can be difficult to understand your first (or first few) times.  However, if you can get over the hump with this technique, I think you’ll find a very powerful and useful tool that can greatly improve your ability to reach the right people with the right messages.

Customer Segmentation – An Overview

This information is useful for people who are interested in matching the best messages or products possible to those customers or prospects most likely to buy them.

Why Segmentation?

If you make a product, have a service, market or sell, customer segmentation is important to you. The days of one-size-fits-all approaches are long over, and especially in tough economic times, it is essential to be as efficient and effective as possible in getting the best version of your product or service in front of the people most likely to need it and buy it.

How?

There are many ways to break your customers into unique groups. Methods range from simple to complex to implement, and result in a variety of valuable information. All these methods should have the same goal – how can your company better create need for your products or services by targeting the people most likely to buy them with appropriate messaging or offerings.

Basic Demographic Splits

As simple as it gets, cutting your customers by a single demographic variable is done all the time. Think, “the pink one for the baby girls and the blue one for the baby boys”. This method is very common in B2B situations where the cut is company size. How many companies do you know that base sales and marketing programs around “small”, “medium” and “large” business targets? This method is primarily useful because of its intuitive nature, and is so commonly used because it takes nothing more than a customer database or access to Dun & Bradstreet data. Of course, this method has its limitations. Are all your “small” customers really the same? Do they all behave the same way and want the same things? Probably not.

One common way to present these basic splits is in banner tables. A banner table highlights differences between segments on a number of other variables – be they other fields from a database or variables from a survey.

Let’s Take it to Two Levels

If a single split doesn’t cut it (forgive the pun :) , how about first cutting by something like company size and then by, say, industry? Fine enough. Here, you start to isolate more and more unique groups, and by better understanding their uniqueness, you can better reach them with appropriate and effective marketing messages and provide them the product types and features they most need.

And So On…

One could, in theory, continue to cut by more and more variables – company size, then industry, then job function, etc. (or for B2C: gender, then age, then income, etc.). However, the human brain can only handle looking at so many segments at once, let alone find differences, so the output becomes unwieldy. To get this greater level of detail, we need to take the stats up a notch.

Multivariate

Let’s start by making sure we all understand the meaning of this word: multivariate. Multi = many. Now, how about variate? Think “variable”. A variable is something that varies – the opposite of a constant. So, multivariate = many variables. That’s it – we’re looking at a number of variables, all at the same time.

Cluster/Latent Class Analysis

These analyses identify relatively homogeneous groups of cases (let’s say customers) based on selected characteristics (these are the variables). If you can imagine multidimensional space, you can wrap your head around these analyses. Imagine a star in space. Now imagine a cluster of stars – I’m not an astronomer, so work with me here ;-) You can look in the sky and see one group of stars that clump together over in one part of the sky, and another group that clump together in a different part of the sky. These analyses basically put your customers into homogeneous groups based on how “close” the customers are to each other. “Close”, in this case, takes into account how a customer looks on a number of variables (remember, multivariate). You end up with a group of customers that are like each other and are different than another group (cluster). Now you can see why we can only go so far with crosstabs – we just can’t capture the multivariate space without some better analytics.

Note that there are other analytic methods that can be used to segment customers (e.g., CHAID); however, for the purposes of this post, let’s focus on clustering methods.

Exploring Segmentation Solutions

These analyses are what we call “exploratory” techniques – in other words, to paraphrase Forrest Gump, “it’s like a box of chocolates – you never know what you’re going to get”. Here’s where the science and art of statistics merge (yes, don’t laugh, statistics can be artful). Between “many” and “one” cluster lies a potentially useful number of clusters where similarities exist within the clusters and differences exist between clusters. A quick note, beyond the scope of this post, that ensembling techniques (in essence, looking at multiple cluster solutions together to find one that is more stable) can help you come to a better solution. Also, here’s where you put on your business hat (or call in the business people if you don’t own that hat) to figure out how actionable the segmentation solution is.

What Makes a Good Segmentation?

I once had a client who felt that his market should have 10 or more segments – without looking at the data. I brought him 4 and 5 segment solutions – based on the data. Who was correct? Here are a few things to keep in mind when choosing a final segmentation solution:

  1. Distinct and Identifiable: Groups have to be different than each other on variables that you can measure now and in the future.
  2. Sizable: Groups have to be large enough that they are worth marketing/selling to
  3. Reachable: Groups can be identified in the market and targeted (note: this is a big issue that is often not achieved – see below)
  4. Stable: Groups need to look tomorrow like they look today (note: customers change over time, and segmentation solutions do get “stale” – when that happens, it’s time to run a new study)
  5. Profitable/Valuable: Groups that are reached act on the messages/products that they receive by purchasing (note: not all groups will fit into this category – you will identify some groups that will likely not buy – that’s good to know, so you don’t reach out to them.)
  6. Relevant: Groups are integrated with your larger marketing plan and make sense in the context of your strategic direction.

Criticisms of Segmentation

“I got these great-sounding segment names, but they don’t have distinct demographic targeting profiles, so I can’t reach them.” Making up cool names based on attitudinal and/or behavioral clustering can be a lot of fun, but if your segments aren’t unique on the variables you use to target them, then that doesn’t help. Note that this is more of a methodological issue than anything else (see below).

“My segmentation report just sits on the shelf”. Sad story – all too common. Sometimes, this outcome can’t be helped. I’ve seen VPs torpedo a good-looking segmentation solution because it didn’t match their preconceived notions. To give yourself the best chance of a solution being used in your organization, make sure you have clear objectives and buy-in from key stakeholders, and involve key people over the life of the project.

But Wait, There’s More…

One of the most exciting developments I’ve seen in segmentation is a technique called Reverse Segmentation. It’s important enough that it deserves its own post, so stay tuned. For the moment, I’ll say that it provides a solution for criticism #1 above.

Two Common Times NOT to Use Conjoint

This information is useful for people who want to understand a couple times when it is not appropriate to use conjoint analysis.

Conjoint analysis is a gold standard technique for measuring feature preference, particularly in relationship to price.  I’m particularly fond of Adaptive Choice-Based Conjoint (but that’s a topic for another post).  There’s a lot of buzz around conjoint as a tool to help product managers choose features that will help their products better compete in the marketplace, so I often get calls from companies thinking it would be a good idea to do a conjoint project.

In this post, I’ll show two common times when it is NOT appropriate to use conjoint analysis.

Definitions:

An “attribute” is something like brand, number of licenses, amount of storage, color, package size, etc.  A “level” is the degree of an attribute.  For example, brand A, B or C; 5, 10, or 20 licenses; 1, 2 or 3 TB of storage; blue, red, or black color; 12 ounce, 18 ounce, or 24 ounce package.

When NOT to Use Conjoint:

1. Your product features are already locked in – you just want to test prices. If your product is fully baked, you don’t want to use conjoint.  Conjoint is all about looking at the inter-relationship between various levels of product attributes and price.

If your product is locked in as a 10 license product with 1 TB of storage and other features set, conjoint is not for you.

So, how can you test price on your fully baked product?

Note: each of these methods deserves its own post, but here’s a taste.

Monadic designs:

  • Break your respondent sample into groups that each see a single price associated with the product and ask their likelihood to purchase.  Plot the probabilities against the prices.

The  van Westendorp Price Sensitivity Meter:

  • Ask “too inexpensive”, “inexpensive”, “expensive”, and “too expensive” questions.  Plot the data to obtain lower and upper bands and optimal price point.

The Newton-Miller-Smith variant of van Westendorp:

  • Add purchase probability follow-up questions based on the inexpensive and expensive answers.  Build consideration curves.

2. Your attributes don’t vary (don’t have levels) – you’re just testing preference/importance of a number of items. You are not looking at the inter-relationship of various levels of brand, size, quality, durability, package, price, etc.  Instead, you want to understand the importance of, or preference for, a number of features/attributes that each have a single (constant, not varying) level.  Perhaps you want to test the general importance of brand vs size vs quality, etc.  Or, you may want to understand the importance of the specific, fixed features that make up your product (e.g., is having 10 licenses more important than having 1 TB of storage or the other features that make up your product?).

So, how can you test preference/importance of these features?

Note: A full description of MaxDiff can be found on the Outsource Research website.

Maximum Difference Scaling (MaxDiff)

  • Force respondents to make trade-offs between (usually) 4 of your items at a time.  They indicate which item is most and least preferred (important, etc.).  The output yields all the items on a 100-point scale, where you can truly say that a given item is “twice” as preferred as another item with half its value.

Note: MaxDiff can be used to help reduce the number of attributes that you carry forward into a conjoint.  For example, if your product has a lot of potential features to test, it would be wise to reduce the number that you bring into conjoint, so that the respondent is not overwhelmed.  MaxDiff can show you the most important attributes, which can then be further explored in the conjoint.

Conclusion:

Conjoint analysis is a powerful technique that can help you configure your feature-price mix to create a product that will be most preferred by your market.  However, if your feature set is already locked and you just want to test prices, or if your attributes don’t have any variation (levels) to them, then conjoint is not for you and you’ll need other techniques to solve your research problem.

Online Survey Sample Is Not Clean Enough – Clean it Yourself

This information is useful for people who use panel sample for online surveys, and who want to make sure their survey data is truly clean.

Online Survey Panels Tell Us Their Panelists Are Clean

It’s hard to open a marketing magazine without seeing an ad from an online survey panel company proclaiming how clean and high quality their panel is.  A few years ago, this claim was a big deal – it was the Wild West of online survey panels, and buyers of sample had to be very careful as to who they worked with.  Today, however, most major online survey sample companies have adopted measures to get rid of professional respondents, prevent over-surveying, and make sure that respondents are who they say they are.  So, whether the sample is “true” or “pure”, or there’s “attention to detail”, most reputable panel companies are doing a decent job of giving those of us who field surveys a good product.

But Survey Data is Still Dirty

However, and here’s a big however, the data from most online surveys using panel sample still comes in with some dirty responses.  My research shows that between 1 and 5% of survey data from panel sample is garbage.  Garbage – throw it out; don’t bring it into your final dataset to analyze.  Sure, one can blame some of these dirty responses on frustrated respondents dealing with poor survey writing (bad questions, too long, etc.), but the fact remains that you had better clean that survey data before it goes in for analysis.

So, How Do I Clean the Data?

Here’s a plan you can use to clean your data.

When I say “flag” below, I mean that you create a new variable in your dataset next to the variable you are examining, and you place a “1” in a cell if the respondent’s case is flagged.

  1. Flag speeders. Look at time to completion and flag those respondents who took the survey in an unrealistically short time.  Check the median time to completion and establish rules that you feel comfortable with – I often flag those taking < 1/3 of the median time with a “1” (“speeder”), and those taking < 1/4 of the median time with a “2” (“super speeder”).  You might consider removing outliers (at the slow end) before calculating your median.
  2. Flag straightliners. If you have any grid/matrix questions, flag those respondents who gave the same response to every item (unless it makes sense that they could do so).
  3. Flag gibberish or garbage responses. If you have any open-ended responses, look for text such as “asdf” or “…..”; flag these responses, and any other “colorful, yet meaningless” responses you find.
  4. Flag incongruent combinations. If a respondent says their company size is 1000 and the number of PCs in the company is 5, something’s wrong here.  Flag it.
  5. Trap questions. Did you include any questions such as “Please choose the third response below”, or “Please type the word “attention” below”?  If you did, check them, and flag those respondents who didn’t follow the directions.
  6. Sum up your flags. Compute a new variable that sums all the flags.
  7. Sort your dataset by summed variable. Bring cases to the top that have suspicious answers on a number of your checks.
  8. Inspect and delete cases with flags. Delete those cases that are too “dirty” to be included.  Review with key stakeholders to agree on deletions.
  9. Notify your vendor of any bogus respondents. All the vendors I work with do not charge for any respondents I have flagged for deletion.  Show them the IDs of the respondents you threw out, and they’ll take action on their side to warn and/or remove these panelists from their database.
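Steps 1, 2, 5, 6, and 7 can be sketched in Python as follows.  This is a minimal sketch rather than a full cleaning routine: the respondent records and field names (`seconds`, `grid`, `trap_ok`) are hypothetical, and I let a “super speeder” count as 2 in the sum, which is a choice you may or may not want to make.

```python
from statistics import median

# Hypothetical respondent records.
respondents = [
    {"id": "r1", "seconds": 600, "grid": [1, 3, 2, 4], "trap_ok": True},
    {"id": "r2", "seconds": 120, "grid": [3, 3, 3, 3], "trap_ok": False},
    {"id": "r3", "seconds": 540, "grid": [2, 2, 2, 2], "trap_ok": True},
    {"id": "r4", "seconds": 170, "grid": [4, 1, 2, 5], "trap_ok": True},
    {"id": "r5", "seconds": 660, "grid": [5, 4, 4, 3], "trap_ok": True},
]

median_seconds = median(r["seconds"] for r in respondents)

for r in respondents:
    # Step 1: 2 = super speeder (< 1/4 of median), 1 = speeder (< 1/3 of median).
    if r["seconds"] < median_seconds / 4:
        r["speeder"] = 2
    elif r["seconds"] < median_seconds / 3:
        r["speeder"] = 1
    else:
        r["speeder"] = 0
    # Step 2: same answer to every grid item.
    r["straightliner"] = 1 if len(set(r["grid"])) == 1 else 0
    # Step 5: failed the trap question.
    r["trap_failed"] = 0 if r["trap_ok"] else 1
    # Step 6: sum up the flags.
    r["flag_total"] = r["speeder"] + r["straightliner"] + r["trap_failed"]

# Step 7: most suspicious cases first.
by_suspicion = sorted(respondents, key=lambda r: r["flag_total"], reverse=True)
flag_totals = {r["id"]: r["flag_total"] for r in respondents}
```

With this made-up data, respondent `r2` lands at the top of `by_suspicion` with a total of 4, and is the obvious candidate for a closer look in step 8.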

Following the steps above will ensure that the data you analyze is as clean as possible.  Yes, it takes a bit of time, but the effort is clearly worth it when compared to making decisions based on the analysis of data that includes bogus responses.

One last note: if you really need your final sample size to hit a specific number, and you can’t go below that number, you can over-sample, in anticipation of throwing out some respondents.

Feel free to contact me for more details about some of the specific techniques I have found useful to clean data, or follow me on Twitter @NicoPeruzziPhD to hear about other marketing research topics.