Two Common Times NOT to Use Conjoint

This information is useful for people who want to understand a couple of times when it is not appropriate to use conjoint analysis.

Conjoint analysis is a gold standard technique for measuring feature preference, particularly in relation to price.  I’m especially fond of Adaptive Choice-Based Conjoint (but that’s a topic for another post).  There’s a lot of buzz around conjoint as a tool to help product managers choose features that will help their products better compete in the marketplace, so I often get calls from companies thinking it would be a good idea to do a conjoint project.

In this post, I’ll show two common times when it is NOT appropriate to use conjoint analysis.

Definitions:

An “attribute” is something like brand, number of licenses, amount of storage, color, package size, etc.  A “level” is the degree of an attribute.  For example, brand A, B or C; 5, 10, or 20 licenses; 1, 2 or 3 TB of storage; blue, red, or black color; 12 ounce, 18 ounce, or 24 ounce package.

When NOT to Use Conjoint:

1. Your product features are already locked in – you just want to test prices. If your product is fully baked, you don’t want to use conjoint.  Conjoint is all about looking at the inter-relationship between various levels of product attributes and price.

If your product is locked in as a 10 license product with 1 TB of storage and other features set, conjoint is not for you.

So, how can you test price on your fully baked product?

Note: each of these methods deserves its own post, but here’s a taste.

Monadic designs:

  • Break your respondent sample into groups that each see a single price associated with the product and ask their likelihood to purchase.  Plot the probabilities against the prices.
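As a rough illustration of the monadic idea, here is a minimal Python sketch. It assumes each respondent saw exactly one price and gave a purchase likelihood from 0 to 100; the function name and the data are hypothetical, invented purely for illustration.

```python
from collections import defaultdict


def purchase_curve(responses):
    """Average stated purchase likelihood (0-100) for each price cell.

    responses: iterable of (price_shown, likelihood_pct) pairs, one pair
    per respondent. Each respondent sees exactly one price.
    """
    cells = defaultdict(list)
    for price, likelihood_pct in responses:
        cells[price].append(likelihood_pct)
    # Sort by price so the result can be plotted directly as a curve.
    return {price: sum(v) / len(v) for price, v in sorted(cells.items())}


# Hypothetical data: three price cells, two respondents in each.
responses = [(10, 80), (10, 60), (20, 50), (20, 30), (30, 20), (30, 0)]
curve = purchase_curve(responses)
for price, avg in curve.items():
    print(f"${price}: {avg:.0f}% average likelihood to purchase")
```

In a real study each cell would hold a few hundred respondents, and you would plot the averages against price rather than print them.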

The van Westendorp Price Sensitivity Meter:

  • Ask “too inexpensive”, “inexpensive”, “expensive”, and “too expensive” questions.  Plot the data to obtain lower and upper bands and optimal price point.
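To give a feel for the plotting step, here is a small Python sketch that finds one of the points, the optimal price point. It assumes a common convention (not spelled out in this post) that the optimal price point sits where the cumulative “too inexpensive” and “too expensive” curves cross, and it searches a grid of candidate prices for that crossing. All names and data are hypothetical.

```python
def share_too_cheap(thresholds, price):
    """Share of respondents who would call `price` too inexpensive.

    thresholds: each respondent's answer to "at what price is it too
    inexpensive?" A price at or below that answer counts as too cheap.
    """
    return sum(t >= price for t in thresholds) / len(thresholds)


def share_too_expensive(thresholds, price):
    """Share of respondents who would call `price` too expensive."""
    return sum(t <= price for t in thresholds) / len(thresholds)


def optimal_price_point(too_cheap, too_expensive, grid):
    """First grid price where the two curves are closest together."""
    return min(
        grid,
        key=lambda p: abs(
            share_too_cheap(too_cheap, p) - share_too_expensive(too_expensive, p)
        ),
    )


# Hypothetical answers from four respondents, in dollars.
too_cheap = [10, 12, 15, 18]
too_expensive = [14, 16, 20, 25]
opp = optimal_price_point(too_cheap, too_expensive, range(10, 26))
print(f"Approximate optimal price point: ${opp}")
```

The lower and upper bands come from the same kind of crossing, using the “inexpensive” and “expensive” answers as well; with real sample sizes the curves are smoother and the crossing is much less ambiguous than in this toy data.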

The Newton-Miller-Smith variant of van Westendorp:

  • Add purchase probability follow-up questions based on the inexpensive and expensive answers.  Build consideration curves.

2. Your attributes don’t vary (don’t have levels) – you’re just testing preference/importance of a number of items. You are not looking at the inter-relationship of various levels of brand, size, quality, durability, package, price, etc.  Instead, you want to understand the importance of, or preference for, a number of features/attributes that each have a single (constant, not varying) level.  Perhaps you want to test the general importance of brand vs size vs quality, etc.  Or, you may want to understand the importance of the specific, fixed features that make up your product (e.g., is having 10 licenses more important than having 1 TB of storage or the other features that make up your product?).

So, how can you test preference/importance of these features?

Note: A full description of MaxDiff can be found on the Outsource Research website.

Maximum Difference Scaling (MaxDiff)

  • Force respondents to make trade-offs between (usually) 4 of your items at a time.  They indicate which item is most and least preferred (important, etc.).  The output yields all the items on a 100-point scale, where you can truly say that a given item is “twice” as preferred as another item with half its value.
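For a quick first look at MaxDiff data, a simple counting approach can be sketched in Python. Note the assumption: this is a best-minus-worst count, not the model-based scoring that produces the ratio-scaled 100-point output described above, so it only gives a rank order. The function name and data are hypothetical.

```python
from collections import Counter


def maxdiff_count_scores(tasks):
    """Best-minus-worst count score per item, between -1 and 1.

    tasks: list of (items_shown, best_item, worst_item) tuples, one per
    MaxDiff screen answered by a respondent.
    """
    shown, best, worst = Counter(), Counter(), Counter()
    for items, best_item, worst_item in tasks:
        shown.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    # Divide by exposures so items shown more often are not favored.
    return {item: (best[item] - worst[item]) / n for item, n in shown.items()}


# Hypothetical data: three screens, each showing the same four items.
items = ("brand", "size", "quality", "price")
tasks = [
    (items, "quality", "size"),
    (items, "price", "size"),
    (items, "quality", "brand"),
]
scores = maxdiff_count_scores(tasks)
for item, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item}: {score:+.2f}")
```

Counts like these are handy for sanity-checking a fielded study before the full analysis is run.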

Note: MaxDiff can be used to help reduce the number of attributes that you carry forward into a conjoint.  For example, if your product has a lot of potential features to test, it would be wise to reduce the number that you bring into conjoint, so that the respondent is not overwhelmed.  MaxDiff can show you the most important attributes, which can then be further explored in the conjoint.

Conclusion:

Conjoint analysis is a powerful technique that can help you configure your feature-price mix to create a product that will be most preferred by your market.  However, if your feature set is already locked and you just want to test prices, or if your attributes don’t have any variation (levels) to them, then conjoint is not for you and you’ll need other techniques to solve your research problem.

Imminent Predictions and Eminent Results

When will Market Research yield quickly actionable insights? In other words, how can MR transform itself into an “internet time” discipline?

I am asking Research Access readers to offer opinions on how we need to change the structures of the discipline.

Questions to ponder:

1. How do we take a field defined by “discrete bursts of production” to one in which we are offering continuous insights?

2. When will MR be “forward looking” instead of “historical?”

3. How do we put MR in the center of all of Marketing?

If we can produce imminent predictions (based on data but also on “gut”) then we’ll produce eminent results. Then we can be feted and not thought of as “another cost.”

TryMyUI – The next generation of mass usability testing?

TryMyUI, a new service offering remote usability testing, just hit the scene this spring. TryMyUI is backed by Sani El-Fishawy, who started Classifieds2000.com back in 1997.  Classifieds2000 was one of the internet startups of the late ’90s – they powered the classifieds section of many major portals of the day – Lycos, Excite, Infoseek, GeoCities and even Hotmail. In 1997 they raised their first round of $2.8M led by Polaris Venture Partners and then sold the company to Excite for $48M in 1998 – ah, the good old days of the .COM boom!

What is TryMyUI?

TryMyUI is a service that allows companies to understand their web users better. Any company that wants to understand not only what a user would do on their website, but also why they would do it, should visit TryMyUI to set up an account and receive a prompt reply. TryMyUI assembles and trains testers who have to go through a rigorous qualification process (less than 10% actually make it). The testers are trained to articulate their thoughts while they browse a company’s website, performing various tasks. TryMyUI records the tester’s screen and voice and, within a matter of hours, delivers a video to the company along with responses to a questionnaire that the company sets up.

Why is it unique?

The TryMyUI website is spiffy, the tone of their content is candid, and their service is incredibly useful. It was easy to find answers to my questions as well as understand those answers once I found them. But what makes me excited about TryMyUI is that it encourages companies to go further with their user testing – any user testing firm that encourages deeper interaction with the user understands that traditional market research is no longer enough.

What is the business model – aka is this real?

TryMyUI is currently in beta, so it’s obviously free. They plan on charging anywhere between $10 and $25 per test. Compared with a traditional in-person usability testing model, this is what we call in the business a “disruptive” model. The reason this can be provided at such a low price is that the testers are at their homes – so there are no capital costs involving setup of computers etc. Anyone with a microphone and a decent ability to verbalize their thoughts as they are clicking through a site can be a tester.

Challenges / Naysayers?

A couple:

  1. The obvious security and confidentiality issue. If you are developing a super-secret application and you don’t want anyone to run through the app – well – we have a little issue here. I don’t see Microsoft using this to do usability testing on a brand new product line. Maybe a private version of TryMyUI with a controlled group would be an add-on in the future?
  2. Scale and pricing. Can you really scale this business to something large at $20/test? Maybe. I am not sure. Of course time will be the judge. This is very applicable to web-based applications – so that’s a fairly large market – but still a finite market size. Furthermore, the mobile application market is growing and in theory taking away from the web-app development market.

YouTube Leads Online Video

Internet users continue to spend more and more time watching online video. comScore released the results of their May 2010 U.S. Online Video Rankings, which showed that 183 million U.S. internet users watched some form of online video over the course of May. It also showed that YouTube (among other Google properties) accounted for 43.5% of the market.

This is good news for YouTube, of course, but it also means that more and more eyeballs are looking to the internet for another video viewing source. In fact, the study also showed that nearly every internet user is now watching some form of online video (85% of the total U.S. internet audience). That’s a good chunk of the digital audience. Hopefully marketers are paying attention to this trend and adding it to their digital strategy.

Thumbspeak – Mobile Panel Platform – Dean Wiltse gets back into the game.

After being in the Market Research business for 10 years, taking Greenfield Online public, and then overseeing the merger/buyout of Perseus and Websurveyor into Vovici, Dean Wiltse is back in the game with ThumbSpeak.

As mentioned in my previous post, we’ll be profiling innovative companies that are attempting to change Market Research and data collection and looking beyond traditional models. This is our first post on this.

What is ThumbSpeak?

A mobile panel. Where users _actually_ take surveys using smartphones (iPhone, Android devices etc.)  Users download the app – and the App then “pushes” surveys to consumers. Consumers get points/rewards (like with most other panels) for completing surveys. Thumbspeak sells access to these users to market researchers who want to tap into them.

Why is it unique?

Traditional panel providers like eRewards, Toluna, SSI have mobile panels – as in they have users who have (or have used) a smartphone – but the surveys are still conducted over the traditional web interface. Thumbspeak is solely focused on the mobile platform. All their data-collection efforts are going to be on the smartphone, at least for now. They have an iPhone app out – the Android app will come out in the next few months.

Next – they are recruiting via developers. There are over 200,000 apps in the App Store – and most of them are free. Which means there are approximately 100,000 – 200,000 developers developing for the Apple eco-system alone. If you add in the Android platform and the Symbian/Palm OS platform – there are a LOT of developers trying to make money off smartphones. Not all can make money. They need a better monetization model than simply ads. Ads in mobile work to some extent, but because of real-estate restrictions – you cannot make a ton of money just on ads. Thumbspeak has the ability to make these developers some cash by being the conduit between the development community and market researchers who need access to mobile users and are willing to pay for it.

What’s the business model – aka – is this real?

So far it seems so. The typical going rate for an iPhone panel member is ~$6 per completed survey. This revenue can be distributed between the developer and Thumbspeak – and given that Wiltse has a deep background in the panel business from Greenfield – I don’t doubt that he can sell panels! Wiltse was the CEO of Greenfield, which had a $400M market cap. He also has a good background in software development from leading Vovici and overseeing the Perseus/Websurveyor merger.

Challenges/NaySayers?

A couple:

  1. Critical Mass – Any panel company requires critical mass – it’s the chicken and the egg problem. Wiltse will have to overcome that both from a sales and recruitment standpoint.
  2. Competition – Toluna, SSI, GMI and eRewards have a combined market cap of about $500-800M, so you are talking about the big boys here. If they smell blood, they’ll come after this. I would not say this is that bad – very likely one of them just buys Thumbspeak?

Yeah well WE told you so

If in Vienna Talleyrand said “Europe, unhappy Europe” then today we should say “Ad-based digital publisher, unhappy ad-based digital publisher.”

I mean come on. We predicted this on Research Access months ago.

Read this from adage today:

http://adage.com/digital/article?article_id=144884

And read this from Research Access from APRIL!

http://researchaccess.com/2010/04/the-enduring-cpm-and-its-discontents/

Pay special attention to this paragraph:

So while the CPM is not dead, most publishers are slowly killing the C. In attempting to relentlessly expand their audiences, they hurt their own businesses and simultaneously provide watered-down coverage for their advertisers.

Hey industry, tsk tsk!

CrowdSolving – Beyond CrowdSourcing?

I’m not very convinced of the “wisdom of crowds.” There are numerous examples of how “the wisdom of crowds” is in fact the “idiocy of the mob.” Look at some political movements or some of the more extreme religions, for instance: a good few of these make no sense, but they have a lot of people who believe them. In Vanuatu, an island nation in the Pacific, there is a cargo cult called the John Frum Cult that thinks building replicas of USA air force bases from World War II will bring the USA and all their goods back to the island. A lot of people believe this.

There is a lot of research from social psychology showing that groups polarize decisions in contrast to individuals. A group will make a more extreme decision (cautious or risky) than an individual. There is also the fact that estimations of physical sizes and weights will tend to show a normal distribution, with the most common estimate, the mode, being the correct one. Here there is wisdom in crowds, or more likely the wisdom of the normal distribution, the central limit theorem and statistics in general. Distributions are wonderful things.

One of the advantages of a large scale survey is that you are able to leverage a lot of people’s experience and knowledge. Recently, a company called “Netflix” in the USA utilized the web and their subscriber base to solve an interesting problem. While it is not the usual meaning of the term the “wisdom of crowds,” it is an example of how a crowd can solve a problem. Netflix (www.netflix.com) rents DVDs to their subscribers. They send the rentals via mail and their users maintain a list of which DVDs they want. Netflix also tries to predict which DVDs people might like to watch based on the DVDs they have already rented. Amazon does a similar thing in making product recommendations to purchasers. Netflix wanted to improve their predictive algorithm by 10%, which is quite a large improvement. They could have tried to hire all sorts of geniuses, but they instead chose a unique way to solve the problem. They set up a web site (www.netflixprize.com) and posted a huge data set of movie DVDs, data about those movies, and subscriber choices. They then offered $1,000,000 to anyone who could improve their algorithm by 10%. There were two conditions: a deadline (September of 2009) and an agreement that anyone who submitted a solution had to document that solution publicly. Many companies allowed their employees to set up teams and compete, some individuals competed, and teams merged and re-formed over time. In the end there was a winning team: BellKor’s Pragmatic Chaos.

In this case the wisdom was not “crowd think,” whatever that is. Instead, Netflix leveraged the web and all the people surfing it to source people who wanted to solve this problem. For Netflix, the $1,000,000 was cheap. They could never have afforded to hire all the people who took part in the contest. They got access to world-class computing facilities, superior minds, and  they received some great publicity as well.

The winning solution blended many models together, among them a technique called a “Restricted Boltzmann Machine.” It proved that numbers and math matter. It wasn’t the crowd that solved the problem, but the crowd was the mechanism that made the solution possible. I’m inclined to think that this is the real wisdom of the crowd. People can come up with all sorts of strange beliefs; the ability to get people to address your problem is the wisdom of the crowd. It’s another example of how the web has changed the world in a radical way. Twenty years ago, it simply would not have been possible for Netflix to find a solution to their problem so gracefully. I hear there is going to be another Netflix contest. It’s nice that it was the math that was wise in the end….

Don’t say “Sorry, we didn’t ask that,” instead let the people shine through

It’s among the most uncomfortable moments a market researcher can have.  You’re standing in front of clients presenting the results of a study.  All eyes are fixed on you.  They’re listening to your every word.  And an unexpected question from someone in the audience gets put to you on what the study has learned about a particular topic.   But the study didn’t ask about that topic.  It wasn’t included in the questionnaire.  You may be tempted to take the easy way out and put the responsibility back on your client and remind them they had a chance to raise that issue back when the study was being designed.  That would be a way to get out from underneath the unexpected question that many researchers would take.   But there’s another answer that will make you look like a star, and not somebody who squirms away from unexpected questions.

And that answer goes something like this: “We listened to the comments of 400 people when they were asked their thoughts on three different areas very much related to your question.  They had plenty of opportunities to raise that same issue (or concern) a number of times and it was mentioned by just two people.  And here’s exactly what those two people had to say…”    This answer gets the client some good feedback on their issue and it prevents you from looking like a researcher who missed the mark on something.

The way to be prepared to give this answer does not require methodological genius.  It requires writing some open-ended questions that are just specific enough to stay on target, but not so specific that a respondent doesn’t get to say what’s on their mind.  It also requires spending a few hours going through all the open-ended responses.  Yes, it requires getting into the nuts and bolts, into the weeds, into the details.  But before you claim your time is too valuable for such a chore, consider that spending time with verbatim comments will prepare you for a presentation in ways beyond how the numbers can prepare you.

Verbatims do even more

You’ll find that once you are armed with many particularly insightful verbatim comments, your presentation of the data will take on far greater impact as well.  It is very good advice to say that the way to present data is to tell a story about what it all means.  Nobody wants to be subjected to page after page of just numbers.  And your story about what it all means will be a much better story if it is peppered with timely and relevant real comments from real people.  You’ll have much more confidence that you really know what respondents have to say about the study’s objectives and what they have to say about plenty of other things too.  In short, you’ll be much better prepared to deliver an outstanding presentation.

And as far as a deliverable to a client, I’m not talking about a collection of just a couple dozen comments.    It should be a thick stack of many hundreds (perhaps thousands) of comments from a quantitative study.  But you can’t just present a big collection of random comments.   They need to be organized by category and quantified.   That’s right – quantified.  How many people said this type of comment, and how many people said that.  About 50 different categories of comments is typical.   Focus groups can’t do that reliably enough to make statistically sound projections to a larger population.   And if you do the coding and classifying yourself you’ll be left with a supplemental file to your presentation deck that is much more valuable than a few pages in your stack of data tabulations that say “open ends.”  After all, those tables reduce the rich and often colorful verbatim comments to a listing of categories just a few words long.  Those tables most definitely do not provide the same perspective as getting into actual respondent words to hear for yourself.

Watch how they are drawn to it

When preparing for a client presentation you probably have painstakingly built slides that expertly display your statistical prowess.  Many researchers like to show off conjoint analysis and regression models.  It makes all that time studying statistics pay off.   (I’m no different.  I like to take a client through cluster analysis.  And I love to hear questions like “why did you decide to do discrete choice analysis?”)  But if you look around the room after a presentation is over, you may notice something interesting.  Among the back-up materials that were created to support the main presentation deck, it is the collection of actual respondent comments that everybody wants to pore through.   After all, this is what real people have to say in real words.  It is a level of communication that many people need to experience before they can be truly persuaded.   You’ll see how one corporate VP will peruse the verbatims for just a minute or two and then nudge another corporate VP and say ‘listen to this one.’   I’ve even known of a client who didn’t bother to keep and file his copy of the summary report, but he was careful to keep his copy of the verbatim comments from customers.   He wanted to feel a closer connection to his company’s customers.

So the client who likes focus groups because he/she needs to hear it from the horse’s mouth will be satisfied and the client who is only convinced by quantitative proof will also be satisfied.   So a quant study has an element of qualitative findings, and those qualitative findings have an element of quantitative analysis.  The goal is to make the presentation and analysis come alive and deliver the best of what both types of research can deliver.

Poor question design means questionable results: A tale of a confusing scale

I saw the oddest question in a survey the other day. The question itself wasn’t that odd, but the options for responses were very strange to me.

* 1 – Not at all Satisfied
* 2 – Not at all Satisfied
* 3 – Not at all Satisfied
* 4 – Not at all Satisfied
* 5 – Not at all Satisfied
* 6 – Not at all Satisfied
* 7 – Somewhat Satisfied
* 8 – Somewhat Satisfied
* 9 – Highly Satisfied
* 10 – Highly Satisfied

What’s this all about?  As a survey taker I’m confused.  The question has a 10 point scale, but why does every numeric point have text (anchors)?  What’s the difference between 1, 2, 3, 4, 5 and 6 that all have the same anchoring text?  Don’t they care about the difference between 3 and 5?  Oh, I get it, this is really a 3 point scale disguised as a 10 point scale.

With these and other variations on the theme of “what were the survey authors thinking?”  on my mind I talked to a representative from the sponsoring company, AOTMP.  I was told that the question design was well-thought out and appropriate, being modeled on the well-known Net Promoter Score.   Well of course it is  – like an apple is based on an orange (both grow on trees).  But not really:

1. The Net Promoter question is for Recommendation, not Satisfaction.  There were a couple of other similar questions in the short survey, but nothing about Recommendation. Frederick Reichheld’s contention is that recommendation is the important measure and also incorporates satisfaction; you won’t recommend unless you are satisfied.
2. The NPS question uses descriptive text only at the end points (Extremely Unlikely to Recommend and Extremely Likely to Recommend).  It is part of the methodology to avoid text anywhere in the middle in order to give the survey taker the maximum flexibility.  That’s consistent with survey best practices.
3. The original NPS scale is from 0 to 10, not 1 to 10.  Maybe that’s a small point, although the 0 to 10 scale does allow for a midpoint, which was part of the NPS philosophy.
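For readers who haven’t computed the score themselves, here is a minimal Python sketch of the usual Net Promoter calculation on that 0 to 10 scale. The cutoffs are an assumption taken from the standard published convention rather than from the survey discussed here: 9 and 10 count as promoters, 0 through 6 as detractors. The function name and ratings are hypothetical.

```python
def net_promoter_score(ratings):
    """Percent promoters minus percent detractors, from -100 to 100.

    ratings: likelihood-to-recommend answers on the 0-10 scale.
    Assumed cutoffs: 9-10 = promoter, 0-6 = detractor, 7-8 = passive.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be on the 0-10 scale")
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical answers from ten respondents.
ratings = [10, 9, 8, 7, 6, 3, 10, 9, 5, 10]
nps = net_promoter_score(ratings)
print(f"NPS: {nps:.0f}")
```

Notice that the calculation only works if respondents could use the whole scale freely, which is one more reason the anchoring in the survey above is a problem.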

Other than the fact that this survey question isn’t NPS, what’s the big deal?  Well, this pseudo 10 point scale really doesn’t work.  The survey taker is likely to be confused about whether there is any difference between “3, Not at all Satisfied” and “4, Not at all Satisfied”. Perhaps the intention was to make it easier for survey takers, but either they’ll take more time worrying about the meaning, or just give an unthinking answer, and the survey administrator has no way of knowing.  Why not just use the 3 point scale instead?  I suppose you could, but then it would be even less like NPS. Personally, I like the longer scale for NPS.  I don’t use NPS on its own very much, but the ability to combine with other satisfaction measures with longer scales (Overall Satisfaction and Likelihood to Reuse) means that I’ve got the option of doing more powerful analysis as well as the simple NPS.  More importantly, I don’t have to try to persuade a client to stop using NPS as long as I include other questions using the same scale.  Ideally, I’d prefer to use a 7 or 5 point scale instead, but 10 or 11 points works fine – as long as only the end-points are anchored. For more on combining Net Promoter with other questions for more powerful analysis, check out “Profiting from customer satisfaction and loyalty research”

There’s no justification for this type of scale in my opinion.  If you disagree, please make a comment or send me a note.   If you want to use a scale with every point textually anchored, use the Likert scale with every point identified (but no numbers). Including both numbers and too many anchors will make the survey takers scratch their heads – not the goal for a good survey.

Perhaps the people who created this survey had read economist J.K. Galbraith’s comment without realizing it was sarcastic: “It is a far, far better thing to have a firm anchor in nonsense than to put out on the troubled seas of thought.”

Online Survey Sample Is Not Clean Enough – Clean it Yourself

This information is useful for people who use panel sample for online surveys, and who want to make sure their survey data is truly clean.

Online Survey Panels Tell Us Their Panelists Are Clean

It’s hard to open a marketing magazine without seeing an ad from an online survey panel company proclaiming how clean and high quality their panel is.  A few years ago, this claim was a big deal – it was the Wild West of online survey panels, and buyers of sample had to be very careful as to who they worked with.  Today, however, most major online survey sample companies have adopted measures to get rid of professional respondents, prevent over-surveying, and make sure that respondents are who they say they are.  So, whether the sample is “true” or “pure”, or there’s “attention to detail”, most reputable panel companies are doing a decent job of giving those of us who field surveys a good product.

But Survey Data is Still Dirty

However, and here’s a big however, the data from most online surveys using panel sample still comes in with some dirty responses.  My research shows that between 1 and 5% of survey data from panel sample is garbage.  Garbage – throw it out; don’t bring it into your final dataset to analyze.  Sure, one can blame some of these dirty responses on frustrated respondents dealing with poor survey writing (bad questions, too long, etc.), but the fact remains that you had better clean that survey data before it goes in for analysis.

So, How Do I Clean the Data?

Here’s a plan you can use to clean your data.

When I say “flag” below, I mean that you create a new variable in your dataset next to the variable you are examining, and you place a “1” in a cell if the respondent’s case is flagged.

  1. Flag speeders. Look at time to completion and flag those respondents who took the survey in an unrealistically short time.  Check the median time to completion and establish rules that you feel comfortable with – I often flag those taking less than 1/3 of the median time with a “1” (“speeder”), and those taking less than 1/4 of the median time with a “2” (“super speeder”).  You might consider removing outliers (at the slow end) before calculating your median.
  2. Flag straightliners. If you have any grid/matrix questions, flag those respondents who gave the same response to every item (unless it makes sense that they could do so).
  3. Flag gibberish or garbage responses. If you have any open-ended responses, look for text such as “asdf” or “…..”; flag these responses, and any other “colorful, yet meaningless” responses you find.
  4. Flag incongruent combinations. If a respondent says their company size is 1000 and the number of PCs in the company is 5, something’s wrong here.  Flag it.
  5. Trap questions. Did you include any questions such as “Please choose the third response below”, or “Please type the word “attention” below”?  If you did, check them, and flag those respondents who didn’t follow the directions.
  6. Sum up your flags. Compute a new variable that sums all the flags.
  7. Sort your dataset by summed variable. Bring cases to the top that have suspicious answers on a number of your checks.
  8. Inspect and delete cases with flags. Delete those cases that are too “dirty” to be included.  Review with key stakeholders to agree on deletions.
  9. Notify your vendor of any bogus respondents. All the vendors I work with do not charge for any respondents I have flagged for deletion.  Show them the IDs of the respondents you threw out, and they’ll take action on their side to warn and/or remove these panelists from their database.
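Steps 1 through 7 above can be sketched in a few lines of Python. This is a minimal illustration, not a finished cleaning tool: the field names (seconds, grid, open_end, employees, pcs, trap_answer), the gibberish word list, and the sample data are all hypothetical, and the thresholds are simply the ones I mentioned above. Steps 8 and 9 stay with you and your stakeholders.

```python
from statistics import median

# Assumed examples of meaningless open-end text; extend for your own study.
GIBBERISH = {"asdf", "qwerty", "xxx"}


def flag_case(row, median_seconds):
    """Return one flag per check for a single respondent (0 = not flagged)."""
    # Step 1: speeders. 2 = under 1/4 of median time, 1 = under 1/3.
    if row["seconds"] < median_seconds / 4:
        speeder = 2
    elif row["seconds"] < median_seconds / 3:
        speeder = 1
    else:
        speeder = 0
    # Step 2: straightliners gave the same answer to every grid item.
    straightliner = int(len(set(row["grid"])) == 1)
    # Step 3: gibberish is a known junk word or text with no letters/digits.
    text = row["open_end"].strip().lower()
    junk = text in GIBBERISH or (text != "" and not any(c.isalnum() for c in text))
    # Step 4: incongruent combination, e.g. 1000 employees but 5 PCs.
    incongruent = int(row["employees"] >= 1000 and row["pcs"] <= 5)
    # Step 5: trap question asked for the third response.
    trap = int(row["trap_answer"] != 3)
    return {"speeder": speeder, "straightliner": straightliner,
            "gibberish": int(junk), "incongruent": incongruent, "trap": trap}


def rank_by_flags(rows):
    """Steps 6 and 7: sum the flags and bring the dirtiest cases to the top."""
    med = median(r["seconds"] for r in rows)
    scored = []
    for r in rows:
        flags = flag_case(r, med)
        scored.append((sum(flags.values()), r["id"], flags))
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored


# Hypothetical dataset of five respondents.
rows = [
    {"id": "r1", "seconds": 600, "grid": [4, 2, 5, 3], "open_end": "Good value",
     "employees": 50, "pcs": 40, "trap_answer": 3},
    {"id": "r2", "seconds": 540, "grid": [3, 3, 4, 2], "open_end": "Support is slow",
     "employees": 1000, "pcs": 5, "trap_answer": 3},
    {"id": "r3", "seconds": 120, "grid": [5, 5, 5, 5], "open_end": "asdf",
     "employees": 10, "pcs": 8, "trap_answer": 1},
    {"id": "r4", "seconds": 660, "grid": [1, 4, 2, 5], "open_end": "Easy to use",
     "employees": 200, "pcs": 150, "trap_answer": 3},
    {"id": "r5", "seconds": 170, "grid": [2, 3, 2, 4], "open_end": "Fine overall",
     "employees": 30, "pcs": 25, "trap_answer": 3},
]
ranked = rank_by_flags(rows)
for total, respondent_id, flags in ranked:
    print(respondent_id, total, flags)
```

Note that a “super speeder” adds 2 to the sum, so the sort leans a little harder on the worst speeders. A straightliner flag still deserves a manual look before deletion, since identical grid answers can sometimes be legitimate.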

Following the steps above will ensure that the data you analyze is as clean as possible.  Yes, it takes a bit of time, but the effort is clearly worth it when compared to making decisions based on the analysis of data that includes bogus responses.

One last note: if you really need your final sample size to hit a specific number, and you can’t go below that number, you can over-sample, in anticipation of throwing out some respondents.

Feel free to contact me for more details about some of the specific techniques I have found useful to clean data, or follow me on Twitter @NicoPeruzziPhD to hear about other marketing research topics.