
Nico Peruzzi

Partner - Outsource Research Consulting

Nico Peruzzi, PhD is a partner with Outsource Research Consulting, a provider of quantitative research and high-end analytics.

nperuzzi [at] orconsulting [dot] com

Nico Peruzzi, PhD is a partner with Outsource Research Consulting (www.orconsulting.com), a provider of quantitative research and high-end analytics. He has provided consultation on all aspects of the research cycle to organizations of wide ranging size and industry. Conjoint analysis, database analytics, data mining, segmentation analysis and predictive modeling are some of his areas of expertise. His clients include Herman Miller, Cisco WebEx, TiVo, Data Robotics, and various smaller businesses, research companies, and consultants small and large. Dr. Peruzzi has a BA in Biological Sciences from UC Santa Barbara, graduating Summa Cum Laude, Phi Beta Kappa, and Golden Key National Honor Society. He holds his MS and Ph.D. in Psychology from Pacific Graduate School (now Palo Alto University).

Reverse Segmentation – The Promised Land for Target Marketers?

September 2nd, 2010 by Nico Peruzzi · Research, essay

This information is useful for people who are tired of segmentation projects that give only unique attitudinal/behavioral segment descriptions or unique demographic/targeting profiles, but not both.

In my last blog post, I gave a general overview of customer segmentation, and I left off with a teaser about reverse segmentation.  Well, here goes.

A Common Criticism of Segmentation

To repeat what I said in my last post, “I got these great-sounding segment names, but they don’t have distinct demographic targeting profiles, so I can’t reach them.”   Traditional attitudinal/behavioral segmentations do a good job of identifying meaningful groups, but targeting the market can be difficult.  Alternatively, demographically-defined segments can be targeted but are often indistinct attitudinally or behaviorally, so they are difficult to “message to”.

Reverse Segmentation Solves These Problems

Reverse segmentation helps the market researcher identify market segments with highly differentiated attitudes and behaviors, while at the same time considering the demographics/firmographics, media usage, or channel usage information that is needed to reach people and deliver a targeted message.

Before You Begin

I’m not going to spend much time discussing data collection for your segmentation analysis, but of course, you want to have attitudinal and/or behavioral data (probably from a survey, and if you’re lucky, tied to actual behavioral data) and targeting variables (demographics, media habits, etc.) that you hopefully already have in your CRM database.  Ask yourself what attitudinal/behavioral variables would help you describe your customers or prospects – what messages resonate with them, what do they currently own/buy, what are their interests, etc.?  Then, ask yourself how you target, or could target, possible customers – what “selects” do you use when doing media buys, are there specific websites you have in mind?  How would you define the variables used to “reach” your potential customers?

How it Works

Find Your Targeting Variables

Reverse segmentation starts by taking each of your demographic/targeting variables and finding out whether the categories (levels) of these targeting variables show differences on your attitudinal/behavioral variables.  For example, say that gender is one of your targeting variables.  Do men and women show differences on any/some/all of your attitudinal/behavioral variables?  If differences exist on a variety of attitudinal/behavioral variables, then save that targeting variable.

Once you’ve identified some targeting variables that show differences, you build a multi-way table that creates cells with every possible combination of differentiating targeting characteristics. For example, you might have cells made up of gender (2 levels) x age (5 levels) x having children in the home or not (2 levels) x frequency of visiting a certain website (4 levels), etc.  Each of these 2 x 5 x 2 x 4 = 80 targeting units forms the basis for further analysis.  Now group individual cases into the targeting unit where they fit.   Each unit represents people with the same targeting characteristics.  For example, one group might be: women, aged 25-34, with children in the home, who very frequently visit a certain website.
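To make the multi-way table concrete, here is a minimal Python sketch. The variable names and level labels are made up for illustration; only the 2 x 5 x 2 x 4 structure comes from the example above.

```python
from itertools import product

# Hypothetical targeting variables and levels (labels are illustrative only).
TARGETING_LEVELS = {
    "gender": ["female", "male"],
    "age": ["18-24", "25-34", "35-44", "45-54", "55+"],
    "children_in_home": ["yes", "no"],
    "site_visits": ["never", "rarely", "often", "very often"],
}

def build_targeting_units(levels):
    """Return every combination of levels, one tuple per targeting unit."""
    return list(product(*levels.values()))

def unit_for(case, levels):
    """Map one respondent (a dict) to the targeting unit it falls into."""
    return tuple(case[name] for name in levels)

units = build_targeting_units(TARGETING_LEVELS)
print(len(units))  # 2 x 5 x 2 x 4 combinations

respondent = {
    "gender": "female",
    "age": "25-34",
    "children_in_home": "yes",
    "site_visits": "very often",
}
print(unit_for(respondent, TARGETING_LEVELS))
```

Each respondent lands in exactly one unit, which is what later makes classification a simple lookup.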

Score Targeting Units on Attitudinal/Behavioral Variables

Next, we find the average score for each of these targeting units on the measures of attitudes and behaviors, across all the cases that fall into the unit.  “Average” is a generalized word here, as it could be the mean or the percentage who give a particular response.
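As a rough sketch of this scoring step, the Python below groups cases by targeting unit and averages each measure. The field names (`price_sensitive`, `owns_product`) are hypothetical. A yes/no item coded 1/0 averages to the share answering yes, which covers the "percentage" case.

```python
from collections import defaultdict
from statistics import mean

def score_units(cases, unit_keys, measures):
    """Average each attitudinal/behavioral measure within each targeting unit."""
    grouped = defaultdict(list)
    for case in cases:
        unit = tuple(case[k] for k in unit_keys)
        grouped[unit].append(case)
    return {
        unit: {m: mean(c[m] for c in members) for m in measures}
        for unit, members in grouped.items()
    }

# Hypothetical survey cases: a 1-5 rating and a 0/1 ownership item.
cases = [
    {"gender": "female", "kids": "yes", "price_sensitive": 4, "owns_product": 1},
    {"gender": "female", "kids": "yes", "price_sensitive": 2, "owns_product": 0},
    {"gender": "male", "kids": "no", "price_sensitive": 5, "owns_product": 1},
]

profiles = score_units(cases, ["gender", "kids"], ["price_sensitive", "owns_product"])
for unit, scores in profiles.items():
    print(unit, scores)
```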

Cluster

Finally, we take these average scores and use clustering techniques to combine targeting units based on similarity in attitudes and behaviors.  The resulting segments have distinctive attitudinal/behavioral profiles, which is necessary for constructing a targeted message, while at the same time having clear-cut demographic attributes, which is necessary for reaching the people with the message.  Remember, the things we just clustered were not individuals but were targeting units made up of groups of people with distinct combinations of your targeting variables.
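The post does not name a specific clustering technique, so the sketch below assumes a simple average-linkage agglomerative method written in plain Python. The profiles are invented, and in practice you would likely standardize the scores first so that no single measure dominates the distances.

```python
from itertools import combinations
from math import dist

def cluster_units(profiles, k):
    """Merge targeting units (not individuals) until k clusters remain.

    profiles: dict mapping targeting unit -> tuple of average scores.
    Uses average linkage on Euclidean distance (an assumed choice).
    """
    clusters = [[unit] for unit in sorted(profiles)]

    def linkage(a, b):
        # Average pairwise distance between the units of two clusters.
        pairs = [(u, v) for u in a for v in b]
        return sum(dist(profiles[u], profiles[v]) for u, v in pairs) / len(pairs)

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # safe because i < j
    return clusters

# Hypothetical unit profiles: (mean rating, share owning the product).
unit_profiles = {
    ("female", "25-34"): (4.5, 0.8),
    ("female", "35-44"): (4.3, 0.7),
    ("male", "25-34"): (2.0, 0.2),
    ("male", "35-44"): (2.3, 0.1),
}

segments = cluster_units(unit_profiles, k=2)
for number, members in enumerate(segments, start=1):
    print(number, members)
```

Because the items being clustered are whole targeting units, every resulting segment is automatically described by a list of targeting combinations.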

Iterate and Describe Your Segments

Look at a variety of clustering solutions, and consider using ensembling methods to come to a best solution.  Look for reasonable cluster sizes and differentiation.  Consider using Discriminant Function Analysis (DISCRIM) to help pull together themes of attitudinal/behavioral variables that differentiate clusters.

Classifying Future or Other Cases – with NO Misclassification

One of the common goals of a segmentation project is to come up with an algorithm that can be used to classify cases into segments based on a limited amount of information (so you don’t have to give a whole survey and do the whole segmentation analysis every time you want to classify people).

Recall that the segments in reverse segmentation are built on the “targeting units”, which are simply multi-way combinations of targeting variables.  Therefore, our classification “algorithm” simply matches a case to its targeting unit, and the unit tells us which segment the case is in.  Using this method, there is NO misclassification into segments.  Let me repeat that point – there is NO misclassification.  If you have a respondent’s data on the variables used to create the targeting units, you can put that respondent into its correct segment.
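In code, this kind of classification can be as small as a dictionary lookup. The sketch below uses a hypothetical unit-to-segment table and made-up segment names. Returning `None` for a unit that was never assigned a segment is my own assumption about how to handle that case.

```python
# Hypothetical mapping produced by the clustering step: unit -> segment name.
UNIT_TO_SEGMENT = {
    ("female", "25-34"): "Value Seekers",
    ("female", "35-44"): "Value Seekers",
    ("male", "25-34"): "Early Adopters",
    ("male", "35-44"): "Early Adopters",
}

def classify(case, unit_keys, unit_to_segment):
    """Look up a case's segment from its targeting variables alone."""
    unit = tuple(case[k] for k in unit_keys)
    # None signals a combination that was not assigned to any segment.
    return unit_to_segment.get(unit)

crm_record = {"customer_id": 1017, "gender": "male", "age": "25-34"}
print(classify(crm_record, ["gender", "age"], UNIT_TO_SEGMENT))
```

No survey answers are needed for the lookup, only the targeting fields.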

Maximizing the Utility of Reverse Segmentation

To maximize action that can be taken from the results of reverse segmentation, it can be helpful to focus on targeting variables that are currently in your database.  This approach allows you to flag people in your database as matching a particular behavioral or attitudinal type (segment).  Just because you can’t get survey data from everyone in your database, doesn’t mean you can’t classify them into a segment.

Don’t Forget the Basics

As with any segmentation output, the results from reverse segmentation should be evaluated as follows:

  • Are the segments meaningful?  Do they make sense?
  • Are the segments large enough to justify targeting them?
  • Are the segments reachable (e.g., via ads or direct sales)?
  • Are the segments uniquely responsive to marketing efforts? (This characteristic is evaluated over time).
  • Is the overall segmentation plan actionable?
  • Note that another quality sought in segments is that they are identifiable – reverse segmentation segments are always identifiable, given the way they are formed

Conclusion

The reality with any newer technique that includes a number of steps and is stats-heavy is that it can be difficult to understand your first (or first few) times.  However, if you can get over the hump with this technique, I think you’ll find a very powerful and useful tool that can greatly improve your ability to reach the right people with the right messages.


Customer Segmentation – An Overview

August 16th, 2010 by Nico Peruzzi · Research, essay

This information is useful for people who are interested in matching the best messages or products possible to those customers or prospects most likely to buy them.

Why Segmentation?

If you make a product, have a service, market or sell, customer segmentation is important to you. The days of one-size-fits-all approaches are long over, and especially in tough economic times, it is essential to be as efficient and effective as possible in getting the best version of your product or service in front of the people most likely to need it and buy it.

How?

There are many ways to break your customers into unique groups. Methods range from simple to complex to implement, and result in a variety of valuable information. All these methods should have the same goal – how can your company better create need for your products or services by targeting the people most likely to buy them with appropriate messaging or offerings.

Basic Demographic Splits

As simple as it gets, cutting your customers by a single demographic variable is done all the time. Think, “the pink one for the baby girls and the blue one for the baby boys”. This method is very common in B2B situations where the cut is company size. How many companies do you know that base sales and marketing programs around “small”, “medium” and “large” business targets? This method is primarily useful because of its intuitive nature, and is so commonly used because it takes nothing more than a customer database or access to Dun & Bradstreet data. Of course, this method has its limitations. Are all your “small” customers really the same? Do they all behave the same way and want the same things? Probably not.

One common way to present these basic splits is in banner tables. A banner table highlights differences between segments on a number of other variables – be they other fields from a database or variables from a survey.

Let’s Take it to Two Levels

If a single split doesn’t cut it (forgive the pun :) , how about first cutting by something like company size and then by, say, industry? Fine enough. Here, you start to isolate more and more unique groups, and by better understanding their uniqueness, you can better reach them with appropriate and effective marketing messages and provide them the product types and features they most need.

And So On…

One could, in theory, continue to cut by more and more variables – company size, then industry, then job function, etc. (or for B2C: gender, then age, then income, etc.). However, the human brain can only handle looking at so many segments at once, let alone find differences, so the output becomes unwieldy. To get this greater level of detail, we need to take the stats up a notch.

Multivariate

Let’s start by making sure we all understand the meaning of this word: multivariate. Multi = many. Now, how about variate? Think “variable”. A variable is something that varies – the opposite of a constant. So, multivariate = many variables. That’s it – we’re looking at a number of variables, all at the same time.

Cluster/Latent Class Analysis

These analyses identify relatively homogeneous groups of cases (let’s say customers) based on selected characteristics (these are the variables). If you can imagine multidimensional space, you can wrap your head around these analyses. Imagine a star in space. Now imagine a cluster of stars – I’m not an astronomer, so work with me here ;-) You can look in the sky and see one group of stars that clump together over in one part of the sky, and another group that clump together in a different part of the sky. These analyses basically put your customers into homogeneous groups based on how “close” the customers are to each other. “Close”, in this case, takes into account how a customer looks on a number of variables (remember, multivariate). You end up with a group of customers that are like each other and are different from another group (cluster). Now you can see why we can only go so far with crosstabs – we just can’t capture the multivariate space without some better analytics.

Note that there are other analytic methods that can be used to segment customers (e.g., CHAID); however, for the purposes of this post, let’s focus on clustering methods.

Exploring Segmentation Solutions

These analyses are what we call “exploratory” techniques – in other words, to paraphrase Forrest Gump, “it’s like a box of chocolates – you never know what you’re going to get”. Here’s where the science and art of statistics merge (yes, don’t laugh, statistics can be artful). Between “many” and “one” cluster lies a potentially useful number of clusters where similarities exist within the clusters and differences exist between clusters. A quick note, beyond the scope of this post: ensembling techniques (in essence, looking at multiple cluster solutions together to find one that is more stable) can help you come to a better solution. Also, here’s where you put on your business hat (or call in the business people if you don’t own that hat) to figure out how actionable the segmentation solution is.

What Makes a Good Segmentation?

I once had a client who felt that his market should have 10 or more segments – without looking at the data. I brought him 4 and 5 segment solutions – based on the data. Who was correct? Here are a few things to keep in mind when choosing a final segmentation solution:

  1. Distinct and Identifiable: Groups have to be different than each other on variables that you can measure now and in the future.
  2. Sizable: Groups have to be large enough that they are worth marketing/selling to
  3. Reachable: Groups can be identified in the market and targeted (note: this is a big issue that is often not achieved – see below)
  4. Stable: Groups need to look tomorrow like they look today (note: customers change over time, and segmentation solutions do get “stale” – when that happens, it’s time to run a new study)
  5. Profitable/Valuable: Groups that are reached act on the messages/products that they receive by purchasing (note: not all groups will fit into this category – you will identify some groups that will likely not buy – that’s good to know, so you don’t reach out to them.)
  6. Relevant: Groups are integrated with your larger marketing plan and make sense in the context of your strategic direction.

Criticisms of Segmentation

“I got these great-sounding segment names, but they don’t have distinct demographic targeting profiles, so I can’t reach them.” Making up cool names based on attitudinal and/or behavioral clustering can be a lot of fun, but if your segments aren’t unique on the variables you use to target them, then that doesn’t help. Note that this is more of a methodological issue than anything else (see below).

“My segmentation report just sits on the shelf”. Sad story – all too common. Sometimes, this outcome can’t be helped. I’ve seen VPs torpedo a good-looking segmentation solution because it didn’t match their preconceived notions. To give yourself the best chance of a solution being used in your organization, make sure you have clear objectives and buy-in from key stakeholders, and involve key people over the life of the project.

But Wait, There’s More…

One of the most exciting developments I’ve seen in segmentation is a technique called Reverse Segmentation. It’s important enough that it deserves its own post, so stay tuned. For the moment, I’ll say that it provides a solution for criticism #1 above.


Two Common Times NOT to Use Conjoint

July 28th, 2010 by Nico Peruzzi · essay

This information is useful for people who want to understand a couple of times when it is not appropriate to use conjoint analysis.

Conjoint analysis is a gold standard technique for measuring feature preference, particularly in relationship to price.  I’m particularly fond of Adaptive Choice-Based Conjoint (but that’s a topic for another post).  There’s a lot of buzz around conjoint as a tool to help product managers choose features that will help their products better compete in the marketplace, so I often get calls from companies thinking it would be a good idea to do a conjoint project.

In this post, I’ll show two common times when it is NOT appropriate to use conjoint analysis.

Definitions:

An “attribute” is something like brand, number of licenses, amount of storage, color, package size, etc.  A “level” is the degree of an attribute.  For example, brand A, B or C; 5, 10, or 20 licenses; 1, 2 or 3 TB of storage; blue, red, or black color; 12 ounce, 18 ounce, or 24 ounce package.

When NOT to Use Conjoint:

1. Your product features are already locked in – you just want to test prices. If your product is fully baked, you don’t want to use conjoint.  Conjoint is all about looking at the inter-relationship between various levels of product attributes and price.

If your product is locked in as a 10 license product with 1 TB of storage and other features set, conjoint is not for you.

So, how can you test price on your fully baked product?

Note: each of these methods deserves its own post, but here’s a taste.

Monadic designs:

  • Break your respondent sample into groups that each see a single price associated with the product and ask their likelihood to purchase.  Plot the probabilities against the prices.
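A monadic price test can be tabulated along these lines. This Python sketch assumes a 1-5 purchase-likelihood scale and treats a rating of 4 or 5 ("top two box") as a stand-in for likely purchase. Both choices are illustrative assumptions, not part of the method as described above.

```python
from collections import defaultdict

def purchase_rate_by_price(responses, top_box=4):
    """Share of each price cell rating their purchase likelihood >= top_box.

    responses: list of (price_shown, likelihood) pairs, one per respondent,
    where each respondent saw exactly one price.
    """
    counts = defaultdict(lambda: [0, 0])  # price -> [likely buyers, respondents]
    for price, likelihood in responses:
        counts[price][0] += likelihood >= top_box
        counts[price][1] += 1
    return {price: hits / n for price, (hits, n) in sorted(counts.items())}

# Hypothetical data: three price cells with four respondents each.
responses = [
    (10, 5), (10, 4), (10, 4), (10, 2),
    (15, 4), (15, 3), (15, 5), (15, 1),
    (20, 1), (20, 2), (20, 4), (20, 2),
]

curve = purchase_rate_by_price(responses)
for price, rate in curve.items():
    print(price, rate)
```

The resulting price-to-rate pairs are the points you would plot.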

The van Westendorp Price Sensitivity Meter:

  • Ask “too inexpensive”, “inexpensive”, “expensive”, and “too expensive” questions.  Plot the data to obtain lower and upper bands and optimal price point.
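The sketch below shows one way the four answers might be turned into a lower bound, an optimal price point, and an upper bound. It finds curve crossings on a grid of candidate prices. Conventions differ on which curves define the bounds, so treat the pairings here as one common choice rather than the definitive one. The data is invented.

```python
def share_at_or_above(thresholds, price):
    """Share of respondents whose stated threshold is at or above this price."""
    return sum(t >= price for t in thresholds) / len(thresholds)

def share_at_or_below(thresholds, price):
    """Share of respondents whose stated threshold is at or below this price."""
    return sum(t <= price for t in thresholds) / len(thresholds)

def first_crossing(prices, falling, rising):
    """First grid price where the rising curve meets or passes the falling one."""
    for price in prices:
        if share_at_or_below(rising, price) >= share_at_or_above(falling, price):
            return price
    return None

def price_sensitivity(answers, prices):
    def col(key):
        return [a[key] for a in answers]
    return {
        # "too inexpensive" (falling) crossing "expensive" (rising)
        "lower_bound": first_crossing(prices, col("too_cheap"), col("expensive")),
        # "too inexpensive" (falling) crossing "too expensive" (rising)
        "optimal": first_crossing(prices, col("too_cheap"), col("too_expensive")),
        # "inexpensive" (falling) crossing "too expensive" (rising)
        "upper_bound": first_crossing(prices, col("cheap"), col("too_expensive")),
    }

# Hypothetical answers: the price each respondent gave for each question.
answers = [
    {"too_cheap": 4, "cheap": 6, "expensive": 8, "too_expensive": 10},
    {"too_cheap": 6, "cheap": 8, "expensive": 10, "too_expensive": 12},
    {"too_cheap": 8, "cheap": 10, "expensive": 12, "too_expensive": 14},
    {"too_cheap": 10, "cheap": 12, "expensive": 14, "too_expensive": 16},
    {"too_cheap": 12, "cheap": 14, "expensive": 16, "too_expensive": 18},
]

result = price_sensitivity(answers, range(4, 19))
print(result)
```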

The Newton-Miller-Smith variant of van Westendorp:

  • Add purchase probability follow-up questions based on the inexpensive and expensive answers.  Build consideration curves.

2. Your attributes don’t vary (don’t have levels) – you’re just testing preference/importance of a number of items. You are not looking at the inter-relationship of various levels of brand, size, quality, durability, package, price, etc.  Instead, you want to understand the importance of, or preference for, a number of features/attributes that each have a single (constant, not varying) level.  Perhaps you want to test the general importance of brand vs size vs quality, etc.  Or, you may want to understand the importance of the specific, fixed features that make up your product (e.g., is having 10 licenses more important than having 1 TB of storage or the other features that make up your product?).

So, how can you test preference/importance of these features?

Note: A full description of MaxDiff can be found on the Outsource Research website.

Maximum Difference Scaling (MaxDiff)

  • Force respondents to make trade-offs between (usually) 4 of your items at a time.  They indicate which item is most and least preferred (important, etc.).  The output yields all the items on a 100-point scale, where you can truly say that a given item is “twice” as preferred as another item with half its value.
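For a feel of how MaxDiff choices become scores, here is a simple counting analysis in Python: best picks minus worst picks, divided by how often each item was shown. This is a simpler stand-in, not the 100-point scale described above, which typically comes from model-based estimation. The items and choices are made up.

```python
from collections import Counter

def best_worst_scores(tasks):
    """Counting scores: (times best - times worst) / times shown, per item.

    tasks: list of (items_shown, best_item, worst_item), one per choice task.
    Scores range from -1 (always worst) to +1 (always best).
    """
    shown, best, worst = Counter(), Counter(), Counter()
    for items, best_item, worst_item in tasks:
        shown.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}

# Hypothetical tasks: four items per screen, one "most" and one "least" pick.
tasks = [
    (("brand", "size", "quality", "price"), "quality", "size"),
    (("brand", "size", "quality", "durability"), "quality", "brand"),
    (("size", "price", "durability", "brand"), "price", "size"),
]

scores = best_worst_scores(tasks)
for item, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(item, round(score, 2))
```

Counting scores are interval-like at best, so the "twice as preferred" reading does not apply to them.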

Note: MaxDiff can be used to help reduce the number of attributes that you carry forward into a conjoint.  For example, if your product has a lot of potential features to test, it would be wise to reduce the number that you bring into conjoint, so that the respondent is not overwhelmed.  MaxDiff can show you the most important attributes, which can then be further explored in the conjoint.

Conclusion:

Conjoint analysis is a powerful technique that can help you configure your feature-price mix to create a product that will be most preferred by your market.  However, if your feature set is already locked and you just want to test prices, or if your attributes don’t have any variation (levels) to them, then conjoint is not for you and you’ll need other techniques to solve your research problem.


Online Survey Sample Is Not Clean Enough – Clean it Yourself

July 8th, 2010 by Nico Peruzzi · essay

This information is useful for people who use panel sample for online surveys, and who want to make sure their survey data is truly clean.

Online Survey Panels Tell Us Their Panelists Are Clean

It’s hard to open a marketing magazine without seeing an ad from an online survey panel company proclaiming how clean and high quality their panel is.  A few years ago, this claim was a big deal – it was the Wild West of online survey panels, and buyers of sample had to be very careful as to who they worked with.  Today, however, most major online survey sample companies have adopted measures to get rid of professional respondents, prevent over-surveying, and make sure that respondents are who they say they are.  So, whether the sample is “true” or “pure”, or there’s “attention to detail”, most reputable panel companies are doing a decent job of giving those of us who field surveys a good product.

But Survey Data is Still Dirty

However, and here’s a big however, the data from most online surveys using panel sample still comes in with some dirty responses.  My research shows that between 1 and 5% of survey data from panel sample is garbage.  Garbage – throw it out; don’t bring it into your final dataset to analyze.  Sure, one can blame some of these dirty responses on frustrated respondents dealing with poor survey writing (bad questions, too long, etc.), but the fact remains that you had better clean that survey data before it goes in for analysis.

So, How Do I Clean the Data?

Here’s a plan you can use to clean your data.

When I say “flag” below, I mean that you create a new variable in your dataset next to the variable you are examining, and you place a “1” in a cell if the respondent’s case is flagged.

  1. Flag speeders. Look at time to completion and flag those respondents who took the survey in an unrealistically short time.  Check the median time to completion and establish rules that you feel comfortable with – I often flag those taking < 1/3 of the median time with a “1” (“speeder”), and those taking < 1/4 of the median time with a “2” (“super speeder”).  You might consider removing outliers (at the slow end) before calculating your median.
  2. Flag straightliners. If you have any grid/matrix questions, flag those respondents who gave the same response to every item (unless it makes sense that they could do so).
  3. Flag gibberish or garbage responses. If you have any open-ended responses, look for text such as “asdf” or “…..”; flag these responses, and any other “colorful, yet meaningless” responses you find.
  4. Flag incongruent combinations. If a respondent says their company size is 1000 and the number of PCs in the company is 5, something’s wrong here.  Flag it.
  5. Trap questions. Did you include any questions such as “Please choose the third response below”, or “Please type the word “attention” below”?  If you did, check them, and flag those respondents who didn’t follow the directions.
  6. Sum up your flags. Compute a new variable that sums all the flags.
  7. Sort your dataset by summed variable. Bring cases to the top that have suspicious answers on a number of your checks.
  8. Inspect and delete cases with flags. Delete those cases that are too “dirty” to be included.  Review with key stakeholders to agree on deletions.
  9. Notify your vendor of any bogus respondents. None of the vendors I work with charge for respondents I have flagged for deletion.  Show them the IDs of the respondents you threw out, and they’ll take action on their side to warn and/or remove these panelists from their database.
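Several of the steps above (speeders, straightliners, trap questions, summing, and sorting) can be sketched in a few lines of Python. The field names (`seconds`, `g1` to `g3`, `trap`) and the sample cases are hypothetical. The 1/3 and 1/4 cutoffs follow the rule of thumb in step 1.

```python
from statistics import median

def flag_cases(cases, grid_keys, trap_key, trap_answer):
    """Add flag variables to each case and return cases sorted by total flags."""
    med = median(c["seconds"] for c in cases)
    for c in cases:
        # Speeders: 2 = under 1/4 of the median time, 1 = under 1/3.
        if c["seconds"] < med / 4:
            c["flag_speed"] = 2
        elif c["seconds"] < med / 3:
            c["flag_speed"] = 1
        else:
            c["flag_speed"] = 0
        # Straightliners: the same answer on every item of the grid.
        c["flag_straightline"] = int(len({c[k] for k in grid_keys}) == 1)
        # Trap question: any answer other than the instructed one.
        c["flag_trap"] = int(c[trap_key] != trap_answer)
        c["flag_total"] = c["flag_speed"] + c["flag_straightline"] + c["flag_trap"]
    # Most suspicious cases first; ties keep their original order.
    return sorted(cases, key=lambda c: c["flag_total"], reverse=True)

# Hypothetical cases: completion time, a three-item grid, and a trap item.
cases = [
    {"id": 1, "seconds": 600, "g1": 4, "g2": 2, "g3": 5, "trap": 3},
    {"id": 2, "seconds": 540, "g1": 3, "g2": 3, "g3": 4, "trap": 3},
    {"id": 3, "seconds": 100, "g1": 3, "g2": 3, "g3": 3, "trap": 1},
    {"id": 4, "seconds": 190, "g1": 1, "g2": 2, "g3": 1, "trap": 3},
    {"id": 5, "seconds": 660, "g1": 5, "g2": 5, "g3": 5, "trap": 3},
]

ranked = flag_cases(cases, ["g1", "g2", "g3"], "trap", 3)
for c in ranked:
    print(c["id"], c["flag_total"])
```

The sort only surfaces candidates. Deciding which cases are too dirty to keep still calls for inspection and stakeholder agreement, as in step 8.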

Following the steps above will ensure that the data you analyze is as clean as possible.  Yes, it takes a bit of time, but the effort is clearly worth it when compared to making decisions based on the analysis of data that includes bogus responses.

One last note: if you really need your final sample size to hit a specific number, and you can’t go below that number, you can over-sample, in anticipation of throwing out some respondents.

Feel free to contact me for more details about some of the specific techniques I have found useful to clean data, or follow me on Twitter @NicoPeruzziPhD to hear about other marketing research topics.
