Sentiment Analysis Symposium – Live Blog

5:13pm

Carol Rozwell of Gartner moderated a panel of experts on innovation in sentiment analysis:

- Leslie Barrett of TheLadders
- Bing Liu of the University of Illinois at Chicago
- Romi Mahajan of Metavana

Barrett predicted that sentiment analysis would move more toward free-flowing emotion detection.

Mahajan said the two major innovations are the organizational and the philosophical. Organizationally, we will make the vast amount of social data available to all in an organization rather than the traditional “cloistered priesthoods” who control information now. Philosophically, we will move away from serial and linear information toward a constant dialectic.

Liu gave an example of how companies are using social sentiment when considering acquisition targets. He also spoke of how his wife and daughter ask him how social data is useful to them – using the example of gathering information for a mattress purchase.

Mahajan said that even if part of what we are discussing comes to fruition it will be a major change.

An audience member stated that we have not discussed today who the people are whose sentiment we are measuring, and that we need to take this information into account.

Barrett said her organization does take this type of segmentation into account. Liu mentioned he has analyzed gender differences. Mahajan said that looking at causation is a tall order; rather, we should be satisfied with strong associations.

Liu mentioned that sentiment analysis can make it easier to process large amounts of information; for example, who wants to read every Amazon.com comment about a product?

An audience member asked about government use of sentiment analysis data. Mahajan mentioned that it would be useful for national security purposes. Another audience member who is a professor at George Washington University in Washington, DC said he is aware that the Defense Intelligence Agency makes use of some type of sentiment analysis.

Barrett said she believes technology will not so much put people out of business as create additive business opportunity. Mahajan added that changes in technology lead to the demise of some types of jobs and the rise of others.

Mahajan said we would be wise to plan for how to deal with “blowback” by opponents of using sentiment analysis.

Barrett said each of us should go back to our organizations and look at our data and figure out at least one thing we will do differently based on what we have learned today.

4:29pm

Ronen Feldman of Hebrew University and Digital Trowel talked about his company’s solution called Visual Care. He stated the benefits include reduced development time and increased accuracy.

4:17pm

Kevin Cocco of SproutLoop talked about the way his company uses crowdsourcing to categorize data from the Twitter and Google API feeds. He compared the crowdsourced predictions to those of the Google Prediction API.

The Google Prediction API is poor for batch predictions and model tuning; it is good for real-time data, scaling and easy integration. The crowdsourcing model works well when humans agree; however, in this experiment they agreed only 44% of the time. Cocco concludes tweet sentiment analysis can be confusing for humans.
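As a rough illustration of an agreement figure like the one above, the sketch below computes the share of tweets on which every human coder chose the same label. The function names, the label strings and the sample data are hypothetical; the talk did not say how agreement was defined, so unanimous agreement is an assumption here.

```python
from collections import Counter

def unanimous_agreement_rate(labels_per_item):
    """Return the fraction of items on which all coders gave the same label."""
    if not labels_per_item:
        raise ValueError("need at least one labeled item")
    unanimous = sum(1 for labels in labels_per_item if len(set(labels)) == 1)
    return unanimous / len(labels_per_item)

def majority_label(labels):
    """Return the most common label, or None when there is a tie for first place."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

# Hypothetical labels from three coders for four tweets (not data from the talk).
sample = [
    ["positive", "positive", "positive"],
    ["positive", "neutral", "negative"],
    ["negative", "negative", "neutral"],
    ["neutral", "neutral", "neutral"],
]
rate = unanimous_agreement_rate(sample)
```

When coders disagree, a majority vote is one common fallback; returning None on a tie makes the confusing tweets easy to set aside for review.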

4:06pm

Zack Kass of CrowdFlower said that sentiment analysis has been reduced to “PNN” (positive, negative, neutral), machines can’t even accurately define PNN, and crowdsourcing provides rich feedback. He presented an analysis of Radian6 data about former U.S. presidential candidate Herman Cain which was 27% accurate according to human audit. His company’s performance of the same task had 94% accuracy, he stated.

First they ask the human coder whether the piece of data is relevant. Then they rate the sentiment polarity relative to the target. Then they code options for finer gradations of meaning.

3:53pm

Max Yankelevich of CrowdControl discussed cognitive surplus and crowdsourcing. He spoke of ideas from the book Cognitive Surplus by Clay Shirky.

We have a lot of cognitive surplus, as evidenced by the following juxtaposed facts:

- 200 billion hours per year spent watching TV by US adults
- 100 million hours to create Wikipedia

Crowdsourcing can be thought of as “crowd computing.”

Challenges with getting things done with cognitive surplus:

- complex tasks
- accuracy rate
- lack of attention
- lack of commitment

Yankelevich advocates combining artificial intelligence (for the computer component) and crowdsourcing (for the human element).

3:19pm

Seth McGuire of Gnip said it’s not just about Twitter. Gnip aggregates and sells social media data across multiple platforms.

McGuire asked what is the right combination of data, that is, what is the right social cocktail?

There were two key dimensions they discovered:

- Reaction time – Twitter, Facebook and Google+ were faster, while WordPress, DISQUS and IntenseDebate were slower.

- Depth – Deeper platforms were YouTube, Tumblr and Flickr; more concise networks were Twitter, Facebook and Google+

Public relations and supply chain professionals need faster, more concise data. Product development and brand management professionals need deeper, not necessarily faster, data.

3:13pm

Jeff Catlin of Lexalytics gave examples of tweets his company trained its sentiment engine to “solve” which are easier for humans to understand:

- Citigroup allows leniency for victims of foreclosure
- I loveeeeeee my evo
- I have an iPhone, but I am not really feeling very happy about my iPhone
- In my opinion right now, Apple is making money on a smart marketing strategy

Here are some they have not been able to solve:

- It was awesome – for the week that it worked.
- i thought i saw a previous for that on mtv movie awards which was a joke
- I don’t get why they call it the droid incredible
- That backflip was so sick

2:50pm

Frank Cotignola of Kraft Foods said sentiment analysis is not at all ingrained in market research. He asked whether we are truly listening. It is a cultural shift in how we interact with consumers. This is a difficult change.

A big mistake some make is to listen only to what is being said about brands. However, people often do not talk about brands. The better approach is to listen to what consumers are saying, then see where brands fit into the conversation.

Common objections to sentiment analysis include:

- not representative
- missing demographics
- not my consumers
- too much to read
- no time
- not what I’m used to

A way to convince people of the utility of sentiment analysis is to give examples.

One example is the ability to answer questions about the economy. Typically we look at traditional economic measures. What if we used social media to assess the economy? What is the online sentiment about things like gas prices, unemployment and food prices? You can also look at search data – for example, searches on the term “unemployment” track the unemployment rate (presumably people are looking for information about benefits).

2:43pm

Sobhan Hota of Fidelity Investments discussed how his company uses sentiment analysis of their Voice of the Customer data for direct customer outreach, identifying influential customers, and customer retention. They do coding and analysis of data that identifies the top positive and the top negative words and phrases.

2:33pm

Ryan Sager of the Wall Street Journal discussed their ongoing series of sentiment analysis data presented in their weekend newspaper under the title “Sentiment Tracker: A Computational Analysis of the Conversation on Social Networks.” He gave an example of an infographic they published analyzing the reaction on Twitter and Facebook to Tim Tebow becoming a member of the New York Jets football team.

2:24pm

David Nadeau of Media Miser discussed cross-lingual media sentiment analysis. Possible solutions to the cross-lingual challenge are: creating a system for each language or applying the same system after machine translation.

They did an experiment comparing:

- English sentiment analysis on French texts
- French sentiment analysis on French texts
- English sentiment analysis on machine translated French texts

The machine translation approach worked best. Further analysis showed that combining approaches worked best.

2:19pm

Michael Tupanjanin said his company Metavana has come up with a scientific breakthrough that will wipe the slate clean. It is a break from Natural Language Processing. They apply the principles of Chaos Theory to sentiment analysis. He said their algorithm has very high accuracy and is automated.

In the past sentiment analysis has been labor intensive, with low accuracy rates and heavy in professional services. There is now an historic business opportunity because of the explosion of the social web.

2:12pm

Andera Gadeib of Dialego AG talked about her company’s process of online ideation followed by classification. Gadeib starts with the divergent – creating ideas, followed by the convergent, adding layers of complexity to the analysis.

They created an ontology of areas addressed by sentiment analysis:

- emotional
- advertising
- person
- product
- action
- location
- brand
- time
- functional

The process of divergence includes concept testing, co-creation and crowdsourcing.

Measuring emotions is important for communications, product development and more. Gadeib gave a case study for a vacuum cleaner product; they found more emotion in this space than they expected. Blogs yielded more positive emotion and engagement; Twitter had more general content and skewed more negative.

They look at sentiment focus over time in a graphic they call the “long tail.”

12:05pm

Srini Bharadwaj of RAGE Frameworks talked about his company’s provision of its “Real Time Intelligence” product to enable a major financial institution to monitor borrowers globally. He also gave a case study of the use of the same product by a pharmaceutical company to monitor drug safety and competitive activity.

11:55am

Catherine Van Zuylen of Attensity said the growth of social media has led to renewed interest in sentiment analysis. Sentiment analysis used to be simpler, but what is meant by the term has changed.

Relative sentiment: “I bought an iPhone” is positive for Apple but negative for Apple’s competitors.

Also, negative sentiment is not always bad; Sarah Palin’s sentiment ratings were very negative when she hosted the Today Show; however, the TV ratings for that show were very high.

Compound sentiment example using a tweet about the TV show Mad Men: “I love the show but hate the misleading episode trailers.”

Another trend is ambiguity and new uses for negative words. For example: “hate” is positive when used in the phrase “I hate to see her cry.”

It is also important to take emoticons into account; and emoticons vary culturally.

It is important for your team to be on the same page with respect to the definition of sentiment analysis and its specific operationalization.

11:44am

Banafsheh Ghassemi of the American Red Cross talked about her organization’s brand – which she described as one of the most recognized in the world. Its brand value is twice that of most American non-profits. They are also leaders in mobile text donations. They are also strong in social media. They have a partnership with Dell and Radian6 to track reaction to large-scale disasters in real time. They also have a growing mobile app presence with a focus on first aid and disaster response.

The number of charities has increased by 60% in the past decade. The Red Cross is a strong brand, but it still needs to win hearts, minds and dollars. Traditional advertising such as television spots is less effective than it used to be, and in this domain, the opinions of friends and relatives carry more weight. Influencers have big megaphones via social media, and they have the multiplier effect on their side. Ghassemi mentioned the recent Susan G. Komen controversy as a negative example of this multiplier effect on a charitable organization.

The American Red Cross cares about the experience people have at touch points, change in sentiment, and executive visibility to systemic issues and investment prioritization.

It’s not just Twitter and Facebook. Yelp, for example, has user feedback on blood-giving touchpoints.

Advantages of analyzing social data are:

- real-time feedback
- it is a leading indicator
- competitive intelligence
- best practices

Opportunities with social data include:

- outreach (particularly youth and minorities)
- new policies
- product ideation
- process ideation

The death of the survey is overrated. Surveys give the American Red Cross lots of detailed feedback.

Beware of channel bias – different data sources tend to yield different flavors of data.

Be segment-appropriate. “Red Bull is not Red Cross.”

Beware of “Google Translate Syndrome” – sentiment platforms can lead to machine-applied incorrect information. In a recent disaster response, only 26% of positive comments were coded as positive in a sentiment platform (as compared to live coders).
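A figure like the 26% above is the machine's recall on the positive class, with the live coders' labels treated as the reference. The sketch below shows one way to compute it; the function name and the sample labels are hypothetical and are not Red Cross data.

```python
def positive_recall(human_labels, machine_labels):
    """Share of human-coded positive comments that the machine also coded positive."""
    if len(human_labels) != len(machine_labels):
        raise ValueError("label lists must be the same length")
    # Keep only the machine's labels for comments the human coders called positive.
    machine_on_positives = [
        machine for human, machine in zip(human_labels, machine_labels)
        if human == "positive"
    ]
    if not machine_on_positives:
        raise ValueError("no human-coded positive comments to evaluate")
    hits = sum(1 for machine in machine_on_positives if machine == "positive")
    return hits / len(machine_on_positives)

# Hypothetical labels for six comments (illustration only).
human = ["positive", "positive", "positive", "positive", "negative", "neutral"]
machine = ["positive", "neutral", "negative", "neutral", "negative", "neutral"]
recall = positive_recall(human, machine)
```

Here the machine found one of the four comments that human coders rated positive, so the recall is 0.25.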

Take a balanced approach, and do not lose sight of your traditional channels as you explore new ones.

11:15am

Chris Frank of American Express and Paul Magnone of Openet Telecom are the authors of “Drinking from the Fire Hose.” They discussed their approach to online sentiment.

They apply the concept of the election “swing voter” to that of sentiment. Who are the people with neutral sentiment, who have the opportunity to move either in a positive or a negative direction? Which are the neutrals that lean in either a positive or a negative direction?

Frank and Magnone outlined a taxonomy of increasing involvement with a brand online. The steps, in order, are:

- Like it
- Know it
- Buy it
- Advocacy

Influence = power x platform

Power is derived in three ways:

- positional
- expert
- informational

They showed an “influence map” with platform (relevance, reach and amplitude) on the y axis and power (positional, expert, informational) on the x axis.
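The formula above can be turned into a small worked example. This is only a sketch: the speakers did not say how the components are scored or combined, so the 0-to-1 scale and the plain averaging of each axis are assumptions, as are all the names in the code.

```python
from dataclasses import dataclass

@dataclass
class Power:
    positional: float
    expert: float
    informational: float

@dataclass
class Platform:
    relevance: float
    reach: float
    amplitude: float

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    values = list(values)
    return sum(values) / len(values)

def influence(power, platform):
    """Influence = power x platform, with each axis scored from 0 to 1."""
    # Assumption: each axis is the plain average of its three components.
    power_score = mean([power.positional, power.expert, power.informational])
    platform_score = mean([platform.relevance, platform.reach, platform.amplitude])
    return power_score * platform_score

# Hypothetical person: strong expertise, moderate reach.
blogger = influence(Power(0.5, 1.0, 0.0), Platform(1.0, 0.5, 0.0))
```

Because the two axes are multiplied, a person with high power but no platform (or the reverse) scores zero, which matches the idea of plotting both on a map.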

10:17am

Richard Brown of Thomson Reuters discussed his company’s provision of “news analytics” to financial markets in order to predict equity movements.

They provide 82 fields of data on all manner of financial news items, including:

- time stamp
- company identifier
- attribute
- type
- genre
- headline
- relevance
- sentiment
- degree of positive, negative or neutral
- first mention of the company
- topic codes

They are also coming out with something called Market Response Indicators which apply machine learning to determine which of the 82 fields are the most important at the stock, market and sector levels.

Thomson Reuters is now plugging in social media data. Markets are now more automated, and traditional analysts are doing what quants used to do.

You have to have big data in your business plan – forget it if you don’t.

Thomson Reuters is planning to take the problem of big data and turn it into an opportunity. They compare signals from internet news and social media output to signals generated from premium news (Reuters).

9:52am

Carol Haney of Toluna described text analysis as looking for the right needles in the haystack. She also noted that much of the data is negative in nature.

There is quite a lot of noise when selecting the data to analyze. It is important to gain an understanding of whether particular information is applicable.

Planning up front is important when embarking on an analysis. The steps are:

- plan your analysis
- harvest the data
- structure and understand the data
- validate the data with a quantitative survey

Haney noted it is important to weight to census rep and use a quality panel. Also, where to scrape depends on where you are in the world.

She presented a case study about Victoria’s Secret’s Dream Angels and Pink brands. Data were harvested from Facebook, Twitter and blogs. Clustering was used to identify and remove promotions. Then a classification scheme of brand and style was created.

Data were very domain specific. For example, the word “ass” is less negative in this context because the product is underwear.

Haney also looked only at stronger sentiment.

Issues identified from the analysis were thus:

- 2% said stores are not carrying the right size in swimwear
- 7% said Victoria’s Secret isn’t addressing the needs of women outside the 18-24 age group
- 1% said Victoria’s Secret merchandise is ugly

Haney then validated the comments about carrying the right size by conducting a survey of VS customers. Seventeen percent agreed about the need to carry bigger sizes in store.

9:22am

Professor Jan Wiebe of the University of Pittsburgh described the process of “supervised machine learning” as part of Natural Language Processing.

In this process there is a set of training data which is analyzed to create a learning algorithm. That algorithm is then applied to a set of data for which predictions are made for labeling the text sentiment.
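The train-then-predict process described above can be sketched with a tiny naive Bayes classifier. Naive Bayes is used here only because it is short and well known; Wiebe did not name a specific algorithm, and the training sentences and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(labeled_texts):
    """Count words per label from (text, label) training pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in labeled_texts:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary

def predict(model, text):
    """Return the label with the highest log-probability under naive Bayes."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical hand-labeled training data.
training_data = [
    ("i love this phone", "positive"),
    ("great battery and great screen", "positive"),
    ("i hate this phone", "negative"),
    ("terrible battery and awful screen", "negative"),
]
model = train(training_data)
prediction = predict(model, "love the great screen")
```

The labeled sentences are the costly part: every (text, label) pair has to be produced by a person before the model can learn from it.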

A disadvantage of supervised machine learning is that creating training data is expensive and time consuming.

Further, the meanings of words are domain dependent. Performance of machine learning suffers when training and test data come from different domains. Cross-domain sentiment analysis methods can help increase accuracy when analyzing data across domains.

Wiebe also discussed “sense level processing.” Senses are different meanings of words depending on context. Many words have multiple senses – for example, “interest,” “alarm” and “trust.” Further, some senses of words are opinion-bearing while others are non-opinion-bearing. When analyzing sentiment, non-opinion-bearing senses are false hits. Data show that simply analyzing opinion polarity rather than each specific sense of a word can lead to higher accuracy.

Wiebe also described data acquisition, including the use of data annotators through the Amazon Mechanical Turk (AMT) service of Amazon.com. Data show that expert annotators perform better over time than those contracted through AMT.

“Active learning” is a process that can be used to reduce the amount of training data needed to train reliable systems. The most informative, least redundant data are analyzed first, then less efficient data are analyzed, and the analyst iterates through more data until satisfied.
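One common way to pick the "most informative" items is uncertainty sampling: ask humans to label the texts the current model is least sure about. The sketch below illustrates that idea only; Wiebe did not say which selection strategy was used, and the word scores, function names and sample texts are hypothetical.

```python
def uncertainty(probability_positive):
    """Score how unsure a binary classifier is: 1.0 at p=0.5, 0.0 at p=0 or p=1."""
    return 1.0 - abs(probability_positive - 0.5) * 2.0

def select_for_labeling(pool, predict_proba, batch_size):
    """Pick the unlabeled texts the current model is least sure about."""
    ranked = sorted(pool, key=lambda text: uncertainty(predict_proba(text)), reverse=True)
    return ranked[:batch_size]

# Hypothetical stand-in for a trained model: a tiny word-score lookup.
WORD_SCORES = {"awesome": 0.9, "love": 0.9, "hate": 0.1, "broken": 0.1}

def toy_predict_proba(text):
    """Average the known word scores; text with no known words gets 0.5."""
    scores = [WORD_SCORES[w] for w in text.lower().split() if w in WORD_SCORES]
    return sum(scores) / len(scores) if scores else 0.5

pool = [
    "love it",
    "it was awesome for the week before it was broken",
    "hate it",
    "that backflip was so sick",
]
to_label = select_for_labeling(pool, toy_predict_proba, batch_size=2)
```

In a full loop the newly labeled items would be added to the training set, the model retrained, and the selection repeated until the analyst is satisfied.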

 

Follow the Sentiment Analysis Symposium

Research Access will be providing live coverage of the Sentiment Analysis Symposium this Tuesday, May 8th.

Check out Research Access on Tuesday for live updates.

Better yet, join me at the Symposium at Lighthouse International in Manhattan.  You can still register for $100 off using the code FOAF.

Here are some of the sessions I’m most looking forward to:

  • “Tween Pants Cut Too Low!! (or, Combine Survey Research & Social Monitoring to Discover the Unknown)” by Carol Haney, Toluna
  • “Emotional Versus Rational in Customer Decision Making” by Chris Frank, American Express, and Paul Magnone, Openet Telecom
  • “Real Time Intelligence Solutions,” by Srini Bharadwaj, RAGE Frameworks
  • “Political Sentiment Analysis,” by Dr. Stuart W. Shulman, Texifter
  • “Market Research Beyond Sentiment: Differentiating the Engaged and Pleased,” by Andera Gadeib, CEO, Dialego AG
  • “Sentiment As A Service,” by Michael Tupanjanin, Metavana
  • “Capturing Sentiment via Customer Intelligence,” by Sobhan Hota, Fidelity Investments
  • “’How Can I Listen If I’m Talking?’: The Power Of Social Media Listening,” by Frank Cotignola, Kraft Foods

I hope to see you in New York or online on Tuesday!

A Discussion of Text Analytics with Michael Tupanjanin

What follows is the next in a series of interviews I conducted at the Net Promoter Conference in San Francisco last month.  If you missed my video interview with Dr. Ming Duong-van, you’re going to want to click over for a listen to his fascinating interview.  Still to come is an interview with Satmetrix CEO Richard Owen.  This interview is with Michael Tupanjanin, the CEO of Metavana.  The interview was conducted in the morning on the day Metavana and Satmetrix announced a partnership to create a social Net Promoter Score called the SparkScore.

Dana Stanley: We’re here at the Net Promoter Conference at San Francisco with Michael Tupanjanin, CEO of Metavana, as well as the company’s CMO, Romi Mahajan.

Michael, why don’t you go ahead and tell people who aren’t familiar with Metavana a little bit about your company.

Michael Tupanjanin: Sure, so Metavana was started about three and a half years ago by a guy named Ming Duong-van.  Dr. Ming is very well known in the academic circles primarily as a physicist. He was actually the co-founder of chaos theory. And he’s spent a lot of time studying the text analytics market and has, I think, done some incredible breakthroughs, scientific breakthroughs, specifically the algorithms that he’s written for Metavana that really take a look at text, specifically in the social web, and really uncover the true meaning and opinions that people have on the social web.

Dana Stanley: So when you’re talking about the social web and text, give me a practical sense of what type of data your software’s analyzing.

Michael Tupanjanin:  Well, I think just about every piece of text as far as I know is unstructured on the social web, which can be incredibly chaotic. So if you think about the correlation of people that have studied chaos theory and the clusters of galaxies, you’re actually able to apply that scientific principle to the social web, where the conversations are unstructured, the sentence and the grammatical structures are completely wacky, and the content itself is very unstructured. Being able to actually get meaning out of the second structures is a very difficult thing to do.

Dana Stanley:  What are some examples of how folks are using the Metavana technology to gain insights?

Michael Tupanjanin: We have a couple customers, like Marriott, they have a customer service group that spends a lot of time looking at the social web analyzing things like the basic things, what was your stay like at our hotel? Were the beds OK? Were the towels OK? Was the room service OK? And they’re always analyzing those pieces of information to see how they could improve their service.

We have another company that’s using our technology for smartphones. So right now, the smartphone market is incredibly competitive. We have a clear leader in iPhone, and they’re trying to figure out what their competitive advantage is. What kind of things can they put into their product to make them better? They’re also looking at customer service issues.

Dana Stanley: What do you say to people who throw out the idea that not all sentiment is on the web, that the people who participate in the web, that’s just a segment of all that sentiment that people need to pay attention to.

Michael Tupanjanin: That’s a good question. I’m a neophyte in market research. But here’s my impression. Market research is actually somewhat limited in terms of the sample size, right? You send out a survey to a bunch of people, but the sample size of the social web’s a lot larger than the sample size that you send out to people through your surveys themselves. And I think there’s also a predisposition amongst people that actually are willing to fill out a survey, as opposed to people that are just expressing their opinions on the web where it’s a little less stilted, and you actually probably get more meaningful information back.

Romi Mahajan: Dana, can I just pop in on that?

Dana Stanley: Absolutely.

Romi Mahajan: I think it’s a very prescient question about how big, how complete is your set, right? And clearly, the social web is not everything, but there are 845 million people on Facebook. There are 250 million, bordering on now 270 million tweets a day. And each of these expresses something. Now, not all of them express sentiment, but a lot do. I think where normal, canonical market research needs to grow and evolve is in the notion of active data collection versus passive data collection, where what people are expressing on the social web is– they’re expressing it while in the context, their natural context.

They’re not being prompted. And so you get a different set of data, right? You get maybe a more natural set, a more authentic set, but a different set. In reality, when you put these two sets together, you get the truth. But the fact that structured data is easier to come by and unstructured data is harder to decipher, that’s what gives a company like Metavana room to maneuver.

Dana Stanley: Where do you think companies are in terms of their approach to this? Are companies diving into sentiment analysis? Are they wary? How would you assess that?

Michael Tupanjanin: I think that the market, in general, is incredibly interested. And I’ll take it to a higher level called text analytics as opposed to sentiment analysis.

Dana Stanley: Sure.

Michael Tupanjanin: I think the market’s incredibly confused. I think the market’s incredibly chaotic right now. There are lots of solutions that are available in the market. And I think a lot of the solutions are incredibly complex to actually do implementations to. So traditionally, a lot of those sentiment analysis or text analytics seem to reside with the knowledge management people inside major companies. And I think there’s a huge opportunity to actually now take it out to the masses, to the functional leaders, the sales leaders, the marketing leaders, the product management leaders, the research leaders, where they really haven’t had access to this kind of technology before.

I think there’s a lot of latent demand for it, but there’s also a confusion because I think so many different companies are approaching it in so many different ways. And I think traditionally the accuracy levels have not been that great. So I think there’s a little bit of skepticism, too.

Dana Stanley: So help me understand Metavana’s unique approach.

Michael Tupanjanin: So without getting into a long, scientific explanation – what it all comes down to is the algorithms that you write and how accurate they are and the principles that you apply. Traditionally, there’s been two approaches to what we’ll call text analytics. There’s been the natural language processing approach and then the more machine-learning approach.

The natural language processing approach tends to be a very highly curated approach, like almost a lot of human intervention actually looking at grammatical structures and trying to develop taxonomies to be able to pull out the meaning, versus the statistical approach, which is much more automated and based specifically on algorithms themselves. Traditionally, people have felt that the statistical approach is less accurate, that the natural language processing approach is more accurate.

However, the natural language processing approach tends to be not scalable because you have to spend a lot of time going through taxonomies versus having a more statistical approach, which is much more scalable. We tend to be more towards the statistical end, but the algorithms that we have written have taken accuracy to a whole new level, up to over 95%.

Romi Mahajan: Dana, it’s a great question. I think Michael answered it correctly on the scientific side. When we think about our business in general, right, we think about three core principles around why we think we’re unique. One is clearly accuracy, right? So whereas the industry is offering scarcely better than a coin toss accuracy, we’re offering one standard deviation away from perfect, so 95%, 96%. The second is what we call accessibility. We don’t believe that customer satisfaction understanding the social web should be sequestered or siloed someplace in the CSAT division of a company. It’s really for everyone.

So we’re building a system that allows anyone in the corporation to be able to take– to interpret the social web. Accessibility is the next thing, true enterprise scale. And the third thing is scalability. We believe that our business model is going to offer the ability for anyone, regardless of price point, regardless of degree to which they believe in the social web or not, to access the social web. So those three principles we think make us unique.

Dana Stanley: That’s great. One thing that stood out, you mentioned the accuracy level. I’m just curious, how do you measure accuracy, or how do you self-evaluate as your algorithms presumably evolve?

Michael Tupanjanin: Yeah, we actually have to do it the old fashioned way. We literally will take– we recently did about 3,000 quotes that we actually rated, and we sat down with a bunch of high school kids and actually had them go through sentence by sentence by sentence and see, how would you score this sentence? And how did the machine score the sentence?

Dana Stanley: So you’re basically giving them homework?

Michael Tupanjanin: Absolutely.  There is no other way to do it because you can either do it some kind of automated way, which again, people question whether or not that’s the right way to do it.

Romi Mahajan: The thing is, once you go through the high school exercise, then the system learns on its own. But you have to go through the initial validation period to make sure that if someone leaves Starbucks and says, man, that Americana was awesome, that somebody’s verifying that that’s a positive comment.

Dana Stanley: Yeah, and how do you account for evolving language, and Urban Dictionary entries, and the fluid nature of language?

Michael Tupanjanin: Yeah, so the way the process is set up, we actually– one of our unique things is that we actually do things on a domain by domain basis. So we, for example, we’ll start with smartphones as a category. We’ll start with printers as a category, hotels, or airlines. And each of those domains has their own specific language in them. And one of the things that we do is the engine goes out and actually crawls and trains itself on the language of that particular domain. So that’s one of the reasons that we get such high accuracy rates.

But the reality, as you said, is that language continues to evolve. And new words of slang appear all the time. So we found that we have to at least have the engine retrain itself every quarter. And it’s not a manual process. It’s literally simply going out and crawling the same data sources and doing almost like a QA process on the data sources for about a week, and then it’s updated itself on the slang. What it also does is it updates itself on categories. So what the engine does when it goes out and crawls, versus having a taxonomy that’s kind of predetermined, it actually will develop its own taxonomy based on organically what seems to be the right category.

So, for example, we crawled the airline industry, and lo and behold, the categories that came up were seating, crew, entertainment, waiting lines at the airport, baggage handling, all the things you would suspect. But at some point, there could be other categories that emerge.  For example, security, gate security, and stuff like that seems to be starting to percolate on the social web could become a category, too. So that’s part of the engine’s updating process.

Dana Stanley:  Do you sometimes get into arcane industries where maybe the client would have particular language that you’re incorporating as you go along?

Michael Tupanjanin: Some industries are more difficult than others. We’ve actually looked at, for example, one of our customers is a coffee machine manufacturer. And that’s a fairly simple, straightforward thing versus pharmaceuticals, where you start to get into some pretty arcane language around drugs and therapies, and that’s a lot more difficult. So I don’t know if we have all the answers for you. We’re looking at– pharmaceuticals, I think, will be a little bit of a tougher industry for us.

Dana Stanley: Interesting. And is it just English at this point?

Michael Tupanjanin: English, yes. We’ve done, now, tests in both Chinese and French. And interestingly enough, it’s taken about a day.

Dana Stanley: Wow.

Michael Tupanjanin: Yeah.

Dana Stanley: It took me longer than that to learn French.

Michael Tupanjanin: Well, what’s interesting about the technology, it’s not based on grammatical structure. It just needs to have a translation of all the words themselves, and then it can go out and train itself. So again, it’s a little bit different approach.

Dana Stanley: Interesting. So I have to ask, we’re here at the Net Promoter Conference, and by the time this interview is out, your release will have hit the wires. So tell me about this exciting initiative that you have going with Satmetrix.

Michael Tupanjanin: Well, from our perspective, it’s amazing on a couple of different levels. First, Satmetrix is clearly the leader in Net Promoter. They wrote the book on it. And they have established a very clear set of activities and workflows for people to actually improve their Net Promoter Scores. So they are the methodological geniuses and also the workflow geniuses for helping companies improve their Net Promoter Score. And they’ve tied that directly to revenues, which is also a really, really good thing.

I think, from our perspective, being able to provide people a Net Promoter Score like a stock ticker, real-time, is huge. The old model has been you get your survey results back. You work on them and see how you improve over the next quarter. Now, you have an opportunity to actually see how you’re improving every 10 minutes if you need to, which is a huge breakthrough. And this is not an easy thing to do or replicate. From our perspective as a text analytics company, the fact that we have such high accuracy rates and the fact that our machine is flexible enough to actually take somebody else’s methodology and apply that to the social web is huge. There are very few people who can actually do that.

So from our perspective, it’s great. It also makes the information a lot more actionable. One of the things that I think the industry suffers from is that people sit there and say, yes, this sentence is positive. This is negative. Baggage handling was poor in this airport. What are we going to do? Who’s going to get that information, and what are they going to do with it? Being able to tie that to some kind of a standardized score for a company, I think, is a really big deal.

Romi Mahajan: So Dana, in about 45 minutes from this interview, but of course before this interview is published, there’ll be a piece of press on the wire around what we christened the SparkScore, which is a social NPS gauge. And it’s taking the notion of NPS, which is an industry-proven powerful methodology for loyalty and profit driving and completing the picture. The panorama is now complete. It used to be about structured, episodic, survey-based loyalty. And now it’s about the constant here-and-now social web loyalty. So we believe it’s a huge breakthrough for the industry, and Metavana’s very happy to power the SparkScore with, of course, Satmetrix, being the methodology and software provider.

Dana Stanley: So if I’m a customer who’s accustomed to using a Net Promoter Score, what will change for me?

Romi Mahajan: So I think your world gets better, slightly more complex but better, because we’re not saying don’t do normal Net Promoter. There’s a certain value in getting episodic structured data, longitudinally and otherwise. There’s also a certain value in understanding what’s being said anyway, unprompted, every day, 24/7, 365 worldwide. And so when you munge the two, you actually look at your business 360 degrees, as opposed to just seeing one fraction of not only the expression but also the ways in which customers express how they feel.

Dana Stanley:  That’s great, very exciting. So for the traditional, for lack of a better word, market research community, what should they take from this announcement?

Romi Mahajan: Let me break it into two categories. One’s smaller, and one’s bigger. So for the market research people who are familiar with, espouse, or follow NPS, clearly this is going to be a breakthrough, because it’s taking a very proven, powerful methodology and making it 21st century. It’s NPS 2.0. So for the NPS followers, it’s huge.

For the non-NPS followers, we’re all familiar enough with market research to know that it’s grappling with the abundance of data and the abundance of content and the burgeoning importance of the social web. And this allows them to start getting data and data feeds from the social web to use in anything, predictive analytics, reports, analysis of any sort. And so we believe that market research is an incredibly important part of the organization and of the industry.

But we also believe that it’s extremely limited by the technology. And now, we’re opening new business for them. So it’s about reinventing the industry and reinventing ourselves as market researchers.

Dana Stanley: Great. And if people want to learn more about the SparkScore, what should they do?

There’s a couple different things they can do if they’d like to learn more about the SparkScore. One is they can go to metavana.com. The second is to go to satmetrix.com. Those are the best places to learn about the SparkScore. We will very shortly have a website called spark-score.com, so not yet, in which people can play around with this and enter stuff in, and see what their score is.

Michael Tupanjanin: It’s interesting because it almost becomes, in a way, like the Klout score for companies, right? So we’re actually going to be posting a website that actually lists out, front and center, what people’s SparkScore is.

Dana Stanley: Interesting.

Michael Tupanjanin: So anybody has access to it, whether it’s the companies themselves, customers, they’ll be able to go in, look at their SparkScore. We’re starting by rolling out five industries right now. But we think it’ll actually be very much like a corporate Klout score.

Romi Mahajan: Dana, under your tutelage, one day we hope that Research Access has an sNPS ticker running across it, so every company can come up and say, how are we doing?

Dana Stanley: So almost like a stock ticker concept?

Michael Tupanjanin: It is absolutely a stock ticker concept.

Dana Stanley: Very cool. Well, Michael, Romi, thank you for your time today.

Michael Tupanjanin: Appreciate it.

Romi Mahajan: Dana, our pleasure.

Satmetrix Reveals Social Net Promoter Score

The big news from the opening day of the Net Promoter Conference yesterday was Satmetrix’ partnership with Metavana to offer a new “social” Net Promoter Score called the SparkScore.

Net Promoter is an approach to customer satisfaction measurement.  The new SparkScore is Satmetrix’ approach to the growing field of text analytics and sentiment analysis.
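For readers new to the metric, here is a minimal sketch of how a conventional, survey-based Net Promoter Score is usually calculated: respondents rate their likelihood to recommend on a 0 to 10 scale, and the score is the percentage of promoters (9 or 10) minus the percentage of detractors (0 to 6). The function name `net_promoter_score` is my own for illustration. This is not how the SparkScore itself is computed, since that method is not described here.

```python
def net_promoter_score(ratings):
    """Return the survey-based NPS (from -100 to 100) for a list of 0-10 ratings.

    Illustrative only: this is the standard survey calculation, not the
    SparkScore method, which is not described in this post.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 through 6
    # Passives (7 or 8) count toward the total but toward neither group.
    return 100.0 * (promoters - detractors) / len(ratings)


if __name__ == "__main__":
    # Three promoters and one detractor out of four respondents.
    print(net_promoter_score([10, 10, 9, 3]))
```

Passives lower the score only indirectly, by increasing the number of respondents in the denominator.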

The announcement seems to have gotten significant attention so far, including a nice writeup from Mashable, which likened the SparkScore to a “Klout score for brands.”

Coming soon on Research Access will be my interview with Satmetrix CEO Richard Owen, and conversations with Metavana’s CEO Michael Tupanjanin and the brains behind the SparkScore, Metavana founder and renowned physicist Dr. Minh Duong-van.

Sentiment Analysis Firm Metavana’s New CMO, Romi Mahajan: An Interview

Romi Mahajan

Romi Mahajan is a well-known technology marketing speaker and expert; he serves on a variety of advisory boards and speaks at over a dozen industry events per year.  He most recently served as the Worldwide Director of Sales and Strategy for the Digital Marketing and Search team at Microsoft.  Prior to Microsoft, Romi was founder of the KKM Group and served as CMO of Ascentium Group.  
Romi is also one of the founders of Research Access, and he has been a vital contributor to the online conversation about marketing and research.

Dana Stanley: Congratulations on your new position as CMO of Metavana. You’ve been a regular contributor to Research Access in the past, and I hope your new responsibilities will allow time for some continued guest posting!

For those who might not be familiar with Metavana, could you please take a moment to explain what the company does?

Romi Mahajan: Dana, writing for Research Access has been such a joy that I hope you allow me to continue to offer an opinion here and there!

I’m excited about my new role as CMO of Metavana, precisely because I believe we can make a real dent in reality with this company and really serve customers and the industry.

Metavana is at its essence a sentiment engine. What this means is that our engine can parse and make meaning of the geometrically-growing and unstructured/emotional content on the Social Web. We want to help redefine the Voice of the Customer and move it into the mainstream of business planning.

DS:  There are increasingly more companies these days in what’s come to be called the field of Sentiment Analysis. I understand Metavana has some unique ways of analyzing, scoring and packaging web sentiment. Could you give us a sense of the scientific principles used in Metavana analysis?

RM: Metavana attempts to solve a tough problem that is governed by the following 4 connected notions:

  1. The Social Web is truly the “Big Data” Web. There are 250 million tweets a day and 800,000 Facebook posts an hour.  And so on…
  2. The content on the Social Web is unstructured, asyntactic, often ungrammatical, and emotional, as opposed to ordered, clear, factual data. It’s chaos; the tower of Babel writ large.
  3. The Social Web is always on, 24/7/365, and is worldwide.
  4. Despite all of this, the Social Web reveals important truths about each of our brands….

We believe this is not a smart engineering problem but is really a physics and non-linear math problem. That is what we’ve based our algorithms on.

DS: It wasn’t that long ago that market research was considered to be surveys and focus groups – and that’s all. How do you think Sentiment Analysis – and Metavana in particular – fits into the overall market research picture?

RM: Look, market research has its place in the world but has not quite risen to the challenge of the Internet and Social Age. Market research tends to be episodic, one-off, and sequestered in the organization. Further, market research often lacks timeliness and context. Sentiment analysis (and Metavana by extension) changes this by helping organizations understand the Voice of the Customer in real-time and in the real context (emotional, etc.). That is what will define the next phase in the market research evolution. I believe in market research and want it to shine and have its rightful place.

DS: How will you use analytics and research in your role as CMO of an exciting internet company?

RM: Beautiful question, and I won’t sugar-coat. We’ll use analytics and research in the company to determine market sizing, the “nature of the beast” we are trying to slay, and what customers and partners feel and think. But in a startup you go with gut often and you hope that decades of collective wisdom are brought to bear to do the right thing!

The problem with automated sentiment analysis

Sentiment analysis is a broad area of natural language processing, computational linguistics and text mining and it aims to determine the attitude of a speaker or a writer with respect to some topic.

Companies are using sentiment analysis to monitor the social web to understand and react to what their customers are saying about their brand. This post looks at the failings of social media monitoring tools and why sentiment analysis cannot be trusted to accurately reflect and report on the sentiment of conversations online. “This real failing of automated sentiment analysis can cause real problems for brands, especially if they are basing any internal workflow or processes on the basis of your social media monitoring.”