With the advent of the web and increased business process automation, companies are increasingly finding themselves inundated with vast troves of data. Making sense of and extracting insights from this data is a job that requires a wide range of skills – from the technical to the inferential – which are seldom found together in a single professional. This is the job of the Data Scientist – a job that most companies of significant size will need to fill in the coming years.
The following is a very insightful exchange about the Data Scientist role from a recent webinar sponsored by Research Access and GreenBook on the subject of Big Data. Four expert panelists contributed to the discussion:
- Steve Cohen of In4mation Insights
- Romi Mahajan of Metavana
- Lenny Murphy from GreenBook
- Charlie Wardell of Decooda
I hope you enjoy this interesting and insightful exchange as much as I did.
Steve Cohen: Just as “Big Data” has been thrown around, there’s a new term thrown around called the “data scientist.” So the question is I’ve seen it kinda pop up in a lot of blogs and discussions and so on about what a data scientist is. Again, the best definition of a data scientist I’ve seen, it’s somebody who has really three skills. The three skills are statistical skills – that is, you can model data and extract value from a model. The second one is somebody who has hacking skills. I don’t mean hacking in the bad sense of the word but hacking in terms of somebody who could be presented with a data set and, no matter what form it’s in, can pull it into a form that you can do modeling on. But the third and most important part is somebody who understands the subject matter and can extract the value from the information. That’s a very rare person. So if there are any people listening right now who have those three skills, you’re gonna be quite valuable over the next 10 to 20 years or so because there aren’t many of you out there; trust me.
Romi Mahajan: Update your resume and get on LinkedIn if you want a job.
Steve Cohen: Don’t worry.
Lenny Murphy: We recently predicted that there’s a gap of something like 100,000 data scientists needed. I’m not sure if they used that definition that you used, Steve, but I think it’s a good one. And it’s a real issue. Let’s talk about that for a second right now. I think as technology is evolving and the vision of being able to deliver Big Data from a technological standpoint is coming to fruition, there’s a human capital gap that I think may become fairly critical in pretty short order.
What the hell do you do with this? What does it really mean? How do we extract real value and deliver real action, as Charlie said, from this? That’s going to require a real retooling both from an educational standpoint, from an HR standpoint, from a policy standpoint in how organizations look at and identify the whole structure to support this type of model.
Steve Cohen: Let me react to that quickly. I just wanna say I’ve already had two business school professors I know contact me after I wrote that information you just talked about, Lenny, who both said to me, please talk to us about how should we be retooling our undergraduate and graduate programs in order to meet the needs of data scientists in the future. So it’s starting to get on the radar screen of the universities.
Lenny Murphy: That’s good, because they don’t do well with market research. So I’m glad they’re coming on board with that. They’re still teaching stuff from 50 years ago on the MR industry. I’m really glad to hear that, Steve.
Charlie Wardell: Lenny, in the Big Data space and in the opportunity that we have right now with the data scientists, let’s not forget that it’s a slightly different space than the Business Intelligence of old. If you look at the data warehouses just over the last 10 or 15 years, I think it’s a real scary word to a lot of CIOs, because 50% to 70% of the data warehouses or enterprise data warehouse initiatives fail. When you look at the reasons for it, it’s obvious why, but no one’s really cracked the code on what makes a successful data warehouse.
Now we’re amping up the problem domain. We’re dealing with enormous amounts of data, and we need to process it very quickly, but the insights that we’re gleaning are really at the scientific level. These social scientists and cognitive psychologists are looking at the data, these big volumes of data, and then the relational side of the data. So, yeah, it’s a huge market for these types of data scientists, but we just need to make sure that it doesn’t become the next Business Intelligence of Big Data. We’ve really got to drive usable insights. One of the real reasons why the data warehouse failed is lack of adoption. Nobody trusts the data. Nobody’s using the platform. Nobody’s using the insight. So not only do we have to glean the insights, we have to convince other people that the insights that we’ve gleaned are real and accurate and actionable.