Monday, 16 December 2013

Data Science or Data Scientist

Almost every company is looking to leverage the data it has been accumulating for years, and as a result there is a lot of demand for professionals who work on the data side.

Many universities and institutes across the world have started offering courses in data science to meet the industry's demand for data analytics professionals.

This raises a tricky question: how do you identify what kind of data role an organization needs? The answer really maps to the organization's genesis. If the organization is a product development company, it needs a data scientist kind of role; if it is a services or solutions company, it needs more of a data science role.

Let me explain the difference between these roles. A Data Scientist is a person whose expertise is more on the technology side (a “techie”), with the skill set to tackle the engineering problems around data. A data scientist's view of a problem will generally be about how to automate, how to process the data efficiently within the time and quality constraints for the next task, or how to summarize the data based on its underlying characteristics. Mostly, the primary objective of the data scientist is to create scalable algorithms for data problems that can also be turned into a commercially viable product. The majority of data scientists in the world are technology driven and, as pointed out by many including HBR, it is the sexiest job of the 21st century. I would go further: if the data scientist has thorough knowledge of data science, including data mining and statistical techniques, then it is the sexiest job not just of the 21st century but for a whole lifetime.

The other side of the coin is the Data Science professional, who is not a techie but has in-depth knowledge of, exposure to and experience in the pure data science side of things. A data science professional's skill set is about finding a suitable approach or methodology for a data problem or prediction task using various statistical and data mining techniques. The primary objective of this professional is to design experiments for prediction or optimization: train an algorithm on the data and test the results against validation/test datasets. These professionals spend the majority of their time getting the problem statement right and preparing the data so that an appropriate algorithm can be chosen. Data Science professionals have rich knowledge of the concepts and techniques, and they combine it with experience to arrive at a viable solution for the problem.
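To make the train/validate/test idea a little more concrete, here is a minimal Python sketch of such an experiment. Everything in it is an assumption made for illustration: the synthetic data, the 60/20/20 split, the two candidate models (a mean baseline and a one-variable least-squares line) and the helper names such as `split` and `run_experiment`. Real projects would normally use a statistics or machine learning library rather than hand-written code like this.

```python
import random


def split(rows, train=0.6, validation=0.2, seed=0):
    """Shuffle rows and cut them into train / validation / test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    a = int(len(rows) * train)
    b = a + int(len(rows) * validation)
    return rows[:a], rows[a:b], rows[b:]


def fit_mean(rows):
    """Baseline candidate: always predict the mean target of the training rows."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def fit_line(rows):
    """Second candidate: least-squares line y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mse(model, rows):
    """Mean squared error of a fitted model on the given rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def run_experiment(rows, candidates):
    """Fit every candidate on train, pick the best on validation, score it on test."""
    train, validation, test = split(rows)
    fitted = {name: fit(train) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: mse(fitted[name], validation))
    return best, mse(fitted[best], test)


# Synthetic data (an assumption for illustration): y = 3x + 2 plus small noise.
rng = random.Random(42)
data = [(x, 3 * x + 2 + rng.gauss(0, 1)) for x in range(100)]
best_name, test_error = run_experiment(data, {"mean": fit_mean, "line": fit_line})
print(best_name, round(test_error, 3))
```

The point of the sketch is the separation of duties between the three datasets: the training rows fit each candidate, the validation rows decide which candidate to keep, and the test rows are touched only once, to report how the chosen model does on data it has never seen.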

Most tech and product companies are trying to replace the second side of the coin with a product. The idea is that, given a business problem and the data, the product should handle all the required steps for the “business user”: data pre-processing, choosing the right algorithm, validating the results, coming up with the equation for prediction, etc.

I feel there is a long way to go to reach that level, and it is always better to have both roles in the organization so that they can complement each other and achieve the best possible result.


2 comments:

  1. Well differentiated! If I may ask: can a Data Scientist without thorough knowledge of machine learning and statistical modelling be called a Data Engineer?

  2. I believe that with technological advancement and the talent pool around, it won't be long before we see products that automatically perform all the pre-processing steps, model building and validation!
