Almost every company in the world is looking to make use of the data it has
accumulated over the years, and as a result there is strong demand for
professionals who work on the data side.
Most universities and institutes across the world have started offering
courses in data science to meet the industry's demand for data analytics
professionals.
This raises a tricky question: how do you identify the data science
requirement in an organization? The answer depends on what kind of
organization it is. If the organization is a product development company, the
requirement is really for a Data Scientist kind of role; if it is a services
or solutions company, the requirement is more for a Data Science professional.
Let me explain the difference between these roles. A Data Scientist is someone
whose expertise lies mainly in technology (a “techie”), with the skill set to
tackle the engineering problems around data. A Data Scientist typically looks
at a problem in terms of how to automate it, how to process the data
efficiently within the time and quality constraints of the next task, or how
to summarize the data based on its underlying characteristics. The primary
objective of the Data Scientist is usually to create scalable algorithms for
data problems that can also be turned into a commercially viable product.
Most data scientists in the world are technology driven. As HBR and others
have pointed out, this is the sexiest job of the 21st century, and I would go
further: if the data scientist has thorough knowledge of data science,
including data mining and statistical techniques, then it is the sexiest job
not just of this century but of a whole lifetime.
The other side of the coin is the Data Science professional, who is not a
techie but has in-depth knowledge, exposure and experience on the purely data
science side. The skill set of a Data Science professional is about finding a
suitable approach or methodology for a data problem or prediction task using
various statistical and data mining techniques. The primary objective of the
professional is to design experiments for prediction or optimization: train an
algorithm on the data and test the results against the validation and test
datasets. These professionals spend most of their time on getting the problem
statement right and preparing the data so that an appropriate algorithm can be
chosen. Data Science professionals have rich knowledge of the concepts and
techniques, and they apply them with experience to arrive at a viable solution
to the problem.
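To make the experiment loop described above concrete, here is a minimal sketch
in plain Python. It is only an illustration under my own assumptions: the data
is made up, the 60/20/20 split is arbitrary, and the helper names
(split_dataset, fit_line, mean_squared_error) are hypothetical, not taken from
any particular library.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the rows and split them into train, validation and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

def fit_line(rows):
    """Train step: ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(model, rows):
    """Evaluation step: average squared prediction error on the given rows."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in rows) / len(rows)

# Made-up data (an assumption for illustration): y is roughly 3x + 2 plus noise.
rng = random.Random(42)
data = [(x, 3.0 * x + 2.0 + rng.gauss(0, 1)) for x in range(100)]

train, val, test = split_dataset(data)
model = fit_line(train)  # the algorithm only ever sees the training rows
print("validation MSE:", mean_squared_error(model, val))
print("test MSE:", mean_squared_error(model, test))
```

In practice the validation error is what you look at while comparing
approaches, and the test error is checked once at the end, so that the final
number is not influenced by the choices made along the way.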
Most tech and product companies are trying to replace the second side of the
coin with a product. The idea is that, given a business problem and the data,
the product should carry out all the required steps for the “business user”:
data pre-processing, choosing the right algorithm, validating the results,
coming up with the equation for prediction, and so on.
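As a toy illustration of what such a product would have to do, the sketch
below chains those steps in plain Python. Everything in it is an assumption
made for the example: the data is invented, the only pre-processing is
dropping rows with missing values, there are just two candidate algorithms,
and the function names (drop_missing, fit_constant, fit_line, auto_model) are
hypothetical. A real product would need far more than this.

```python
def drop_missing(rows):
    """Pre-processing: discard rows where x or y is missing (None)."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def fit_constant(rows):
    """Candidate 1: always predict the mean of y (slope fixed at 0)."""
    return 0.0, sum(y for _, y in rows) / len(rows)

def fit_line(rows):
    """Candidate 2: ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(model, rows):
    """Validation metric: mean squared error of a (slope, intercept) model."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in rows) / len(rows)

def auto_model(rows):
    """Run every step for the business user and return the chosen equation."""
    clean = drop_missing(rows)
    validation = clean[::4]  # every 4th row is held out for validation
    training = [r for i, r in enumerate(clean) if i % 4 != 0]
    candidates = {"constant": fit_constant, "line": fit_line}
    fitted = {name: fit(training) for name, fit in candidates.items()}
    # Choose the algorithm with the lowest validation error.
    best = min(fitted, key=lambda name: mse(fitted[name], validation))
    slope, intercept = fitted[best]
    return best, f"y = {slope:.2f} * x + {intercept:.2f}"

# Invented data with two incomplete rows (an assumption for illustration).
rows = [(x, 2.0 * x + 1.0) for x in range(20)] + [(None, 5.0), (3.0, None)]
name, equation = auto_model(rows)
print(name, equation)
```

Even in this tiny version, the judgment calls (what counts as missing, which
candidates to try, how to hold out data) are hard-coded by a person, which is
exactly the part a Data Science professional contributes.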
I feel there is a long way to go before products reach that level, and it is
always better to have both roles in the organization, so that they can
complement each other and achieve the best possible result.
Well differentiated! If I may ask, can a Data Scientist without thorough knowledge of machine learning and statistical modelling be called a Data Engineer?
I believe that with the technology advancement and the talent pool around, it won't be long before we see products built which automatically perform all the pre-processing steps, model building and validation!