Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.
Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, information science, and computer science.
Turing Award winner Jim Gray imagined data science as a “fourth paradigm” of science (empirical, theoretical, computational and now data-driven) and asserted that “everything about science is changing because of the impact of information technology” and the data deluge.
In 2012, when Harvard Business Review called data scientist “The Sexiest Job of the 21st Century”, the term “data science” became a buzzword. It is now often used interchangeably with earlier concepts like business analytics, business intelligence, predictive modeling, and statistics. Even the suggestion that data science is sexy paraphrased a line from Dr. Hans Rosling’s 2011 BBC documentary: “Statistics, is now the sexiest subject around”. Nate Silver referred to data science as a sexed-up term for statistics.
In many cases, earlier approaches and solutions are now simply rebranded as “data science” to be more attractive, which can dilute the term beyond usefulness. While many university programs now offer a data science degree, there is no consensus on a definition or suitable curriculum contents. To the discipline’s discredit, many data science and big data projects fail to deliver useful results, often because of poor management and utilization of resources.
In 1996, members of the International Federation of Classification Societies (IFCS) met in Kobe for their biennial conference. There, for the first time, the term data science was included in the title of the conference (“Data Science, classification, and related methods”), after the term was introduced in a roundtable discussion by Chikio Hayashi.
The popularity of the term “data science” has exploded in business environments and academia, as indicated by a jump in job openings. However, many critical academics and journalists see no distinction between data science and statistics.
Writing in Forbes, Gil Press argues that data science is a buzzword without a clear definition and has simply replaced “business analytics” in contexts such as graduate degree programs.
In the question-and-answer section of his keynote address at the Joint Statistical Meetings of the American Statistical Association, noted applied statistician Nate Silver said, “I think data-scientist is a sexed up term for a statistician….Statistics is a branch of science. Data scientist is slightly redundant in some way and people shouldn’t berate the term statistician.”
By contrast, NYU Stern’s Vasant Dhar, like many other academic proponents of data science, argued more specifically in December 2013 that data science is different from the existing practice of data analysis across all disciplines, which focuses only on explaining data sets. Data science seeks actionable and consistent patterns for predictive uses. This practical engineering goal takes data science beyond traditional analytics. Data in disciplines and applied fields that lacked solid theories, such as health science and social science, can now be sought out and used to generate powerful predictive models.