Data Scientists, This is the History Behind your Data Science Jobs

Published on May 10, 2021

By Apoorva Komarraju

The history of data science throws light on the important developments that made the field possible today.

Ask a graduate about their first step into the tech world and data science is the term that echoes. The interesting thing about data science is that the fundamental role existed long before the term was coined. The history dates back to 1962, when researchers, statisticians, and computer scientists had their initial discussions about the field. 1962 seems like a long time ago, right? Sit back and read through the timeline of the evolution of data science and its applications.

The History Of Data Science

1962: John Wilder Tukey, an American mathematician widely known for developing the Fast Fourier Transform algorithm and the box plot, published an article in The Annals of Mathematical Statistics titled “The Future of Data Analysis”. As a mathematician, he described how his interest in data analysis had grown and argued that the statistical component of data analysis should be characterized as a science rather than as mathematics. The article described data analysis as an “empirical science”.

1974: Peter Naur, a Danish computer science pioneer and recipient of the Turing Award, published a survey of contemporary data processing methods with a wide range of applications. Titled Concise Survey of Computer Methods, the book was published in Sweden and the United States and discussed the concept of data according to the definition in the IFIP Guide to Concepts and Terms in Data Processing: “Data is a representation of facts or ideas in a formalized manner capable of being communicated or manipulated by some process.”

1977: This year had two milestones, one from John W. Tukey and one from the IASC. Tukey published Exploratory Data Analysis, a book arguing that more emphasis should be placed on using data to suggest hypotheses to test. The International Association for Statistical Computing (IASC) was established as a section of the ISI with the aim “to link traditional statistical methodology, modern computer technology, and the knowledge of domain experts in order to convert data into information and knowledge.”

1989: Gregory Piatetsky-Shapiro organized and chaired the first Knowledge Discovery in Databases (KDD) workshop, which became an annual KDD event in 1995.

1994: In September, Business Week published a cover story on “Database Marketing”. It read, “Companies are collecting mountains of information about you, crunching it to predict how likely you are to buy a product, and using that knowledge to craft a marketing message precisely calibrated to get you to do so.” It added that the earlier arrival of checkout scanners had ended in collective disappointment, because companies were too overwhelmed by the flood of data and did not know what to do with it.

1996: Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth published “From Data Mining to Knowledge Discovery in Databases”, which discussed the various names given to the process of finding useful patterns in data, such as data mining, knowledge extraction, information discovery, information harvesting, data archeology, and data pattern processing. In their view, KDD refers to the overall process of discovering useful knowledge from data, while data mining refers to a specific step in that process: the application of specific algorithms for extracting patterns from data. The additional steps, such as data preparation, data selection, data cleaning, incorporation of appropriate prior knowledge, and proper interpretation of the results of mining, are essential to ensure that useful knowledge is derived from the data. The paper also criticized the “blind application” of these methods, which can lead to the discovery of meaningless and invalid patterns.

1997: Professor C.F. Jeff Wu, currently a faculty member at the Georgia Institute of Technology, gave his inaugural lecture for the H.C. Carver Chair in Statistics at the University of Michigan. He called for statistics to be renamed data science and for statisticians to be renamed data scientists.

1999: In a Knowledge@Wharton article, Jacob Zahavi was quoted as saying, “Conventional statistical methods work well with small data sets. Today’s databases, however, can involve millions of rows and columns of data which makes scalability a huge issue in data mining.” The article, titled “Mining Data for Nuggets of Knowledge”, also addressed another technical challenge: developing models that do a better job of analyzing data and detecting non-linear relationships and interactions between elements. It added that special data mining tools should be developed to address website decisions.

2001: In a plan to “enlarge the major areas of technical work of the field of statistics”, William S. Cleveland published “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics.” It discussed data science as a field in the context of computer science and its applications in data mining.

2002: April saw the launch of the Data Science Journal, which published papers on “the management of data and databases in Science and Technology.” The journal covered descriptions of data systems, their publication on the internet, applications, and legal issues, and was published by the Committee on Data for Science and Technology of the International Council for Science.

2005: The National Science Board published “Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century.” The report stated the need to develop a career path for data scientists and to make sure that research enterprises have enough professional data scientists. It further defined data scientists as “the information and computer scientists, database and software engineers and programmers, disciplinary experts, curators, and expert annotators, librarians, archivists, and others, who are crucial to the successful management of a digital data collection.”

2010: A new kind of professional, the data scientist, was described in a report written by Kenneth Cukier for The Economist. The role was defined as a professional who combines the skills of a software programmer, statistician, and storyteller/artist to extract the gold hidden under mountains of data.

2012: In September 2012, Harvard Business Review published “Data Scientist: The Sexiest Job of the 21st Century”, written by Tom Davenport and D.J. Patil.

The work that began in 1962, first to recognize data analysis as a science and then to establish data science as a profession needed in every enterprise, started taking shape in the early 2000s. After 59 years, we now know data science as a booming career option in the tech world. Beyond research enterprises, data science is transforming major industries and small businesses alike, refining their business processes to draw insight from a flood of data that is larger than ever.
