Data Science

Definition, data scientist activities and examples


Data Science is an interdisciplinary science that deals with the extraction of knowledge from data. Data science generates information from large amounts of data in order to derive recommendations for corporate management. The aim of these recommendations for action is to improve the quality of business decisions and the efficiency of work processes.

Data science definition

The term “data science” was first used in 1960 by the Danish computer science pioneer Peter Naur as a synonym for “computer science”. In 2001, the American computer scientist William S. Cleveland proposed data science as an independent discipline. Cleveland named the following areas of expertise for data science: data models and data methodology, computing with data, evaluation of tools, pedagogy, theory and multidisciplinary research.

Today, the definition of data science is much broader than it was fifty years ago: data science lies at the intersection of mathematics (especially stochastics), computer science and industry-specific specialist knowledge.

People who work in data science are referred to as data scientists.

Data scientist activities

In 2005, the US government's national research community highlighted the vital importance of data scientists in managing digital data. It counted among the data scientists, in particular, computer scientists, programmers, database experts, domain specialists, archivists and librarians, and even specialists in software development.

Data scientist expertise

A data scientist must have a wide range of specialist knowledge in order to perform their analytical, advisory and coordinating tasks at the interface between a company's databases, employees and management:

  • Mathematical and stochastic models and methods form the basis for the analysis and interpretation of data. High data quality and the careful selection of the characteristics of data samples are of great importance here. Machine learning methods are also widely used.
  • In the age of large amounts of data (“Big Data”), data science is hardly conceivable without a sound knowledge of data processing and its tools. Computer science (informatics) supports big data analysis with algorithms that are suited to processing the respective data types efficiently. Analysis tools extract the relevant information from structured or unstructured data.
  • Domain-specific business know-how is essential in order to classify and analyze the data and business processes of a particular company and to develop meaningful recommendations for action, especially when data analysis is used for business optimization.
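
The first point can be illustrated with a minimal Python sketch: it removes incomplete records (a basic data-quality step) and then fits a simple least-squares line. The data and the names `ad_spend`, `sales`, `drop_incomplete` and `fit_line` are invented for illustration and do not come from any real company or library.

```python
from statistics import mean

# Hypothetical sample: monthly advertising spend and sales (thousand euros).
# None marks a missing measurement.
ad_spend = [10.0, 12.0, None, 15.0, 18.0, 20.0]
sales = [25.0, 29.0, 31.0, 35.0, 41.0, 45.0]


def drop_incomplete(xs, ys):
    """Keep only pairs in which both values are present."""
    return [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


clean = drop_incomplete(ad_spend, sales)
slope, intercept = fit_line(clean)
print(f"{len(clean)} usable rows, slope={slope:.2f}, intercept={intercept:.2f}")
```

In practice, a data scientist would also check whether a linear model suits the data at all; the sketch only shows the mechanics.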

Personal qualifications of a data scientist

The work of a data scientist also calls for some strongly developed personal qualities:

  • Communication skills, on the one hand to understand industry experts and on the other hand to convincingly convey the recommendations for action derived from the data analysis to employees in all areas and at all levels of the company,
  • Creativity in flexibly adapting analytical methods to new challenges and contexts,
  • Open-mindedness in researching and using new analysis tools and data science methods, and
  • Coordination skills, both in delegating data acquisition tasks and in managing and controlling data science projects.

Industry examples for the use of data science

Today, large amounts of data are evaluated using the tools and methods of data science for companies in all industries.

  • Trading and retail companies benefit from data science by analyzing the purchasing behavior of customers. Investigating the possible causes of returns, for example, helps to reduce their number.
  • In the health sector (medicine and pharmacy), data science enables the creation of similarity analyses as a basis for the individualized treatment of patients and the optimization of medication.
  • Logistics companies use data science to improve their work processes and the quality of their transport services.
  • Industrial companies control and optimize production processes using data science.
  • With the help of data science, insurance companies and banks exploit the potential of the external and internal data available to them in order to improve their products and increase sales success.
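
The similarity analyses mentioned for the health sector can be sketched in a few lines of Python. The example below uses cosine similarity, which is one common choice among several; the patient names and feature values are purely hypothetical and assumed to be already scaled to a comparable range.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical feature vectors (e.g. scaled age, blood pressure, body mass index).
patients = {
    "patient_a": [0.6, 0.8, 0.5],
    "patient_b": [0.5, 0.7, 0.6],
    "patient_c": [0.1, 0.2, 0.9],
}


def most_similar(target, records):
    """Return the name of the record most similar to `target`."""
    candidates = [name for name in records if name != target]
    return max(
        candidates,
        key=lambda name: cosine_similarity(records[target], records[name]),
    )


print(most_similar("patient_a", patients))
```

A result of this kind would only be a starting point; which features are medically meaningful is a question for domain experts.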

In the labor market, the demand for data scientists currently exceeds the number of trained data scientists available. Although there are increasingly extensive databases around the world, they cannot be fully analyzed due to the insufficient availability of data scientists.

Data Science courses

In line with the considerable importance of data in the digital age, a number of technical colleges and universities in Germany offer data science courses. Possible degrees are Bachelor of Science (B.Sc.) or Bachelor of Engineering (B.Eng.) as well as Master of Science (M.Sc.) or Master of Engineering (M.Eng.).

Since the job description of the data scientist has developed in accordance with the continuously changing requirements of the economy, the data science courses are characterized by their high practical relevance.

  • For a bachelor's degree in data science, a standard course duration of six semesters is usually provided. The study modules mostly include theoretical computer science, statistics, stochastics, information security, algorithms and data structures as well as software development and programming.
  • A subsequent master’s degree can usually be completed in a standard period of four semesters. A master’s degree qualifies graduates for a possible doctorate in the fields of computer science, mathematics or natural sciences.

What does a data scientist earn?

Salaries for data scientists have risen sharply in recent years, thanks to high demand. The starting salary of a data scientist is over € 50,000 per year. Depending on the type of employer (consulting firm, start-up or corporation), the annual salary increase in the first few years is around 5-10%.
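
To make the range concrete, the following Python sketch projects a salary over three years at the lower and upper end of the stated increase. It assumes a starting salary of exactly € 50,000 and a raise that compounds once per year; both are simplifications for illustration, and `project_salary` is a hypothetical helper.

```python
def project_salary(start, annual_raise, years):
    """Salary in each year, with the raise compounding once per year."""
    return [round(start * (1 + annual_raise) ** year, 2) for year in range(years + 1)]


# Assumed starting salary of 50,000 euros; 5% and 10% are the bounds from the text.
low = project_salary(50_000, 0.05, 3)
high = project_salary(50_000, 0.10, 3)

for year, (lo_value, hi_value) in enumerate(zip(low, high)):
    print(f"year {year}: {lo_value:,.2f} to {hi_value:,.2f} euros")
```

Under these assumptions, the salary after three years lies between roughly € 57,900 and € 66,600.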

According to Glassdoor.de, the average salary of a data scientist is € 61,154, and that of a senior data scientist is around € 78,000. A team lead is paid significantly more, up to € 100,000.

More precise figures are available for the United States. It can be assumed that the values for Europe are slightly lower but on a strong upward trend.

Data Science Salaries in the US (Source: datajobs.com):

Job                                      Salary
Data Analyst - Entry Level               $50,000 - $70,000
Data Analyst - Experienced               $65,000 - $110,000
Data Scientist                           $85,000 - $170,000
Analytics Manager - 1-3 Direct Reports   $90,000 - $140,000
Analytics Manager - 4-9 Direct Reports   $130,000 - $175,000
Analytics Manager - 10+ Direct Reports   $160,000 - $240,000

Conclusion

Data science is an interdisciplinary field at the intersection of mathematics, computer science and industry-specific specialist knowledge. For this reason, the requirements for a data scientist are high. In addition to mathematical and stochastic skills, a data scientist also needs skills in software development and specific industry knowledge. These requirements make it challenging to find the right employees.

The potential of data science is enormous, because data science can be used in almost every corporate area and industry to optimize processes. The right data and resources are required for this.