A data scientist is someone who processes large amounts of data from various sources, structured or unstructured (so-called ‘big data’), and converts it into useful information for a company's management. The role resembles that of a data analyst, with the difference that the analyst is mainly concerned with turning historical data into information, whereas the data scientist focuses on making predictions by extrapolating data and building models. To do this, the data scientist needs a thorough knowledge of IT, including the necessary programming skills.

What does a data scientist do?

Data scientists generate valuable insights from large amounts of structured or unstructured data, also known as ‘big data’. They do this by using specialized tools and software and by writing algorithms to unlock, structure and analyze data. In some roles, a data scientist is also involved in thoroughly analyzing business operations in order to map out the needs of the company. The activities of a data scientist may include the following:

Searching for opportunities and possibilities in a considerable amount of data

Cleaning up and processing raw data into usable data

Correlating different types of data

Detecting irregularities in data

Identifying questions from operational management

Recognizing patterns in ‘big data’ in order to develop a predictive algorithm (a self-learning mathematical model), also called ‘machine learning’

Analyzing and selecting the best statistical methodology (e.g. regression, cluster analysis, decision trees) for answering questions from operational management

Using predictive algorithms for decision making

Describing and visualizing discovered insights and predictions

Reporting and sharing generated insights and knowledge with management and other stakeholders

Investigating and developing ways to get even more out of existing data

Managing and optimizing data streams and data analyses

Setting up, managing and/or training a data science team
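
Several of the activities above (cleaning raw data, detecting irregularities, and using a regression to make a prediction) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the sales figures are invented, the function names clean, flag_irregular and fit_line are hypothetical, and the outlier rule (median absolute deviation with a threshold of 5) is just one of many possible choices, not a prescribed method.

```python
from statistics import mean, median

# Hypothetical raw monthly sales figures (illustrative data, not from a real source).
# None marks a missing record; 9999.0 is a deliberately implausible entry.
raw_sales = [100.0, 110.0, None, 130.0, 9999.0, 150.0, 160.0, 170.0]


def clean(values):
    """Keep (month_index, value) pairs and drop missing records."""
    return [(i, v) for i, v in enumerate(values) if v is not None]


def flag_irregular(points, threshold=5.0):
    """Split points into regular and irregular ones.

    A point is irregular when its distance from the median exceeds
    `threshold` times the median absolute deviation (an assumed rule).
    """
    vals = [v for _, v in points]
    med = median(vals)
    mad = median(abs(v - med) for v in vals)
    regular, irregular = [], []
    for i, v in points:
        if mad > 0 and abs(v - med) / mad > threshold:
            irregular.append((i, v))
        else:
            regular.append((i, v))
    return regular, irregular


def fit_line(points):
    """Ordinary least-squares fit of value = slope * month + intercept."""
    xs = [x for x, _ in points]
    x_bar = mean(xs)
    y_bar = mean(y for _, y in points)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


regular, irregular = flag_irregular(clean(raw_sales))
slope, intercept = fit_line(regular)
forecast = slope * len(raw_sales) + intercept  # extrapolate to the next month

print(f"irregular records: {irregular}")
print(f"forecast for month {len(raw_sales)}: {forecast:.1f}")
```

In this toy data set the remaining points lie on a straight line, so the extrapolated value for the next month comes out at about 180. In practice a data scientist would typically use libraries such as pandas or scikit-learn and validate the model before relying on its predictions.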

Where do data scientists work?

Data scientists can work at any (larger) company that deals with big data and needs to make sense of it and use it to make predictions for internal operations. Examples include banks, publishers, IT companies, and multinationals in manufacturing or retail. A data scientist can also work as a consultant at a consultancy that helps external clients derive insights and predictions from big data.

Because the work of a data scientist is closely tied to business operations, data scientists may work with managers, project leaders, econometricians, business intelligence consultants, developers and data analysts.

How do you become a data scientist?

To become a data scientist, a background in the exact sciences is often required, in addition to extensive work experience in the field of data (analysis). Knowledge and skills are also required in areas such as:

  • Data science tools such as R, SQL, SAS, MATLAB
  • One or more programming languages (mostly Python)
  • Large databases and/or data models (Hadoop, Spark, SQL)
  • Simulations, scenario analyses and algorithms (models)

