Four steps for getting started with data science

18 August 2020 4 min. read

Amid the current period of uncertainty, data science can serve as something of a golden compass to businesses, using statistics and artificial intelligence to navigate through tough times. Tobias Platenburg, a consultant at IG&H, shares four steps for how companies can get started.

1) Build a team

First things first, organisations will need to put a data science team in place. This team will be crucial for leveraging the right data, predicting what will happen next, and playing a defining role in building high-quality customer relationships from a distance. Typically, a well-performing data science team includes people with complementary skill sets that together cover at least statistical modelling, artificial intelligence / machine learning, data engineering, and project and people management.

Even if a company already has a team in place, it is worth reviewing how that team is positioned. Often, a team has the greatest impact when it sits close to the boardroom, where strategic decision making takes place. From there it can be deployed on high-impact projects where data science adds value beyond traditional analyst and business intelligence roles.

2) Make data available

Firms must democratise non-sensitive data within the company. Cross-department data should be available to the data science team and to the rest of the organisation, which will increase transparency and enhance collaboration across departments. To make that work in terms of technology, it might help to move a portion of a firm’s data to the cloud and to consider NoSQL solutions like Apache Cassandra or MongoDB to make data available to many interfaces.

Additionally, Covid-19 has changed society and individual values and norms. To stay ahead, organisations will need to rethink which data to consume and record within the company, and experiment accordingly. Do not be ‘evil’ about it either; in the age of ethical consumerism, companies need to rethink and tune their ethical approach to data collection as they continually adapt their products and services to the ‘new normal’.

3) Generate scenarios and model

First, every use of data science ultimately depends on models being properly implemented and maintained. The business and its data science team need to know as soon as possible when a model’s accuracy drifts. As such, it is recommended that lead data scientists implement automated retraining of models, using solutions such as Kubeflow, Azure ML, or Amazon SageMaker. Although some human intervention may occasionally be required, automation ensures that models are updated regularly using the latest data.
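As a rough illustration of spotting accuracy drift, the sketch below tracks a rolling accuracy over recent predictions and flags when it falls too far below a baseline. The class name `DriftMonitor`, the thresholds, and the sample data are all hypothetical; a production setup would more likely rely on the monitoring and pipeline features of the platforms named above.

```python
from collections import deque


class DriftMonitor:
    """Tracks rolling accuracy and flags when it falls below a baseline.

    Hypothetical helper for illustration; not part of any named platform.
    """

    def __init__(self, baseline_accuracy, tolerance=0.05, window=100):
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        # Only the most recent `window` outcomes are kept (1 = correct, 0 = wrong).
        self.outcomes = deque(maxlen=window)

    def record(self, prediction, actual):
        """Store whether a prediction matched the observed outcome."""
        self.outcomes.append(1 if prediction == actual else 0)

    def rolling_accuracy(self):
        """Share of correct predictions in the current window, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once a full window shows accuracy below baseline minus tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # too few observations to judge reliably
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance


# Made-up example: 7 correct and 3 wrong predictions in a window of 10.
monitor = DriftMonitor(baseline_accuracy=0.90, tolerance=0.05, window=10)
for prediction, actual in [(1, 1)] * 7 + [(1, 0)] * 3:
    monitor.record(prediction, actual)

print(monitor.rolling_accuracy(), monitor.needs_retraining())
```

In practice the `needs_retraining()` signal would trigger a retraining job or alert a data scientist, rather than just being printed.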

Second, companies should implement and apply models that require less data, or none at all. When data is scarce, use simple machine learning models to avoid the inaccuracy that comes from ‘overfitting’. Overfitted models fit a small training dataset very well, but in doing so they also capture relationships that do not exist in the real world. Organisations should talk to their data science teams about techniques such as ridge regression, k-nearest neighbours (KNN), or Naïve Bayes for working with less data.
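To give a feel for why a technique like ridge regression helps with small datasets, here is a minimal sketch for the simplest possible case: one feature and no intercept. The function name `fit_ridge_slope` and the three data points are made up for illustration. The penalty term pulls the fitted slope towards zero, so the model reacts less strongly to noise in a handful of observations.

```python
def fit_ridge_slope(xs, ys, penalty):
    """Slope of y ~ slope * x (no intercept) with an L2 penalty on the slope.

    Minimises sum((y - slope * x)^2) + penalty * slope^2, which has the
    closed-form solution sum(x * y) / (sum(x^2) + penalty).
    """
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)


# Tiny, made-up dataset: three noisy observations.
xs = [-1.0, 0.0, 1.0]
ys = [-2.5, 0.2, 1.9]

# penalty = 0 is ordinary least squares; larger penalties shrink the slope.
for penalty in (0.0, 1.0, 10.0):
    print(penalty, round(fit_ridge_slope(xs, ys, penalty), 3))
```

The right penalty is usually chosen by cross-validation; a data science team would typically use an established library rather than a hand-written formula.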

Third, companies should consider generating scenario data that might replicate the future ahead. Together with the business, a data science team might be able to generate several future scenarios. If a firm’s data science capability is more advanced, it can support these scenarios by generating data with a generative deep learning technique such as generative adversarial networks (GANs). Firms should also be aware that pre-Covid-19 data differs from data in a post-Covid-19 world, and make sure this is incorporated into their models.
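Scenario generation does not have to start with GANs. The sketch below swaps in a much simpler technique, a seeded random simulation, to produce a few possible future paths for a business metric. The scenario names, growth rates, and volatilities are hypothetical assumptions of the kind a team might agree with the business; they are not real data.

```python
import random


def simulate_scenario(start_value, monthly_growth, volatility, months, rng):
    """One possible future path: compounding growth plus random monthly shocks."""
    path = [start_value]
    for _ in range(months):
        shock = rng.gauss(0.0, volatility)
        next_value = path[-1] * (1.0 + monthly_growth + shock)
        path.append(max(next_value, 0.0))  # the metric cannot go negative
    return path


# Hypothetical assumptions agreed with the business, not measured values.
SCENARIOS = {
    "quick recovery": {"monthly_growth": 0.03, "volatility": 0.02},
    "slow recovery": {"monthly_growth": 0.01, "volatility": 0.03},
    "second wave": {"monthly_growth": -0.02, "volatility": 0.05},
}

rng = random.Random(42)  # fixed seed so runs are reproducible
paths = {
    name: simulate_scenario(100.0, months=12, rng=rng, **params)
    for name, params in SCENARIOS.items()
}

for name, path in paths.items():
    print(f"{name}: value after 12 months = {path[-1]:.1f}")
```

Running each scenario many times with different seeds would give a range of outcomes per scenario, which can then be used to stress-test existing models.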

4) Automate

Automated data processes and reporting for operational efficiency should not be overlooked in uncertain times. They can help shift work away from repetitive tasks to areas where people can be more valuable, while improving speed and quality, because machines make fewer mistakes.

To realise this win-win scenario for staff and the organisation, firms should first automate the low-hanging fruit, where the business case is strongest, and use these quick wins to build momentum for automating more processes. Furthermore, companies must make sure to educate their leadership and staff, so they are aware of best practices and can manage expectations of what automation actually delivers.

That said, organisations should beware of assuming that automation is a silver bullet. Companies should not always pursue the most complex solutions; sometimes a simple one may be better in the end.