Learning to Walk Before You Run: Achieving Advanced Analytics Within Insurance


  • Shea Bailey, Penelope Quah, Chris Probert
  • Published: 25 October 2022


With machine learning, artificial intelligence, cloud and other innovations gaining traction within the insurance industry, firms – especially within the general insurance space – are ramping up their investments in data science with a view to extracting greater value from their data.

In particular, there are some very exciting insurance-specific use cases that are a good fit for machine learning, such as identifying fraudulent claims, enhancing new customer risk ratings, predicting which customers may lapse on their policies, and intelligent decision-making for automated claim authorisation. Given these and many other potential applications where machine learning can enhance core business processes across general insurance, it is clear why insurance firms are now racing to leverage the benefits of such advanced analytics tools.

However, business challenges can only truly be solved via machine learning when underpinned by strong and trusted data foundations. Accordingly, insurance firms should first be investing – heavily – in their data foundations, building their maturity from the ground up before looking to apply machine learning solutions.

Our experience working with a number of tier 1 insurers has shown that different business units within an organisation often have their own methods for extracting and transforming data from source systems. This typically results in multiple versions of the ‘truth’, and leaves ungoverned and untrusted data sitting in spreadsheets, presentations, extracts and other end user computing tools. In such circumstances, it is unwise to jump straight to advanced analytics before building the appropriate data foundations.

To create those foundations, an organisation should begin by aggregating all of its relevant data sources into a single analytics platform. This is often done using a data lake on a cloud platform, such as Microsoft Azure, Amazon Web Services or Google Cloud Platform. Once all relevant data sources – which can sometimes run into the hundreds due to the large number of legacy platforms within established insurers – are ingested into the central platform, they can then be standardised and modelled to produce enterprise data assets. These can then be used for analytics.
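As a rough illustration of the standardisation step, the sketch below maps records from two source systems onto one common schema. The system names, field names and date formats are hypothetical, chosen only for the example; a real platform would typically do this in its ingestion pipelines rather than in plain Python.

```python
from datetime import datetime

# Hypothetical mappings: source system -> {source field: standard field}.
FIELD_MAPS = {
    "legacy_motor": {"POL_NO": "policy_id", "START_DT": "start_date", "PREM": "annual_premium"},
    "home_platform": {"policyRef": "policy_id", "inception": "start_date", "premiumGBP": "annual_premium"},
}

# Hypothetical date formats used by each source system.
DATE_FORMATS = {"legacy_motor": "%d/%m/%Y", "home_platform": "%Y-%m-%d"}


def standardise(source: str, record: dict) -> dict:
    """Rename fields to the common schema and normalise their types."""
    mapping = FIELD_MAPS[source]
    out = {std: record[src] for src, std in mapping.items()}
    out["policy_id"] = str(out["policy_id"]).strip().upper()
    out["start_date"] = datetime.strptime(out["start_date"], DATE_FORMATS[source]).date()
    out["annual_premium"] = round(float(out["annual_premium"]), 2)
    # Keep the originating system so the record can be traced back to source.
    out["source_system"] = source
    return out


motor = standardise("legacy_motor", {"POL_NO": " m-001 ", "START_DT": "01/02/2021", "PREM": "350.5"})
home = standardise("home_platform", {"policyRef": "H-77", "inception": "2021-02-01", "premiumGBP": 410})
```

Both records end up with the same field names and types, which is what allows them to be modelled together downstream.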

Creating such physical data assets demands a heavy but necessary investment in data engineering, data modelling, cloud engineering and solution architecture capabilities. Once data ingestion pipelines are set up, modellers work closely with data, business, and system subject matter experts to create a single version of the ‘truth’, as well as logical data models for key business concepts. For example, within general insurance, modellers may create domains for members, policies and claims which serve as the single, trusted sources for use across all analytics use cases.
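A logical model for the member, policy and claim domains might be sketched along the following lines. The attribute names and the `claims_total_by_member` helper are assumptions made for illustration, not a prescribed model; the point is simply that every analytics use case reads the same linked entities.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Member:
    member_id: str
    name: str


@dataclass(frozen=True)
class Policy:
    policy_id: str
    member_id: str  # links the policy to its member
    start_date: date


@dataclass(frozen=True)
class Claim:
    claim_id: str
    policy_id: str  # links the claim to its policy
    amount: float


def claims_total_by_member(policies, claims):
    """Sum claim amounts per member by following claim -> policy -> member."""
    policy_owner = {p.policy_id: p.member_id for p in policies}
    totals = {}
    for claim in claims:
        owner = policy_owner[claim.policy_id]
        totals[owner] = totals.get(owner, 0.0) + claim.amount
    return totals


policies = [
    Policy("P1", "M1", date(2021, 1, 1)),
    Policy("P2", "M1", date(2021, 6, 1)),
    Policy("P3", "M2", date(2022, 1, 1)),
]
claims = [Claim("C1", "P1", 100.0), Claim("C2", "P2", 50.0), Claim("C3", "P3", 25.0)]
totals = claims_total_by_member(policies, claims)
```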

To ensure these data domains are trusted and governed, further investment into data management, data governance and data quality capabilities is required. The data management capability will look to map the data lineage and create business data dictionaries to ensure data is trusted. Data modellers and engineers will report any data quality issues found in the source data to the data quality analysts, who will then work to identify and resolve the root cause at source. Data governance then looks to ensure that there are adequate processes in place for the use of the data, with a specific focus on GDPR and use of personal data.
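The hand-off of data quality issues could, for example, start from simple rule-based checks like the ones sketched here. The rules, field names and the `find_quality_issues` function are hypothetical; each issue records the source system so analysts can pursue the root cause at source.

```python
def find_quality_issues(records):
    """Return (row index, source system, description) for each rule breach."""
    issues = []
    seen_ids = set()
    for index, record in enumerate(records):
        source = record.get("source_system", "unknown")

        policy_id = record.get("policy_id")
        if not policy_id:
            issues.append((index, source, "missing policy_id"))
        elif policy_id in seen_ids:
            issues.append((index, source, "duplicate policy_id"))
        else:
            seen_ids.add(policy_id)

        premium = record.get("annual_premium")
        if premium is None or premium < 0:
            issues.append((index, source, "invalid annual_premium"))
    return issues


records = [
    {"policy_id": "A", "annual_premium": 100.0, "source_system": "legacy_motor"},
    {"policy_id": "A", "annual_premium": -5.0, "source_system": "home_platform"},
    {"policy_id": "", "annual_premium": None},
]
issues = find_quality_issues(records)
```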

We have encountered several situations where data scientists devote the majority of their time to data engineering and modelling tasks, adding yet another tactical layer to legacy architecture, while investment in a trusted analytics platform is neglected. Insurance firms should undertake an initial assessment of their data and analytics maturity, with specific emphasis on understanding their data architecture and enterprise data assets. To ensure a strategic implementation of advanced analytics, insurers should also focus on assessing and investing in foundational aspects such as data engineering, data architecture and data management.

Only then does it make sense for insurance firms to leverage their data to solve business problems with data science, as the necessary fundamentals will be in place for machine learning to add value.