
Data-Driven Insurance: The Challenges Faced by Data Scientists

In my earlier blog, “Data-Driven Insurance: The New Normal in the Post-Pandemic World,” I concluded that the insurance industry of the future would be data-driven. Beyond that, emerging technologies such as Artificial Intelligence, Image Recognition, and Natural Language Processing bots will become buzzwords in the corridors of the insurance industry.

In this article, we will try to understand the challenges faced by Data Scientists and what is expected from them.


A data scientist is expected to answer a business problem by analyzing past behavior, i.e., data concerning a stated business problem. The solution should make economic sense by adding value to the organization in terms of cost reduction, risk mitigation, or effective and optimal utilization of the organization's resources.

Communication and Understanding of Business Terms

The first and biggest challenge is communicating with the business (functional) teams and meeting their expectations. By the nature of their specialization and training, data scientists understand the data and its correlations well, such as how one data field affects others.

In other words, given some input factors, a data scientist can predict the values of other data fields that influence a business transaction. Yet he typically has only a basic understanding of the transaction itself. The people who run those transactions and know the business in depth, on the other hand, are always in a hurry.

First, the business problem is often not stated as sharply and crisply as the data scientist needs, and it can change midway through the solution design. Second, business teams usually expect immediate results, whereas a solution typically needs 2 to 3 months to become effective and learn from the transactions. Only after this time is economic value realized. By then, the business folks may have lost interest and withdrawn their support for the project.


Data is the bread and butter of any data scientist. Let us look at whether the data scientist gets all the raw material he needs to deliver results. There is always a conflict between the two teams: the business teams say all the data is available and has been shared, while the analyst says it has not. Let us examine some of the reasons.

Fragmented Data

No business runs on a single integrated technology platform. The data for any business process is spread across multiple applications. Consider insurance customer onboarding, for example: half the process sits in the Business Process Management (BPM) system and the other half in the Policy Administration System (PAS), with interfaces to many other applications in between.
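
To make the fragmentation concrete, the sketch below stitches onboarding records from two systems into one view per application and lists the applications that are missing on either side. It is only an illustration: the record layouts, the field names (`application_id`, `kyc_status`, `policy_number`), and the function `merge_onboarding` are hypothetical, and real BPM and PAS extracts will look different.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def merge_onboarding(
    bpm_rows: List[Record], pas_rows: List[Record], key: str = "application_id"
) -> Tuple[List[Record], List[str], List[str]]:
    """Join two extracts on a shared key and report the gaps on each side."""
    bpm_by_id = {row[key]: row for row in bpm_rows}
    pas_by_id = {row[key]: row for row in pas_rows}

    # Applications present in both systems become one combined record.
    merged = [
        {**bpm_by_id[app_id], **pas_by_id[app_id]}
        for app_id in sorted(bpm_by_id.keys() & pas_by_id.keys())
    ]
    # Applications seen by only one system are the "missing data" cases.
    only_bpm = sorted(bpm_by_id.keys() - pas_by_id.keys())
    only_pas = sorted(pas_by_id.keys() - bpm_by_id.keys())
    return merged, only_bpm, only_pas


# Hypothetical sample extracts (not real data).
bpm = [
    {"application_id": "A1", "kyc_status": "verified"},
    {"application_id": "A2", "kyc_status": "pending"},
]
pas = [
    {"application_id": "A1", "policy_number": "P-100"},
    {"application_id": "A3", "policy_number": "P-101"},
]

merged, only_bpm, only_pas = merge_onboarding(bpm, pas)
print(merged)
print("In BPM only:", only_bpm)
print("In PAS only:", only_pas)
```

A gap report like this gives both teams something specific to discuss: the business team can see exactly which applications never reached the other system, instead of debating whether the data was "shared."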

Timely Availability

Some data may originate at multiple locations, some possibly in Tier 3 cities or rural areas, where the bandwidth and high-speed internet connectivity needed to run the business application may not be available. These locations depend on emails, Excel files, or notes in Word files. The data may be fragmented and must be collated in time to ensure completeness and integrity. Providing it to the data scientist in a readable format becomes a task for the technology or business teams, which delays delivery of the data.

Legacy Data

In the insurance business, a contract with the customer can last anywhere from 1 year to 15 or 20 years. Data fields that are mandatory today may not have existed 20 years ago, such as India’s Aadhaar number, which is used for identity confirmation today. This becomes a challenge when a data scientist is asked to mine the data and develop cross-sell recommendations that harness the full potential of the customer database, relationship, and loyalty.
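
One way to size this problem before any modeling starts is to measure how often a field is actually filled in, grouped by the year the policy was issued. The sketch below does that with plain Python. The field names (`issue_year`, `national_id`) and the function `completeness_by_year` are hypothetical, and the records are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Policy = Dict[str, Optional[str]]


def completeness_by_year(policies: List[Policy], field: str) -> Dict[int, float]:
    """Return, per issue year, the share of policies where `field` is filled."""
    filled = defaultdict(int)
    total = defaultdict(int)
    for policy in policies:
        year = int(policy["issue_year"])
        total[year] += 1
        # Treat None and empty strings as missing.
        if policy.get(field):
            filled[year] += 1
    return {year: filled[year] / total[year] for year in sorted(total)}


# Hypothetical records (not real data).
policies = [
    {"issue_year": "2004", "national_id": None},
    {"issue_year": "2004", "national_id": ""},
    {"issue_year": "2019", "national_id": "XXXX-1"},
    {"issue_year": "2019", "national_id": None},
]
rates = completeness_by_year(policies, "national_id")
print(rates)
```

If older issue years show low completeness for a field, a cross-sell model that relies on that field will quietly ignore or misjudge the longest-standing customers.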

Data Relevance

For a useful, actionable analytic output, an analyst needs a good amount of data covering a reasonable period of time to build and test his hypothesis and to identify the variables, parameters, or factors affecting the given outcome, before the results are declared a business-actionable hypothesis or model. Sometimes even these simple requirements are not met.

Natural Disasters Causing Unpredictable Results


India experienced frequent earthquakes between 12th April 2020 and 19th June 2020, yet the resulting losses are as unpredictable as the earthquakes themselves. Insurance companies and data scientists struggle with big unanswered questions such as:

1. Premium: should the Fire and Marine premium for Delhi NCR be increased, or should the status quo be maintained?

2. If a change is required, when should it take effect, and should it apply only for a specific period, such as when the risk is high?


Cyclones, which happen infrequently, are another natural disaster that is hard to predict. Odisha, India, experienced Cyclone Amphan in May 2020. It was the first ‘super’ storm since 1999, killing 72 people in West Bengal and causing over $13 billion of damage. Cyclone Nisarga was expected to hit Mumbai but made landfall elsewhere on the western coastline. It was the first cyclone in 129 years to make landfall on the western Indian coastline instead of heading in the usual direction of Oman.

This cyclone pattern is the opposite of the earthquake scenario. Data scientists know the factors that cause these events and have their historical data. Still, the events have not occurred frequently enough to build a stable model that can be converted into actionable business solutions for defining the risk territory or area.
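
A rough way to see why a handful of events cannot support a stable model is to look at the uncertainty of a simple frequency estimate. The sketch below assumes, purely for illustration, that events arrive as a Poisson process with a constant annual rate. Under that assumption the rate estimate is the event count divided by the years observed, and its relative standard error is approximately 1 divided by the square root of the count. The function name `poisson_rate_estimate` and the counts are hypothetical.

```python
import math


def poisson_rate_estimate(event_count: int, years_observed: float):
    """Estimate an annual event rate and its approximate standard error.

    Assumes events follow a Poisson process with a constant rate.
    """
    if event_count <= 0 or years_observed <= 0:
        raise ValueError("need at least one event and a positive observation window")
    rate = event_count / years_observed
    std_error = math.sqrt(event_count) / years_observed
    relative_error = std_error / rate  # equals 1 / sqrt(event_count)
    return rate, std_error, relative_error


# Hypothetical counts over the same 100-year window (not real data).
for count in (2, 50):
    rate, se, rel = poisson_rate_estimate(count, 100.0)
    print(f"{count} events: rate={rate:.2f}/yr, relative error={rel:.0%}")
```

With only a few events on record, the uncertainty is of the same order as the estimate itself, which is why pricing or territory decisions built on such a rate tend to be unstable.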

Solutions to Consider

To solve challenges like the cyclone example above, enterprises are looking at start-up companies that specialize in AI solutions. Organizations are partnering with them, investing in them, or both, to support them financially and to develop the right skill set. There are many advantages to this kind of partnership.

Core Competencies

One of the core competencies of start-up companies that specialize in AI-based solutions is data science, whereas the core competency of an insurance company, for example, is risk management. Do insurance companies hire data scientists? Absolutely. But does the environment within an insurance company give the data scientist room for the lateral thinking and experimentation required to solve specific business problems?

Another core competency of start-up companies specializing in AI-based solutions is industry domain experience. Organizations can benefit from partnering with AI-based solution providers that specialize in the insurance industry, drawing on the experience those providers have gained from working with multiple insurers.

  • Lateral Thinking and Experimentation: The environment itself motivates and rewards people who think out of the box and are willing to experiment. Data scientists not only experiment on new problem statements but also keep challenging and reviewing existing production implementations. They work without the fear of failure. This continually enhances the value of the enterprise with which the start-up is associated. This culture of risk-taking and exploring alternative solutions is not found in every organization, especially insurance companies.
  • Problem solver, not the Problem Stater: They are not only problem solvers but are also willing to challenge the existing implementation. In this environment, the data scientist is not obsessed with predicting loss events but takes an unorthodox route and starts to think like an entrepreneur. He asks: if the loss cannot be prevented (damage to crops resulting in huge claims), how can AI be used to avoid crop loss and increase land productivity? This kind of attitude and solution addresses the problem of rising claims in the crop insurance sector, a social-sector product that is already cross-subsidized.
  • Learning and Development: The organization can develop a talent pool through continuous upskilling and training sessions. Because the head count is small, the cost remains under control, and the process is effectively managed and monitored for learning abilities and skills retained. This training is both functional (business) and technical.
  • Technical skills: In these start-up organizations, talent is hired and then nurtured very carefully. On the technical side, they are not just coders who know how to code a given algorithm for every solution; they have spent time studying what the model is trying to achieve and how it behaves in the background. They can decipher the model outputs in statistical terms and convert them into business results with the help of business experts. This skill is sharpened by years of hard work in the formal educational system and on-the-job training.


When you consider the effort and time required to build an in-house data science team, it is no surprise that more organizations are partnering with AI specialist start-ups. These start-ups complement the internal capabilities of an organization to speed up the pace of digitalization and enhance the customer experience through personalized content and omnichannel presence.


Arun Agarwal
Arun is a senior executive with over 24 years of experience in the BFSI sector. He has worked with leading insurers like Aviva and Pramerica Life and has led teams towards the achievement of the organization's long- and short-term business goals. As an insurance domain expert, Arun has made significant contributions by driving excellence across operations, sales management, compensation policies and execution, financial planning and business strategy, and MIS and regulatory reporting functions.
