“Change is not only likely, it’s inevitable.” – Barbara Sher

What is Data Shift or Data Drift?

Given human nature, the data we collect will naturally change over time. Behavior and preferences, and the data that record them, can change quickly and drastically. This is even more relevant today, as the Covid-19 pandemic has forced unprecedented changes on businesses. For a data science practitioner, the stability of data and its sources is essential to developing and maintaining robust ML (machine learning) solutions. Changes, or drift, in data will degrade the performance of predictive models.
In recent years, ML algorithms have advanced to the point where they can learn accurate and complex patterns, provided that large labeled datasets are available. However, many ML implementations fail to generalize when they encounter new data points, especially ones whose patterns or conditions differ from those seen in the training samples.
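To make the idea of drift concrete, here is a minimal sketch of one common way to compare training data with newly arriving data: the two-sample Kolmogorov-Smirnov statistic, which measures the largest gap between two empirical distributions. This technique is not prescribed by the article; it is used here only as an illustration. The function name `ks_statistic`, the synthetic Gaussian samples, and the sample sizes are all assumptions made for this example.

```python
import bisect
import random


def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of two samples."""
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    max_gap = 0.0
    for value in ref_sorted + cur_sorted:
        # Fraction of each sample that is less than or equal to `value`.
        ref_cdf = bisect.bisect_right(ref_sorted, value) / len(ref_sorted)
        cur_cdf = bisect.bisect_right(cur_sorted, value) / len(cur_sorted)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


# Synthetic, purely illustrative data: one feature observed at training
# time, a new batch from the same source, and a batch whose mean has shifted.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(1000)]
same_source = [rng.gauss(0.0, 1.0) for _ in range(1000)]
drifted = [rng.gauss(1.0, 1.0) for _ in range(1000)]

stable_score = ks_statistic(training, same_source)
drift_score = ks_statistic(training, drifted)
print(f"same source: {stable_score:.3f}, drifted: {drift_score:.3f}")
```

A score near 0 suggests the two samples look alike, while a score near 1 suggests they barely overlap. Where to draw the line between "stable" and "drifted" depends on the feature and the application, so any threshold would need to be chosen and validated for the data at hand.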
The insurance industry in India is expected to reach US $280 billion by 2020-2021. The life insurance industry is expected to grow by 14-15% annually over the next three to five years. With such rapid growth in the life insurance market, the number of fraud cases is also expected to increase. Recent reports suggest that fraud consumes more than 8.5% of the revenue that the industry generates.