In recent years, Machine Learning (ML) algorithms have advanced to the point where they can learn accurate and complex patterns, provided large, labeled datasets are available. However, many ML implementations fail to generalize to new data points, especially those whose patterns or conditions differ from the training samples.
Models are normally developed on carefully constructed datasets. Still, when models are deployed in real-world applications, there are many scenarios where a model is ill-prepared to make accurate predictions. Conventional machine learning algorithms are developed in isolation, making them vulnerable to changes in data distribution or risk patterns. Transfer learning (TL) addresses these challenges by leveraging already acquired knowledge to improve and generalize learning in a new task.
The process of learning and the transfer of knowledge are central to the development of our species. The ability to transfer knowledge across tasks is an integral part of being human: we reuse acquired knowledge to solve similar tasks, and the more related the tasks, the easier it is to leverage what we have already learned. Transfer learning focuses on this aspect of learning to build intelligent AI solutions.
Introduction to Transfer Learning
In simple terms, transfer learning means using knowledge gained from one task in another task. That knowledge can be transferred in different ways, such as reusing a pre-trained ML model directly on a new task, reusing learned feature representations from previous experience, or learning from simulations and applying the acquired knowledge to real-world problems.
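As a concrete illustration of the second approach, here is a minimal sketch of reusing a pre-trained network as a fixed feature extractor. The library (PyTorch/torchvision), the model (ResNet-18), and the input file name are illustrative assumptions, not a prescribed implementation.

```python
# Sketch: reuse an ImageNet-pretrained network as a fixed feature extractor.
import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image

# Load a network pre-trained on ImageNet and drop its classification head.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()  # inference mode; the backbone itself is not retrained

# Standard ImageNet preprocessing.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("sample.jpg").convert("RGB")  # hypothetical input image
with torch.no_grad():
    features = backbone(preprocess(image).unsqueeze(0))  # shape: [1, 512]
# `features` can now feed a simple classifier trained on the new task's
# (possibly small) labeled dataset.
```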
The last two decades have seen tremendous progress in supervised ML techniques, both in predictive capability and industry adoption. Supervised algorithms have improved significantly and have now attained maturity. Many machine learning experts believe the next phase of ML development will focus on transfer learning and its adoption across a wide range of domains, and some argue that transfer learning is the key to attaining Artificial General Intelligence (AGI).
Use Cases for Transfer Learning
The past few years have seen significant growth in the use of transfer learning techniques in the fields of natural language processing and computer vision, especially image classification. The adoption of transfer learning in industry, however, has been mainly restricted to the following niche problems.
Image Classification
Machine learning models developed to classify images are complex, and training them is computationally expensive. With the development of deep convolutional networks, image classification has attained human-level performance. These deep neural networks normally require large labeled datasets and expensive computational resources to train robust classifiers. In many scenarios, however, labeled data is expensive to obtain and developers have limited computational resources.
ImageNet is one of the largest open-source datasets of annotated photographs (14M+ images with more than 21K classes) intended for computer vision research. The ImageNet Large Scale Visual Recognition Challenge, or ILSVRC, is an annual competition that uses subsets from the ImageNet dataset and is designed to foster the development and benchmarking of state-of-the-art algorithms.
This challenge has pushed back the frontiers of computer vision and produced architectures that rival or outperform humans on specific benchmarks. The state-of-the-art models developed for the ImageNet challenge are open source and can be reused in computer vision applications.
Researchers and modelers working on image classification projects use these state-of-the-art models as a starting point and fine-tune them with the data at hand. This yields robust classifiers from a limited labeled dataset and improves the generalizability of custom models.
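A minimal sketch of this fine-tuning workflow, assuming PyTorch with a torchvision ResNet-18, a hypothetical `train_loader` over the small labeled dataset, and an assumed five-class task:

```python
# Sketch: fine-tune only a new classification head on a pretrained backbone.
import torch
import torch.nn as nn
import torchvision.models as models

num_classes = 5  # assumption: the custom task has 5 categories

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained backbone so only the new head is updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the ImageNet head (1000 classes) with one sized for the new task.
model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):                   # a few epochs often suffice
    for images, labels in train_loader:  # hypothetical DataLoader
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Freezing the backbone keeps training cheap and avoids overfitting the small dataset; with more data, later backbone layers can be unfrozen and trained at a lower learning rate.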
Natural Language Processing
Text classification is one of the core areas of Natural Language Processing (NLP): building machines that can, for example, distinguish a noun from a verb or detect sentiment in user reviews. Humans can deal with text intuitively, but manual processing is neither scalable nor efficient, so smart algorithms are needed to perform these tasks. Text classification is a complex problem and requires advanced algorithms trained on a large corpus of text data to build a robust classification engine. The introduction of pre-trained models has transformed the practical application of NLP and democratized the development of ML applications.
The adoption of transfer learning through pre-trained models has been a breakthrough for developing strong language models with limited datasets. It also reduces the cost and time required to train such complex models. A data scientist can pick a state-of-the-art pre-trained language model and fine-tune it; this requires little labeled data and makes the approach versatile across many business problems. Multi-purpose pre-trained NLP models include ULMFiT (Universal Language Model Fine-tuning), Google's BERT, and OpenAI's GPT-2.
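As a sketch of what such fine-tuning looks like in practice, the following assumes the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint; the example texts and labels are hypothetical stand-ins for a real labeled sample:

```python
# Sketch: fine-tune a pre-trained language model for sentiment classification.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # e.g. positive vs. negative sentiment

texts = ["Great claims experience", "Renewal process was frustrating"]
labels = torch.tensor([1, 0])  # hypothetical sentiment labels

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few passes over the (tiny) labeled sample
    outputs = model(**batch, labels=labels)  # loss computed internally
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Because the pre-trained weights already encode general language understanding, only a small labeled sample and a low learning rate are needed to adapt the model to a specific classification task.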
Challenges of Extending Transfer Learning to Conventional Predictive Analytics
Homogeneity in input parameters, the cost of labeled data, and computational complexity have motivated the accelerated use of transfer learning in computer vision and NLP. Even though many businesses have fast-tracked data science and the use of predictive models, a significant proportion of these ML solutions are still based on the conventional paradigm of standalone models developed using in-house data. Often there is little to no transfer of knowledge across ML solutions within the same organization.
For example, an insurer normally employs multiple ML models, each targeting a specific business case: underwriting models predicting negative proposal risk, claim risk models addressing portfolio risk, persistency models improving renewals, revival and surrender models addressing customer churn, claim models identifying fraud, product recommendation models for cross-selling, and so on. More often than not, these are developed as standalone models, and the only transfer of knowledge is the use of similar input predictors across models. The availability of abundant data, the ease of implementing standalone ML models, nonconformity in predictor variables, and differences in business end-users are some of the factors preventing organizations from experimenting with transfer learning.
Conclusion
Research should focus on developing solutions that leverage transfer learning in solving traditional business problems. An evolved transfer learning architecture for conventional problems would help businesses build AI solutions that operate consistently across the industry, adapt to change, improve the understanding of risk influencers, streamline ML solutions, improve adoption, create more robust ML models, and more.
Businesses should start small, exploring the possibility of building AI solutions that can communicate with one another and, over time, advance towards a large, intelligent network (a "brain") capable of assisting businesses with multi-faceted challenges.
Interested in learning how Aureus Analytics is using transfer learning to better understand the sentiment of customer feedback? Click the link below to schedule a demo.