

What’s in an Algorithm Name?

Data Science Willy, as mentioned in a previous blog, is the Data Science cousin of William Shakespeare, and he loves to use and abuse terms and concepts from the literature of his great cousin to drive home points related to Data Science. More often than not, he is annoying in being reverential. For instance, Data Science Willy would be at absolute loggerheads with literature’s The Great William Shakespeare over the question “What’s in a name?”, especially if we look at “What’s in an algorithm?” or “What’s in an algorithm’s name?”.

What’s in a name? Literature Will posed this very question through Juliet. Juliet, we know, is in love and is finding philosophical means to ignore the raging conflict of loving a member of a family with whom her own family has a feud drenched in blood and animosity.

Humans are named mostly in their infancy, when their major accomplishments include being true to the primal nature of a mammal: howling, crying, etc. The name bestowed on them is more aspirational and predictive, and the only “scientific” help comes from astrology. I have always held the view that horoscopes from various cultures were the first formal predictive models based on behavioral analysis (but this is a topic for another boring blog).

Romeo literally means a citizen of Rome, which makes sense since the tragedy is set in Verona, Italy, definitely once a part of the Roman Empire. Juliet means “youthful,” a beautiful but fairly short-sighted name. Imagine the ironic plight of an old woman named “Juliet.” Now, before die-hard fans of Literature Will pull out their daggers, I will switch to the Data Science Willy context.

Naming the Algorithm


We need to be clear that algorithms constitute predictive models, but the two terms cannot be used interchangeably. A model is much more than an algorithm but cannot do without an algorithm. If the predictive model is a car, then the algorithm is the engine, and data is the fuel.

Machine learning algorithms, unlike human babies, are carefully named based on established (not predictive) facts of what they do and how they work. For example, the decision tree is a supervised machine learning algorithm that can be used to create various models. As the name suggests, a decision tree is a tree-like structure organized into branches and terminal nodes (leaves) that represent decision paths and decisions (class labels or values).
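To make the tree-like structure concrete, here is a minimal sketch in Python. The tree, its feature names (`age`, `income`), and its thresholds are hypothetical and written by hand rather than learned from data; the point is only to show branches, leaves, and the decision path followed for one record.

```python
# A hand-written (hypothetical) decision tree: internal nodes test a feature
# against a threshold, leaves hold the decision.
tree = {
    "feature": "age", "threshold": 30,
    "left": {"leaf": "no"},
    "right": {
        "feature": "income", "threshold": 50000,
        "left": {"leaf": "no"},
        "right": {"leaf": "yes"},
    },
}

def decide(node, record, path=()):
    """Walk from the root to a leaf; return the decision and the path taken."""
    if "leaf" in node:
        return node["leaf"], list(path)
    goes_left = record[node["feature"]] <= node["threshold"]
    step = f'{node["feature"]} {"<=" if goes_left else ">"} {node["threshold"]}'
    branch = node["left"] if goes_left else node["right"]
    return decide(branch, record, path + (step,))

decision, decision_path = decide(tree, {"age": 42, "income": 80000})
```

Here `decision` is the leaf reached and `decision_path` lists the branch taken at each node along the way.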

Through a machine learning algorithm, this tree-like structure is derived from existing data that contains known decisions (supervised machine learning with labeled training data) and is then used to make decisions for data records where a prediction is needed. When the decision is a class label, we have a classification tree. When the decision is a numeric value, we have a regression tree. The umbrella term covering both approaches is CART (Classification and Regression Trees). The decision path that leads to a node is the key.
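As a rough illustration of how one node of such a tree can be derived from labeled data, the sketch below searches a single feature for the threshold that leaves the purest groups. It uses Gini impurity for class labels and variance for numeric values, which are common choices in CART-style trees; the toy data and function names are assumptions made for this example, and a real implementation would repeat the search recursively over many features.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(xs, ys, impurity):
    """Return (threshold, weighted impurity) of the best split on one feature."""
    pairs = sorted(zip(xs, ys))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: customer age, a class label, and a numeric target.
ages = [22, 25, 31, 38, 45, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]
spend = [10.0, 12.0, 11.0, 40.0, 42.0, 41.0]

class_split = best_split(ages, bought, gini)     # classification tree node
value_split = best_split(ages, spend, variance)  # regression tree node
```

The same search serves both kinds of tree; only the impurity measure changes with the type of decision.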

In complicated systems, the number of possible decision paths is so large that a single tree is not sufficient, and a group of trees proves more effective. It is not too difficult to guess the name of the resultant approach; it has something to do with the woods, a synonym perhaps; yes, forest it is. Since the formation of trees from data is randomized using various approaches (for example, by training each tree on a random sample of the records and features), the algorithm is named “Random Forest.”
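The idea of many randomized trees voting together can be sketched as follows. This is a simplified illustration, not a full Random Forest: each "tree" is a one-split stump, the randomness comes only from bootstrap sampling of the records (a real Random Forest also samples features at each split), and the data and function names are made up for the example.

```python
import random
from collections import Counter

def train_stump(rows):
    """Fit a one-split tree: pick the threshold with the fewest training errors."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label

def train_forest(rows, n_trees, rng):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]
        forest.append(train_stump(sample))
    return forest

def predict(forest, x):
    """Majority vote across all trees in the forest."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (age, bought?)
rows = [(22, "no"), (25, "no"), (31, "no"), (38, "yes"), (45, "yes"), (52, "yes")]
forest = train_forest(rows, n_trees=25, rng=random.Random(0))
```

Individual stumps can be wrong when their sample happens to be lopsided; the vote across the forest is what smooths that out.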

Naming the Approach or Model

Sometimes the people who devised an approach, or who provided the sound theoretical base in mathematics on which it was ultimately built, are honored by having their name included in the algorithm's name. For example, Thomas Bayes is revered in Naïve Bayes by calling him naïve. The bad joke being forgiven, the naivety refers to the assumption that the features are independent of one another given the class, and not to anything else.
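The "naïve" assumption shows up directly in the arithmetic: the score for a class is its prior multiplied by one separate factor per feature. The sketch below is a minimal categorical version with made-up weather data and hypothetical function names; it omits smoothing, so it assumes every queried feature value was seen during training for at least one class.

```python
from collections import Counter, defaultdict

def train(rows):
    """rows: list of (features_tuple, label). Returns simple count tables."""
    class_counts = Counter(label for _, label in rows)
    feature_counts = defaultdict(Counter)  # (position, label) -> value counts
    for features, label in rows:
        for i, value in enumerate(features):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts

def posterior(model, features):
    """P(label | features) under the naive independence assumption."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for i, value in enumerate(features):
            # Naive step: each feature contributes its own factor,
            # P(value | label), regardless of the other features.
            score *= feature_counts[(i, label)][value] / count
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

# Hypothetical toy data: (outlook, temperature) -> play outside?
rows = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "hot"), "yes"),
    (("sunny", "mild"), "yes"),
    (("rainy", "mild"), "yes"),
]
model = train(rows)
result = posterior(model, ("sunny", "mild"))
```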

The name of Hidden Markov models does not imply that Andrey Markov is hidden somewhere in the models; it reflects the fact that the underlying system is assumed to be a Markov process whose states cannot be observed directly. Andrey Markov defined the Markov chain/process; therefore, every other approach based on Markov chains honors Andrey by using his name in the approach. Markov chains have a wide range of applications in machine learning: hidden Markov models have seen much success in areas such as speech recognition, and Markov decision processes underpin reinforcement learning.
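What "hidden" means can be shown with the standard forward algorithm, which computes how likely a sequence of observations is without ever seeing the states that produced it. The states, observations, and probabilities below are illustrative assumptions chosen for this sketch, not values from any real system.

```python
# Hidden states (never observed directly) and their probabilities.
states = ("rainy", "sunny")
start = {"rainy": 0.6, "sunny": 0.4}
transition = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
# What we can observe, given the hidden state.
emission = {
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

def forward(observations):
    """Likelihood of the observations, summed over all hidden state paths."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[prev] * transition[prev][s] for prev in states) * emission[s][obs]
            for s in states
        }
    return sum(alpha.values())

likelihood = forward(["walk", "shop"])
```

The Markov part is the `transition` table, where the next state depends only on the current one; the hidden part is that only the activities, never the weather, are observed.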

Data Science Willy, I am sure, is happy that machine learning algorithms are not named like human babies. Their names are factual rather than doting and aspirational. Sometimes they are so factual that the name completely describes the algorithm, at the cost of totally alienating a non-tech audience (defined as people who are alien to this science; so, it makes sense!). A few examples:

  1. Three hidden-layer fully connected feedforward artificial neural network: the name is a whole sentence in itself
  2. Long short-term memory (LSTM): should we forget this, or should we not remember this, because it is easy but confusing
  3. Latent Dirichlet Allocation: This is an excellent topic to model for this naming conundrum
  4. VAE-GAN (Variational Autoencoder Generative Adversarial Network): Healthy rivalry between adversarial networks, if you deem so.
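To show how literally the first name in the list above describes its algorithm, here is a minimal sketch of a forward pass through such a network. The layer sizes, the random untrained weights, and the ReLU activation are assumptions chosen for illustration; nothing here is trained.

```python
import random

def make_network(layer_sizes, rng):
    """Fully connected: every unit links to every unit of the next layer."""
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(network, inputs):
    """Feedforward: activations flow layer by layer, with no loops back."""
    activations = inputs
    for index, (weights, biases) in enumerate(network):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(network) - 1
        # ReLU on the hidden layers, raw values on the output layer.
        activations = z if is_output else [max(0.0, v) for v in z]
    return activations

# 4 inputs, three hidden layers of 8 units each, 2 outputs (hypothetical sizes).
layer_sizes = [4, 8, 8, 8, 2]
network = make_network(layer_sizes, random.Random(0))
outputs = forward(network, [0.5, -1.0, 2.0, 0.0])
```

Each word of the name maps to a line of code: "three hidden-layer" is the three middle entries of `layer_sizes`, "fully connected" is the dense weight matrix, and "feedforward" is the single pass through the loop.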

Conclusion

Coming to the leading protagonist of literature’s The Great Will’s tragedy Romeo and Juliet: yes, Romeo, we are talking about him. His name should never be used in the same sentence in which the word learning appears (and I break the rule while stating it). Romeo never learned the lesson of love’s hopeless, helpless but heartful “happinesslessness” (new word) from his unexpressed love for Rosaline (yup, if you read your Romeo and Juliet well, you will know that there was Rosaline before Juliet), and if he had outlived Juliet, then there would probably have been a Jasmine.

However, Data Science Willy learns a lot from The Great Will, so Willy restates:

“Algorithms may fall when there’s no strength in data.”

“Prediction is a smoke made with fumes of past patterns.”

“These analytical delights have predictive ends.”


Nitin Purohit
Nitin is CTO and co-founder at Aureus. With over 15 years of experience in leveraging technology to drive and achieve top-line and bottom-line numbers, Nitin has helped global organizations optimize value from their significant IT investments. Over the years, Nitin has been responsible for the creation of many product IPs. Prior to this role at Aureus, Nitin was the Global Practice Head for Application Services at Omnitech Infosolutions Ltd and was responsible for sales and profitability of offerings from application services across geographies.
