What’s in an Algorithm Name?

Data Science Willy, as mentioned in a previous blog, is the Data Science cousin of William Shakespeare, and he loves to use and abuse terms and concepts from the literature of his great cousin to drive home points related to Data Science. More often than not, he is annoying in being reverential. For instance, Data Science Willy would be at absolute loggerheads with literature’s The Great William Shakespeare over the question “What’s in a name?”, especially if we look at “What’s in an algorithm?” or “What’s in an algorithm’s name?”.

What’s in a name? Literature Will posed this very question through Juliet. Juliet, we know, is in love and is finding philosophical means to ignore the raging conflict of loving a member of a family with which her own has been locked in a feud drenched in blood and animosity.

Humans are named mostly in their infancy, when their major accomplishments include being true to the primal nature of a mammal: howling, crying, etc. The name bestowed on them is more aspirational and predictive, and the only "scientific" help comes from astrology. I have always held the view that horoscopes from various cultures were the first formal predictive models based on behavioral analysis (but this is a topic for another boring blog).

Romeo is usually glossed as “pilgrim to Rome” (or simply “Roman”), which makes sense since the tragedy is set in Verona, Italy, once a part of the Roman Empire. Juliet means “youthful,” a beautiful but fairly short-sighted name. Imagine the ironic plight of an old woman named “Juliet.” Now, before die-hard fans of Literature Will pull out their daggers, I will switch to the Data Science Willy context.

Naming the Algorithm

We need to be clear that algorithms constitute predictive models, but the two terms cannot be used interchangeably. A model is much more than an algorithm but cannot do without an algorithm. If the predictive model is a car, then the algorithm is the engine, and data is the fuel.

Machine learning algorithms, unlike human babies, are carefully named based on established (not predictive) facts of what they do and how they work. For example, the decision tree is a supervised machine learning algorithm that can be used to create various models. As the name suggests, a decision tree is a tree-like structure organized into branches and terminal nodes (leaves) that represent decision paths and decisions (class labels or values).

Through a machine learning algorithm, this tree-like structure is derived from existing data that contains known decisions (supervised machine learning with labeled training data) and is then used to make decisions for data records where a prediction is needed. When the decision is a class label, we have a classification tree. When the decision is a numeric value, we have a regression tree. The umbrella term that covers both approaches is CART (Classification and Regression Trees). The decision path that leads to a leaf node is the key.
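To see how literally the name applies, here is a minimal sketch of a classification tree in plain Python. It assumes numeric features and uses Gini impurity to pick splits, which is a common choice in CART-style trees but only one of several possible criteria. The toy data (age, yearly premium, renewal decision) and all function names are hypothetical and exist only for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Find the (feature, threshold) pair with the lowest weighted Gini impurity."""
    best = None  # (impurity, feature index, threshold)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send records down both branches
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Return a leaf (a class label) or a branch dict describing one decision."""
    can_split = depth < max_depth and len(set(labels)) > 1
    split = best_split(rows, labels) if can_split else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the decision path from the root down to a leaf."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical toy data: [age, yearly premium] -> did the customer renew?
X = [[25, 300], [30, 320], [45, 800], [50, 850], [35, 400], [60, 900]]
y = ["no", "no", "yes", "yes", "no", "yes"]
tree = build_tree(X, y)
```

Calling `predict(tree, [28, 310])` walks a single decision path (here, one test on age) and ends at a leaf. A regression tree would keep the same structure but store an average value in each leaf instead of a class label.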

In complicated systems, the number of possible decision paths is so large that a single tree is not sufficient, and a group of trees proves more effective. It is not too difficult to guess the name of the resultant approach; it has something to do with the woods, a synonym perhaps; yes, forest it is. Since the formation of trees from data is randomized in various ways (for example, by sampling both the records and the features each tree sees), the algorithm is named “Random Forest.”
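The "random" and the "forest" can both be shown in a few lines. The sketch below is a deliberately simplified illustration, not a full Random Forest: each "tree" is a one-split stump, trained on a bootstrap sample of the records and on one randomly chosen feature, and the forest predicts by majority vote. The function names, the toy data, and the choice of 25 trees are assumptions made for the example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature):
    """Fit a one-split tree on a single feature; fall back to a constant leaf."""
    best = None  # (impurity, threshold, left label, right label)
    for t in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, t, majority(left), majority(right))
    if best is None:  # every sampled record had the same feature value
        label = majority(labels)
        return {"feature": feature, "threshold": float("inf"), "left": label, "right": label}
    return {"feature": feature, "threshold": best[1], "left": best[2], "right": best[3]}

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]   # sample records with replacement
        feature = rng.randrange(len(rows[0]))            # pick one feature at random
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Let every tree vote and return the most common answer."""
    votes = Counter(
        s["left"] if row[s["feature"]] <= s["threshold"] else s["right"] for s in forest
    )
    return votes.most_common(1)[0][0]

# Hypothetical toy data: [age, yearly premium] -> did the customer renew?
X = [[25, 300], [30, 320], [45, 800], [50, 850], [35, 400], [60, 900]]
y = ["no", "no", "yes", "yes", "no", "yes"]
forest = fit_forest(X, y)
```

Any single stump can be wrong because it saw an unlucky sample; the vote across the whole forest is what makes the prediction stable. Real implementations grow deeper trees and sample a subset of features at every split rather than once per tree.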

Naming the Approach or Model

Sometimes the people who devised an approach, or who provided the sound theoretical base in mathematics that ultimately led to it, are honored by having their name included in the algorithm. For example, Thomas Bayes is revered in Naïve Bayes by calling him naïve. The bad joke being forgiven, the naivety refers to the assumption that the features are independent of one another given the class, and not to anything else.
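The "naïve" part is easy to spot in code: the score for a class is just the class prior multiplied by one independent term per feature. Below is a minimal sketch for categorical features, assuming add-one (Laplace) smoothing so that unseen values do not zero out a class. The weather-style toy data and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def fit_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, feature index) -> counts of values
    vocab = defaultdict(set)             # feature index -> set of values seen in training
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(y, i)][v] += 1
            vocab[i].add(v)
    return class_counts, value_counts, vocab

def predict_naive_bayes(model, row):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        log_p = math.log(n_c / total)  # class prior P(c)
        for i, v in enumerate(row):
            # The naive step: each feature contributes P(v | c) independently.
            # Add-one smoothing keeps unseen values from producing log(0).
            log_p += math.log((value_counts[(c, i)][v] + 1) / (n_c + len(vocab[i])))
        scores[c] = log_p
    return max(scores, key=scores.get)

# Hypothetical toy data: (outlook, wind) -> what we did that day.
X = [("sunny", "calm"), ("sunny", "windy"), ("rainy", "windy"),
     ("rainy", "calm"), ("sunny", "calm"), ("rainy", "windy")]
y = ["play", "play", "stay", "stay", "play", "stay"]
model = fit_naive_bayes(X, y)
```

For a sunny, windy day the smoothed scores work out to 0.5 × 0.8 × 0.4 = 0.16 for "play" against 0.5 × 0.2 × 0.6 = 0.06 for "stay", so `predict_naive_bayes(model, ("sunny", "windy"))` picks "play".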

The name Hidden Markov model does not imply that Andrey Markov is hidden somewhere in the model; it reflects the fact that the underlying system is assumed to be a Markov process whose states cannot be observed directly. Andrey Markov defined the Markov chain/process; therefore, every other approach based on Markov chains honors Andrey by using his name in the approach. Markov chains have a wide range of applications in machine learning: hidden Markov models have seen much success in areas such as speech recognition, while the closely related Markov decision process underpins reinforcement learning.
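What "hidden" means becomes concrete in the forward algorithm, which computes how likely a sequence of observations is while summing over every possible sequence of unobserved states. The sketch below is a minimal version of it; the two weather states, the three activities, and all the probabilities are made-up numbers chosen for illustration.

```python
def hmm_likelihood(observations, states, start_p, trans_p, emit_p):
    """Probability of an observation sequence under an HMM (forward algorithm)."""
    # alpha[s] = probability of the observations so far AND being in hidden state s now
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical model: the weather is hidden, we only observe a friend's activity.
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emit_p = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

likelihood = hmm_likelihood(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
print(round(likelihood, 6))  # 0.033612
```

The Markov part is the `trans_p` table: the next hidden state depends only on the current one. The hidden part is that `alpha` never commits to a single state; it carries a probability for each of them forward through the sequence.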

Data Science Willy, I am sure, is happy that machine learning algorithms are not named like human babies. Their names are factual rather than doting and aspirational. Sometimes they are so factual that the name completely describes the algorithm, at the cost of totally alienating a non-tech audience (defined as people who are alien to this science; so, it makes sense!). A few examples:

  1. Three hidden-layer fully connected feedforward artificial neural network – this was the name encompassing the whole sentence
  2. Long short-term memory (LSTM): should we forget this or should we not remember this because it is easy but confusing
  3. Latent Dirichlet Allocation: This is an excellent topic to model for this naming conundrum
  4. VAEGAN (Variational Autoencoder Generative Adversarial Network): Healthy rivalry between adversarial networks; if you deem so.
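To show how much of the first name above is a literal specification, here is a minimal sketch that builds exactly that: three hidden layers, fully connected, feedforward. The layer widths (4 inputs, 8 units per hidden layer, 2 outputs), the ReLU activation, and the random weight initialization are all assumptions for illustration; the name itself fixes none of them.

```python
import random

def init_network(layer_sizes, seed=0):
    """One weight matrix and bias vector per pair of consecutive layers."""
    rng = random.Random(seed)
    return [
        {
            # fully connected: every unit has one weight per unit in the previous layer
            "weights": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            "biases": [0.0] * n_out,
        }
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]

def feed_forward(network, inputs):
    """Push the inputs through the layers in one direction, with no loops back."""
    activations = inputs
    for depth, layer in enumerate(network):
        z = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(layer["weights"], layer["biases"])
        ]
        is_output = depth == len(network) - 1
        # ReLU on the hidden layers, raw values on the output layer
        activations = z if is_output else [max(0.0, v) for v in z]
    return activations

def count_parameters(network):
    """Total number of weights and biases in the network."""
    return sum(
        len(layer["biases"]) + sum(len(row) for row in layer["weights"])
        for layer in network
    )

# input layer, three hidden layers, output layer (sizes are hypothetical)
layer_sizes = [4, 8, 8, 8, 2]
network = init_network(layer_sizes)
outputs = feed_forward(network, [0.1, 0.2, 0.3, 0.4])
print(count_parameters(network))  # 202
```

Three hidden layers plus an output layer means four weight matrices, which is why `len(network)` is 4 and not 3.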

Conclusion

Coming to the leading protagonist of literature’s The Great Will’s tragedy Romeo and Juliet, yes Romeo, we are talking about him. His name should never be used in the same sentence in which the word learning appears (and I break the rule while stating it). Romeo never learned the lesson of love’s hopeless, helpless but heartful “happinesslessness” (new word) from his unrequited love for Rosaline (yup, if you read your Romeo and Juliet well, you will know that there was Rosaline before Juliet), and had he survived Juliet, there would probably have been a Jasmine.

However, Data Science Willy learns a lot from The Great Will, so Willy restates:

“Algorithms may fall when there’s no strength in data.”

“Prediction is a smoke made with fumes of past patterns.”

“These analytical delights have predictive ends.”


Nitin Purohit
Nitin is CTO and co-founder at Aureus. With over 15 years of experience in leveraging technology to drive and achieve top-line and bottom-line numbers, Nitin has helped global organizations optimize value from their significant IT investments. Over the years, Nitin has been responsible for the creation of many product IPs. Prior to this role at Aureus, Nitin was the Global Practice Head for Application Services at Omnitech Infosolutions Ltd and was responsible for sales and profitability of offerings from application services across geographies.
