Data Science Willy, as mentioned in a previous blog, is the Data Science cousin of William Shakespeare, and he loves to use and abuse terms and concepts from his great cousin's literature to drive home points related to Data Science. More often than not, he is annoyingly reverential. For instance, Data Science Willy would be at absolute loggerheads with literature's Great William Shakespeare over the question "What's in a name?", especially if we rephrase it as "What's in an algorithm?" or "What's in an algorithm's name?".
What's in a name? Literature Will posed this very question through Juliet. Juliet, we know, is in love and is searching for philosophical means to ignore the raging conflict of loving a member of a family with whom her own family is locked in a feud drenched in blood and animosity.
Humans are named mostly in their infancy, when their major accomplishments include being true to the primal nature of a mammal: howling, crying, and so on. The name bestowed on them is therefore aspirational and predictive, the only "scientific" help coming from astrology. I have always held the view that horoscopes across cultures were the first formal predictive models based on behavioral analysis (but that is a topic for another boring blog).
Romeo literally means a citizen of Rome, which makes sense since the tragedy is set in Verona, Italy, a city once very much part of the Roman world. Juliet means "youthful," a beautiful but fairly short-sighted name. Imagine the ironic plight of an old woman named "Juliet." Now, before die-hard fans of Literature Will pull out their daggers, I will switch to the Data Science Willy context.
We need to be clear that algorithms underpin predictive models, but the two terms cannot be used interchangeably. A model is much more than an algorithm, yet it cannot do without one. If the predictive model is a car, then the algorithm is the engine, and data is the fuel.
Machine learning algorithms, unlike human babies, are carefully named based on established (not predicted) facts about what they do and how they work. For example, the decision tree is a supervised machine learning algorithm that can be used to create various models. As the name suggests, a decision tree is a tree-like structure organized into branches and terminal nodes (leaves) that represent decision paths and decisions (classes or values).
Through the machine learning algorithm, this tree-like structure is derived from existing data containing known decisions (supervised machine learning with labeled training data) and is then used to make decisions for records where a prediction is needed. When the decision is a class, we have a classification tree. When the decision is a value, we have a regression tree. The umbrella term covering both approaches is CART (Classification and Regression Trees). The decision path that leads to a leaf is the key.
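If you want to see those branches and leaves in code, here is a minimal sketch using scikit-learn's DecisionTreeClassifier. The library choice and the tiny toy dataset are my assumptions for illustration, not something from the original post.

```python
# A minimal sketch of a classification tree with scikit-learn.
# The toy dataset below is invented purely for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features: [age, annual income in thousands]; label: 1 = bought, 0 = did not buy
X = [[25, 40], [35, 60], [45, 80], [20, 20], [50, 120], [30, 30]]
y = [0, 1, 1, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=2, random_state=42)
tree.fit(X, y)

# The printed branches and leaves are exactly the "decision paths" described above.
print(export_text(tree, feature_names=["age", "income"]))
print(tree.predict([[40, 70]]))  # classify a new record
```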
In complicated systems, the number of possible decision paths is so large that a single tree is not sufficient, and a group of trees proves more effective. It is not too difficult to guess the name of the resulting approach; it has something to do with the woods, a synonym perhaps; yes, forest it is. Since the trees are grown from data in a randomized way (each tree sees a random sample of records and features), the algorithm is named "Random Forest."
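Continuing the same toy example, here is a hedged sketch of a Random Forest in scikit-learn; again, the data is invented and only meant to show the idea of many randomized trees voting together.

```python
# A minimal sketch of a Random Forest with scikit-learn; the toy data is invented.
# Each tree is grown on a bootstrapped sample of rows and a random subset of
# features, and the forest combines their votes into one prediction.
from sklearn.ensemble import RandomForestClassifier

X = [[25, 40], [35, 60], [45, 80], [20, 20], [50, 120], [30, 30]]  # [age, income]
y = [0, 1, 1, 0, 1, 0]

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X, y)
print(forest.predict([[40, 70]]))  # the forest's majority vote for a new record
```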
Sometimes the people who devised an approach, or who provided the theoretical foundation that ultimately led to it, are honored by having their name included in the algorithm. For example, Thomas Bayes is "revered" in Naïve Bayes by being called naïve. Bad joke aside, the naïveté refers to the assumption that the features are independent of one another, and to nothing else.
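To make the "naïve" assumption concrete, here is a minimal sketch using scikit-learn's GaussianNB on the same invented toy data; the model scores each feature independently and multiplies the evidence together.

```python
# A minimal sketch of Naïve Bayes with scikit-learn; the toy data is invented.
# "Naïve" = each feature is treated as conditionally independent given the class.
from sklearn.naive_bayes import GaussianNB

X = [[25, 40], [35, 60], [45, 80], [20, 20], [50, 120], [30, 30]]  # [age, income]
y = [0, 1, 1, 0, 1, 0]                                             # bought or not

nb = GaussianNB()
nb.fit(X, y)
print(nb.predict([[40, 70]]))        # predicted class
print(nb.predict_proba([[40, 70]]))  # class probabilities
```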
The name Hidden Markov Model does not imply that Andrey Markov is hidden somewhere in the model, but that the underlying system is assumed to have unobservable states evolving as a Markov process. Andrey Markov defined the Markov chain/process; therefore, every approach built on Markov chains honors Andrey by carrying his name. Markov chains have a wide range of applications in machine learning: Markov decision processes underpin reinforcement learning, while hidden Markov models have seen much success in speech recognition and other sequence-modeling problems.
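Here is a tiny sketch of the "hidden" part using the classic forward algorithm in plain numpy. The states, observations, and probabilities are all invented for illustration: we never see the weather (the hidden states), only whether someone carries an umbrella, yet the model can still score the observed sequence.

```python
# A toy Hidden Markov Model sketch: hidden weather states, observed umbrella use.
# All probabilities below are invented purely for illustration.
import numpy as np

states = ["Sunny", "Rainy"]            # hidden states we never observe directly
observations = [0, 1, 1]               # 0 = no umbrella, 1 = umbrella (observed)

start = np.array([0.6, 0.4])           # P(first hidden state)
trans = np.array([[0.7, 0.3],          # P(next state | current state)
                  [0.4, 0.6]])
emit = np.array([[0.9, 0.1],           # P(observation | hidden state)
                 [0.2, 0.8]])

# Forward algorithm: probability of the observed sequence under the model,
# summing over every possible sequence of hidden states.
alpha = start * emit[:, observations[0]]
for obs in observations[1:]:
    alpha = (alpha @ trans) * emit[:, obs]
print("P(observations) =", alpha.sum())
```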
Data Science Willy, I am sure, is happy that machine learning algorithms are not named like human babies. They are factual rather than doting and aspirational. Sometimes they are so factual that the name completely describes the algorithm at the cost of totally alienating the non-tech audience (defined as people who are alien to this science; so, it makes sense!). A few examples: Least Absolute Shrinkage and Selection Operator (LASSO), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and t-distributed Stochastic Neighbor Embedding (t-SNE).
Coming to the leading protagonist of literature's Great Will's tragedy Romeo and Juliet: yes, Romeo, we are talking about him. His name should never be used in the same sentence as the word learning (and I break that rule while stating it). Romeo never learned the lesson of love's hopeless, helpless but heartful "happinesslessness" (new word) from his unexpressed love for Rosaline (yup, if you read your Romeo and Juliet well, you will know there was a Rosaline before Juliet), and had he outlived Juliet, there would probably have been a Jasmine.
However, Data Science Willy learns a lot from The Great Will, so Willy restates:
“Algorithms may fall when there’s no strength in data.”
“Prediction is a smoke made with fumes of past patterns.”
“These analytical delights have predictive ends.”
Interested in learning how Aureus can help you leverage machine learning to predict your customers' behavior? Click on the link below to get more information.