Machine Learning Use Cases and Choosing The Right Algorithm
Machine Learning has taken the world by storm, as one of the technologies which will become commonplace in the future. But what is Machine Learning really?
Machine Learning (ML) is one of the most rapidly advancing forms of technology. It aims to convert information (data) into knowledge that can be used to make informed decisions. In this blog, we will look at the definition of machine learning, some of its applications, and how to choose the right algorithm based on our requirements.
Machine learning is a subset of Artificial Intelligence (AI) that consists of various algorithms capable of learning from the data being fed to them without being explicitly programmed for a task. This ability to learn from data allows the algorithms to create models that can solve complex data problems by finding patterns in historical data and improving as new data is fed to them.
There are different ML algorithms, which use different approximations to solve a task (such as probability functions), but the key element is that they can consider a very large number of variables for a particular data problem, which can make the final model better at the task than a person working through the same data by hand. Models built with ML algorithms are designed to find patterns in the input data so that those patterns can be used to make informed predictions in the future.
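To make "finding patterns in historical data" concrete, here is a minimal sketch of one of the simplest possible models: a straight line fitted by ordinary least squares and then used to predict the next value. The data (`months`, `units`) and the function name `fit_line` are made up for illustration and are not from any particular library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: month index vs. units sold.
months = [1, 2, 3, 4, 5]
units = [12, 14, 16, 18, 20]

# "Learning": the parameters come from the data, not from hand-written rules.
slope, intercept = fit_line(months, units)

# "Prediction": apply the learned pattern to a month we have not seen yet.
forecast = slope * 6 + intercept
print(f"Forecast for month 6: {forecast:.1f} units")
```

Nothing in the code states how sales grow; the slope and intercept are derived from the observations, and refitting with new data updates them. Real ML algorithms follow the same idea with far more variables and more flexible models.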
Some of the popular tasks that can be solved using ML algorithms are price/demand predictions, product/service recommendations, and data filtering, among others. The following is a list of real-life examples of such tasks:
• On-demand price prediction: Companies whose services vary in price according to demand can use ML algorithms to predict future demand and determine whether they will have the capability to meet it. For instance, in the transportation industry, if future demand is low (low season), the price for flights will drop. On the other hand, if demand is high (high season), flights are likely to increase in price.
• Recommendations in entertainment: By analyzing your music listening habits, and those of others in your demographic, ML algorithms can construct models capable of suggesting new records that you may like. The same applies to video streaming applications and online bookstores.
• Email filtering: ML has been used for a while now in the process of filtering incoming emails to separate spam from your desired emails. Lately, it also has the capability to sort unwanted emails into more categories, such as social and promotions.
Given the diverse use cases mentioned here, it is important to highlight that, when developing Machine Learning solutions, there is often more than one way to solve a data problem, much like there is no single algorithm that fits all data problems. Additionally, there is a large number of algorithms in the field of ML, and choosing the right one for a given data problem is often what separates outstanding models from mediocre ones.
The following steps can help narrow down the algorithms to just a few:
Understand your data: Since data is the key to developing any ML solution, the first step should always be to understand it, so that you can filter out any algorithm that is unable to process such data. For instance, by looking at the number of features and observations in your dataset, you can determine whether you need an algorithm capable of producing good results with a small dataset. What counts as a small dataset, in terms of instances and features, depends on the data problem, the number of outputs, etc. Moreover, by understanding the types of fields in your dataset, you will also be able to determine whether you need an algorithm capable of working with categorical data.
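As a rough sketch of this first step, the snippet below counts observations and features and flags the categorical fields in a tiny dataset. The records and the helper name `summarize` are hypothetical, and treating every text-valued field as categorical is a simplifying assumption.

```python
def summarize(rows):
    """Return the number of observations, features, and the categorical columns."""
    columns = list(rows[0].keys())
    # Simplifying assumption: a column holding any text value is categorical.
    categorical = [
        name for name in columns
        if any(isinstance(row[name], str) for row in rows)
    ]
    return {
        "observations": len(rows),
        "features": len(columns),
        "categorical": categorical,
    }


# Hypothetical customer records, one dictionary per observation.
rows = [
    {"age": 34, "plan": "basic", "monthly_spend": 20.5},
    {"age": 51, "plan": "premium", "monthly_spend": 74.0},
    {"age": 27, "plan": "basic", "monthly_spend": 18.0},
]

summary = summarize(rows)
print(summary)
```

A summary like this is usually enough to rule out algorithms that need far more observations than you have, or that cannot handle categorical fields without extra encoding.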
Categorizing the data problem: In this step, as the diagram below shows, you should analyze your input data to determine whether or not it contains a target feature (a feature whose values you want to model and predict).
Datasets with a target feature are known as labelled data and are handled using supervised learning (A) algorithms. On the other hand, datasets without a target feature are known as unlabelled data and are handled using unsupervised learning (B) algorithms.
Moreover, the output data (the form of output that you expect from the model) also plays a key role in determining the algorithms to be used. If the output from the model needs to be a continuous number, the task to be solved is a regression problem (C). On the other hand, if the output is a discrete value (a set of categories, for instance), the task at hand is a classification problem (D). Finally, if the output is a subgroup of observations, the process to be performed is a clustering task (E):
Figure: Demonstrating the division of tasks
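The division of tasks described above can be written down as a small decision function. This is only a sketch of the rules in this section; the function name `categorize_problem` and the `"continuous"` / `"discrete"` labels are assumptions made for illustration.

```python
def categorize_problem(has_target, output_kind=None):
    """Map a dataset description to the task types described in this section.

    has_target:  True if the dataset contains a target feature (labelled data).
    output_kind: "continuous" or "discrete"; only used for labelled data.
    """
    if not has_target:
        # Unlabelled data (B): the output is a grouping of observations (E).
        return "unsupervised learning: clustering"
    if output_kind == "continuous":
        # Labelled data (A) with a continuous numeric output (C).
        return "supervised learning: regression"
    if output_kind == "discrete":
        # Labelled data (A) with a categorical output (D).
        return "supervised learning: classification"
    raise ValueError("output_kind must be 'continuous' or 'discrete' for labelled data")


# Predicting a price, labelling an email as spam, grouping similar customers.
price_task = categorize_problem(has_target=True, output_kind="continuous")
spam_task = categorize_problem(has_target=True, output_kind="discrete")
segment_task = categorize_problem(has_target=False)
print(price_task, spam_task, segment_task, sep="\n")
```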
Choose a set of algorithms: Once the preceding process is completed, we can shortlist the algorithms that can handle the input data and are able to arrive at the desired outcome. Depending on your resources and time limitations, you should choose from this shortlist the algorithms that you want to test on your data problem, keeping in mind that it is always good practice to try more than one.
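Trying more than one algorithm usually means training each candidate on the same data and comparing them on data they have not seen. The sketch below does this for two deliberately simple, hand-written candidates (a mean baseline and a one-nearest-neighbour predictor) using mean absolute error. The data, the candidate models, and the choice of metric are all assumptions for illustration; in practice the candidates would come from your shortlist.

```python
def mean_model(train):
    """Candidate 1: always predict the average target seen in training."""
    average = sum(y for _, y in train) / len(train)
    return lambda x: average


def nearest_neighbour_model(train):
    """Candidate 2: predict the target of the closest training point."""
    def predict(x):
        nearest = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return predict


def mean_absolute_error(model, data):
    """Average absolute difference between predictions and true targets."""
    return sum(abs(model(x) - y) for x, y in data) / len(data)


# Hypothetical (feature, target) pairs, split into training and held-out data.
train = [(1, 10.0), (2, 12.0), (3, 14.0), (4, 16.0), (6, 20.0)]
test = [(5.5, 19.0), (7, 22.0)]

candidates = {
    "mean baseline": mean_model,
    "nearest neighbour": nearest_neighbour_model,
}

# Train every candidate on the same data and score it on the held-out set.
scores = {
    name: mean_absolute_error(build(train), test)
    for name, build in candidates.items()
}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: MAE = {score:.2f}")
print("Best candidate:", best)
```

Scoring on held-out data rather than on the training data matters here: a model that simply memorises the training set would otherwise look perfect.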
We hope these use cases and steps help you choose the right machine learning algorithm for your next project.



