Machine Learning In a Nutshell: When, Why and How
Marketer turned IT project manager
Machine learning (ML) and artificial intelligence (AI) have gained significant traction in recent years, and today, nearly every emerging startup is trying to leverage these technologies to attract funding and disrupt traditional markets. And it’s true that companies using “AI” and “ML” as buzzwords in their pitch are more likely to attract external investment than their counterparts working with traditional, mainstream tech.
But still, hype aside, how applicable is machine learning to solving real-life, everyday problems, and when does it make sense to use it instead of, or together with, traditional software programming? Let’s start exploring the issue by describing the various types of machine learning and their basic principles.
Machine Learning vs Traditional Programming
To better understand how machine learning works, let’s look at how it differs from traditional programming.
First of all, machine learning does not replace traditional programming, and a software developer will never use machine learning algorithms to create a website. Usually, machine learning and artificial intelligence complement standard programming methods rather than completely replace them. For example, ML can be used to build predictive algorithms for an online trading platform, while the platform’s UI, data visualization and other components will be implemented in a mainstream programming language such as Ruby, Python, or Java.
The rule of thumb: only use machine learning when traditional programming methods are not effective/feasible for solving a particular problem.
To illustrate this, let’s consider a classical machine learning problem, exchange rate forecasting, and see how it can be approached with each technique.
Traditional programming approach
In traditional programming, to build a solution, software developers need to create an algorithm and write code. Then they set the input parameters, and the implemented algorithm produces the expected result.
Let’s say we need to predict the currency exchange rate. To complete this task, the algorithm can use a variety of input parameters such as:
- yesterday’s rate;
- yesterday’s rates of other currencies;
- economic changes in the country that issues the currency;
- changes in the global economy, etc.
Thus, using traditional programming, we handcraft a solution that is able to accept a set of parameters and, based on the input data, predict a new exchange rate.
The main problem here is that it is extremely difficult for a person to work with a large number of parameters, while a limited set of parameters allows building only a very basic, unscalable model.
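For illustration, here is what such a handcrafted predictor might look like. Everything in this sketch, the parameter names and the weights alike, is an invented assumption; the point is that a human has to pick every coefficient by hand.

```python
# A handcrafted, rule-based predictor: every coefficient below is an
# assumption chosen by the programmer, not learned from data.
def predict_rate(yesterday_rate, other_rates_change, economy_change):
    """Toy hand-written forecast: start from yesterday's rate and
    apply fixed, manually chosen adjustments."""
    rate = yesterday_rate
    rate += 0.5 * other_rates_change   # arbitrary hand-picked weight
    rate += 0.3 * economy_change       # arbitrary hand-picked weight
    return rate

print(round(predict_rate(1.10, 0.02, -0.01), 3))  # -> 1.107
```

Adding a fourth or fifth input means inventing yet another weight by hand, which is exactly why this approach stops scaling.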
Machine learning approach
To solve the same problem using machine learning methods, data engineers take a completely different approach. Instead of developing an algorithm on their own, they collect an array of historical data that will be used for semi-automated model building.
After collecting a sufficient set of data, data specialists feed it into various machine learning algorithms. The outcome is a model that can predict a new result by receiving new data as input.
Data scientists can use various “adjustments” to tweak the learning algorithm and obtain different models. The model that produces the best result goes into production.
Using the finished model is similar to what we get as a result of traditional programming. The model receives input data and produces a result.
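As a minimal sketch of this workflow, the toy example below "learns" a one-feature model from invented historical data using ordinary least squares, instead of hand-picking coefficients. All numbers are made up for illustration.

```python
# Instead of hand-picking coefficients, we learn them from historical data.
# Toy example: one-feature linear regression (ordinary least squares).
def fit_linear(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Historical data: yesterday's rate -> today's rate (made-up numbers)
history_x = [1.10, 1.12, 1.11, 1.15, 1.14]
history_y = [1.11, 1.12, 1.13, 1.14, 1.15]

slope, intercept = fit_linear(history_x, history_y)
# The "model" is just the learned slope and intercept; using it is
# identical to calling a traditionally programmed function.
prediction = slope * 1.13 + intercept
```

The collect-data / fit / predict loop shown here is the same regardless of how many features or how sophisticated an algorithm we plug in.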
The main difference between traditional programming and machine learning is that in machine learning we don’t need to build a model ourselves. This task is performed by ML algorithms; all a data engineer has to do is make minor adjustments to the algorithm’s settings.
Another important difference lies in the number of input parameters the model is capable of processing. For a correct weather forecast in a particular location, you would theoretically need to enter thousands of parameters that affect the result; it is practically impossible to handcraft an algorithm that uses all of them sensibly. For machine learning, there are no such limitations. As long as you have enough processing power and memory, you can use as many input parameters as you believe necessary.
Types of machine learning and when to use each
Traditionally, machine learning is divided into supervised, unsupervised and reinforcement learning. Let’s see how they work and in what cases they apply.
Supervised ML is the most widely used and popular type. The basic idea is that you specify a set of input parameters and a result you expect to get. Thus, you teach the algorithm to provide correct answers.
For supervised ML, the data must be labeled: along with the input parameters, it should contain the correct answers, aka “labels”. For example, to forecast the exchange rate, the known exchange rate value serves as the label.
Simply put, to create a model using supervised algorithms, we need to ask questions and provide answers. After the model has been built, we can feed it new questions and get answers.
Supervised machine learning can solve two types of problems: classification and regression analysis.
Classification in supervised machine learning
Classification tasks are very common. Such algorithms answer the question of whether something is included in a limited set of answers or not. Let’s say, we have an image and we need to identify an object on it. Is there a cat in the image? And what about a dog? Or when it comes to medical diagnosis, classification determines whether or not the patient has a particular disease.
Using a classification model to identify an object
Do you need to identify a particular object in an image, determine whether a review is positive or negative, or tell spam from legitimate email? The classification model is the way to go!
The basic algorithm is similar. We need a set of images, texts or data and a set of correct answers for each of them. The machine learning algorithm gets these questions along with answers and builds a model. In the future, the supervised model can do the classification on new data independently.
The standard limitation of classification algorithms is that they can only answer the questions they were trained on. For example, if you provided a set of images with cats and tagged them as containing cats, the final model will be able to identify cats in new images. But it won’t be able to identify any dogs in them.
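To make the idea concrete, here is a deliberately simplified classifier: a nearest-centroid model trained on invented, hand-labeled feature vectors. Real image classification uses far richer features and algorithms, but the train-on-labels / predict-on-new-data loop is the same.

```python
# Toy classification: a nearest-centroid classifier trained on labeled
# examples. Features and labels here are invented for illustration.
def train(samples):
    """samples: list of (features, label). Returns per-label centroids."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def classify(centroids, features):
    # Pick the label whose centroid is closest to the new point
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, features))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Labeled training data: (ear_pointiness, snout_length) -> animal
labeled = [([0.9, 0.2], "cat"), ([0.8, 0.3], "cat"),
           ([0.3, 0.8], "dog"), ([0.2, 0.9], "dog")]
model = train(labeled)
print(classify(model, [0.85, 0.25]))  # -> cat
```

Note that the model can only ever answer "cat" or "dog": a rabbit would still be forced into one of the labels it was trained on, which is exactly the limitation described above.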
Regression analysis in supervised ML
Classification algorithms work only for those cases where we have a limited set of possible results. They are not suitable for cases where the result should be a number that we are trying to predict.
Let’s return to our example of the exchange rate. We have a number of input parameters and need to predict the numerical value of the exchange rate, which can take any of an unlimited range of values.

Such problems are solved by regression analysis algorithms. The process mirrors the one described above: the data engineer collects data containing input parameters and correct answers, loads it into a regression analysis algorithm, and obtains a trained model. Having built the model, we can use it to predict new values from new input parameters.
In general, the classification and regression analysis algorithms are very similar and only differ in the potential results they can produce.
The most common use cases for regression analysis are:
- stock price forecasting;
- currency exchange rate forecasting;
- real estate estimation;
- used car price estimation;
- energy consumption prediction;
- forecasting the demand for goods in retail networks;
- estimation of lots in auctions;
- waiting time estimation, etc.
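As a toy sketch of one of these use cases (real estate estimation), the example below fits a two-parameter linear regression with gradient descent. The features, prices, and learning settings are all invented for illustration.

```python
# Regression with several input parameters, fitted by gradient descent.
def fit(X, y, lr=0.02, epochs=20000):
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            # Error of the current model on this training example
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Nudge the weights against the average gradient
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Invented data: apartment price (in $1000s) from (rooms, km_from_center)
X = [[1, 9], [3, 4], [5, 7], [7, 2]]
y = [7.5, 14.0, 16.5, 23.0]

w, b = fit(X, y)
price = sum(wj * xj for wj, xj in zip(w, [4, 5])) + b  # 4 rooms, 5 km out
```

Unlike classification, the output here is an unbounded number: the model happily predicts a price it has never seen in the training data.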
Unsupervised Machine Learning
Unsupervised machine learning is trying to find answers in unlabeled data. In other words, we provide some data, but we do not specify the right answers. Therefore, this type of ML is called “unsupervised” and it implies that the algorithm independently figures something out, without any prior training.
That being said, many unsupervised ML techniques do not produce a reusable predictive model; instead, the algorithm works on the input data directly.
In unsupervised ML, there are three main categories of algorithms:
- association rule learning;
- clustering;
- dimensionality reduction.

Association rule learning: Apriori
The Apriori algorithm is a very popular solution for association problems. It finds objects or concepts that most often occur together; the familiar “customers who bought this item also bought…” functionality can be implemented using variations of this algorithm.
The principle of the algorithm is simple: we feed in information about which products appear together in different baskets, and Apriori reveals the most common combinations of products.
This information is useful for retailers: to increase sales, they can place such goods together on the shelf or bundle them into a discounted set.
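The counting idea at the heart of this can be sketched in a few lines. The basket contents and the support threshold below are invented; real Apriori additionally prunes candidate itemsets level by level, which this sketch omits.

```python
# A minimal "bought together" sketch: count item pairs across baskets
# (the core counting step behind Apriori-style association analysis).
from itertools import combinations
from collections import Counter

baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

pair_counts = Counter()
for basket in baskets:
    # sorted() makes ("bread", "butter") and ("butter", "bread") identical
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep only pairs meeting a minimum support threshold of 3 baskets
frequent = [pair for pair, n in pair_counts.items() if n >= 3]
print(frequent)  # -> [('bread', 'butter')]
```

Note that no labels were needed: the structure (“bread and butter go together”) emerges from the raw baskets alone.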
Clustering: K-Means

Clustering allows you to divide data into clusters. One of the most popular algorithms in this category is the K-Means method: for instance, it can group golf courses into classes based on their characteristics.

We only need to provide the input data, and the algorithm groups it on its own. A simple example may use just two parameters, but in reality the grouping can involve many more components.
Clustering has many use cases:
- grouping similar articles in Google News;
- market segmentation for targeting different groups of customers;
- analysis of social graphs to identify groups of friends (in social networks);
- clustering of objects by a set of properties, etc.
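A bare-bones K-Means loop can be sketched as follows. The points and starting centroids are invented for illustration; production code would use a library implementation with smarter initialization.

```python
# A minimal K-Means sketch in pure Python (2D points, k clusters).
def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: (p[0] - centroids[i][0]) ** 2
                                    + (p[1] - centroids[i][1]) ** 2)
            clusters[idx].append(p)
        # Update step: move each centroid to its cluster's mean
        centroids = [(sum(p[0] for p in c) / len(c),
                      sum(p[1] for p in c) / len(c)) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [(1, 1), (1.5, 2), (1, 1.5),    # one visual group
          (8, 8), (8.5, 9), (9, 8.5)]    # another visual group
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

Again, no labels are involved: we never told the algorithm which point belongs where, only how many groups to look for.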
Dimensionality reduction: PCA
In some complex machine learning tasks, hundreds or even thousands of input parameters are quite common. Processing this amount of data strains computing resources. Is it possible to reduce the amount of input data without significant loss of information?
Principal Component Analysis, or PCA, fulfills this task well. Its main idea: find a way to transform, say, a two-dimensional representation of the data into a one-dimensional one. So, instead of two input parameters x and y, it creates a new single parameter k, a projection from 2D to 1D that preserves as much of the variation in the data as possible.
In practice, with thousands of input parameters, PCA can often reduce their number by a factor of 5-10.
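A minimal 2D-to-1D PCA sketch, using the closed-form principal direction of a 2x2 covariance matrix; the data points are invented for illustration.

```python
# PCA from 2D to 1D in pure Python: find the direction of maximum
# variance and project every point onto it.
import math

points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2),
          (3.1, 3.0), (2.3, 2.7), (2.0, 1.6), (1.0, 1.1)]

n = len(points)
mx = sum(p[0] for p in points) / n
my = sum(p[1] for p in points) / n

# Covariance matrix entries of the centered data
cxx = sum((p[0] - mx) ** 2 for p in points) / n
cyy = sum((p[1] - my) ** 2 for p in points) / n
cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n

# For a 2x2 covariance matrix the principal direction has a closed form
theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
ux, uy = math.cos(theta), math.sin(theta)

# Each 2D point collapses to a single coordinate k along that direction
k = [(p[0] - mx) * ux + (p[1] - my) * uy for p in points]
```

The variance along the new coordinate k exceeds the variance along either original axis, which is exactly what "keeping the most information" means here.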
Reinforcement Learning (RL)
What’s often labeled as “artificial intelligence” is usually just machine learning in essence. But reinforcement learning is an exception: most of the time, RL pursues the AI objective of creating an agent that can perform effective actions in a given environment. RL algorithms use a reward as feedback for the actions performed and try to maximize it.
Reinforcement learning began to gain traction after the famous match between the AlphaGo system developed by Google DeepMind and the Go champion Lee Sedol. The AlphaGo system was created using RL algorithms, and even its first version posed a serious challenge to any human player. The next version, AlphaZero, reached a level of play unattainable for humans. A distinctive feature of AlphaZero is that it learned by playing against itself, without using human games for training.
At the moment, much of the research in RL is aimed at building artificial intelligence for various classic video games without providing a description of the rules of the game. In other words, at first the artificial intelligence knows nothing about the gaming environment except which few actions are available. By applying these actions, the AI gets a response from the game and adjusts itself through the mechanism of rewards and punishments.
In addition to computer games, reinforcement learning is very popular for training robots. The main difficulty in applying RL to robotics is that it is very hard to simulate the real world with the required accuracy: the resulting AI can perform tasks perfectly in a virtual environment but is almost unusable in the real one.
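As a tiny illustration of the reward-feedback loop, the sketch below runs tabular Q-learning on an invented five-cell corridor "game": the agent knows only two actions and learns, from rewards alone, that it should always move right.

```python
# Tabular Q-learning on a toy 1D corridor of 5 cells. Reaching the
# rightmost cell gives reward +1; every other step costs -0.01.
import random

N_STATES, ACTIONS = 5, [-1, +1]          # move left or move right
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration

random.seed(0)
for _ in range(500):                     # training episodes
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy: mostly exploit current knowledge, sometimes explore
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: q[(s, a)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s2 == N_STATES - 1 else -0.01
        best_next = max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
        s = s2

# After training, the greedy policy moves right in every non-terminal state
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

Nobody told the agent the "rules": it discovered the winning behavior purely from the rewards and punishments its actions triggered, which is the essence of RL.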
In this article, we looked at three types of machine learning: supervised, unsupervised, and reinforcement. Each of them has areas of practical application in real-world conditions and its own distinctive features.
Supervised ML is by far the most developed and applicable form of machine learning to date. To implement it in practice, you need a task that can be formulated as a problem of classification or regression analysis, as well as a sufficient set of labeled data. Now there are dozens of ready-made classical algorithms for machine learning, as well as various Deep Learning algorithms for solving more complex problems, such as image, text, and voice processing.
On the other hand, unsupervised machine learning is much less applicable in real life. While associative algorithms help in analyzing data for retail and online stores, clustering and dimensionality reduction are more commonly used as an auxiliary tool for supervised ML.
Today, a lot of research focuses on how neural networks can help recognize complex patterns in unlabeled data, and it could potentially lead to a breakthrough: given only raw, arbitrary data, unsupervised learning algorithms might be able to discover non-trivial dependencies or even complex underlying laws.
Reinforcement learning is a very promising tool for solving problems that previously only a human being could handle. Currently, most research is concentrated on RL usage in various types of games, and the main obstacle to using RL in practice is the high complexity of the real world.
This article is just a starting point in our discussion about different types of machine learning and how they can supplement traditional software programming. At 8allocate, we strive to share our knowledge and experience gained through working with various ML and AI methods on clients’ custom development projects. I plan to review different ML and AI tools and libraries in my next article, so stay tuned with 8allocate!