What is Machine Learning? – Beginners Edition

Ever heard of Machine Learning? Ever wondered what it means?

Well, you’re at the right place to find out.

So what is Machine Learning?

“Machine learning is teaching computers to learn from data and to improve with experience – instead of being explicitly programmed to do so.”

So empirical learning, correct?

Let’s look at an example. Suppose you, as a human being, read this series of figures:

1    1
2    3
3    6
4    10
5    15
6    ??

You inferred 21, right?  Why?

Because:

1    1     1
2    3     2+1
3    6     3+2+1
4    10    4+3+2+1
5    15    5+4+3+2+1
6    21    6+5+4+3+2+1
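
The rule you just inferred can be written down in a few lines. The sketch below is only an illustration in Python; the function names triangular and triangular_closed_form are our own choice and do not come from any library. The output for an input n is the sum of all whole numbers from 1 to n, which also equals n(n+1)/2.

```python
def triangular(n):
    """Return 1 + 2 + ... + n, the rule a human infers from the table."""
    return sum(range(1, n + 1))


def triangular_closed_form(n):
    """Same value via the well-known shortcut n * (n + 1) / 2."""
    return n * (n + 1) // 2


# Print input, output by summation, and output by the shortcut formula.
for n in range(1, 7):
    print(n, triangular(n), triangular_closed_form(n))
```

Note that this is explicit programming: we wrote the rule ourselves. Machine Learning aims to recover such a rule from the examples alone.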

The idea is that a Machine can also make the exact same inference. You just need to train it.  How?

Enter the following table into the Machine:

Input    Output
1        1
2        3
3        6
4        10
5        15
6        21

Now we enter this table into the Machine Learning platform, for instance, TADA from MyDataModels. And we ask it to infer the missing figure: what is the output when 7 is the input?

We enter the following data: 

Input    Output
1        1
2        3
3        6
4        10
5        15
6        21
7        ??

And we get the following result:

Prediction            Input
0.9999999999999991    1
2.9999999999999982    2
5.9999999999999998    3
9.9999999999999996    4
14.9999999999999996   5
20.9999999999999996   6
27.9999999999999993   7

The ‘prediction’ column is the calculation made by the tool about the output after it has learned from the initial data. Just like you did, the tool inferred that the ‘prediction’ for 7 was 28 (27.99999999).

The computer was not explicitly programmed to deliver this result. It was fed a data set, used as training data, and learned from it.  
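
How TADA works internally is not described on this page, so the following is not TADA's method. It is a minimal sketch of the same idea using a different, classic technique: a least-squares fit of a curve y = a*x^2 + b*x + c to the six training rows, written in plain Python. The choice of a quadratic curve and the names fit_quadratic and predict are assumptions made for this illustration.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c. Returns (a, b, c)."""
    # One feature row [x^2, x, 1] per training example.
    rows = [[x * x, x, 1.0] for x in xs]
    # Normal equations: (X^T X) w = X^T y, stored as an augmented 3x4 matrix.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            m[r] = [mr - factor * mc for mr, mc in zip(m[r], m[col])]
    # Back-substitution.
    w = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        rest = sum(m[i][j] * w[j] for j in range(i + 1, 3))
        w[i] = (m[i][3] - rest) / m[i][i]
    return tuple(w)


def predict(model, x):
    """Evaluate the fitted curve at x."""
    a, b, c = model
    return a * x * x + b * x + c


# Training data: the Input/Output table from the text.
inputs = [1, 2, 3, 4, 5, 6]
outputs = [1, 3, 6, 10, 15, 21]

model = fit_quadratic(inputs, outputs)
print("prediction for 7:", predict(model, 7))
```

Nothing in this code states that the output is the sum of the numbers up to the input; the coefficients are found from the data. Because the arithmetic is done in floating point, the prediction for 7 may come out as a value extremely close to 28 rather than exactly 28, much like the figures in the result table above.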

How can a computer learn anything?

Ouch, this one hurts. You caught me: a computer does not learn, an algorithm learns. 

What is an algorithm?

It is a finite sequence of well-defined steps that a computer program (a piece of software) carries out to solve a problem. Each algorithm has its own traits. Machine Learning algorithms are ‘specialized’ algorithms: their steps adjust an internal model until it fits the data, which is what we mean when we say that they ‘learn’ from data.

Some are quicker than others. Some are more precise than others. Some take more computer room (memory) than others. 

What are the various Machine Learning methods and algorithms?

  • Supervised Learning
  • Unsupervised Learning
  • Semi-supervised Learning

Supervised Machine Learning

It finds patterns (and develops predictive models) using both input data and output data. The output values are also called labels, and data that includes them is called labeled data. The labels are the part of the data that human operators typically have to create. For instance, to train a machine-learning algorithm to recognize the color orange, you need to provide a large number of images, each manually tagged with its color. Without this ‘indication’, a supervised algorithm can neither learn nor guess what the color orange is.

TADA from MyDataModels implements supervised machine learning algorithms. 

It is possible to predict discrete values. 

What is a Discrete Value? Classification.

It is a variable that can take only a finite number of values. Examples of such discrete values are: 0 or 1; boy or girl; win, lose, or draw; black or white; spades, hearts, diamonds, or clubs. On social media, a discrete value can be a ‘like/dislike’ pair.

Let’s take the list of all the football matches played in the first division in France between 2019 and 2020. 

Here is a sample of this list:

Div  Date        Time   HomeTeam     AwayTeam    FTHG  FTAG  FTR
F1   09/08/2019  19:45  Monaco       Lyon        0     3     A
F1   10/08/2019  16:30  Marseille    Reims       0     2     A
F1   10/08/2019  19:00  Angers       Bordeaux    3     1     H
F1   10/08/2019  19:00  Brest        Toulouse    1     1     D
F1   10/08/2019  19:00  Dijon        St Etienne  1     2     A
F1   10/08/2019  19:00  Montpellier  Rennes      0     1     A
F1   10/08/2019  19:00  Nice         Amiens      2     1     H
F1   11/08/2019  14:00  Lille        Nantes      2     1     H
F1   11/08/2019  16:00  Strasbourg   Metz        1     1     D

The last column (FTR, the full-time result) indicates the result: Home win (H), Away win (A), or Draw (D), i.e., three classes. FTHG and FTAG are the goals scored at full time by the home team and the away team. The values in the last column are discrete. We want to experiment with these data in two steps. First, we train the machine learning model with the complete data set, including the results of the matches. This complete set is the ‘learning data.’ Then we hide the outcome of these matches from the algorithm and ask an AI using supervised machine learning to guess them. The resulting machine learning model is called a classification model, or classifier.

Secondly, we remove the ‘result’ (FTR) column from this spreadsheet and enter it into TADA. Then we ask TADA to predict the outcomes using the previously trained model. It does so with excellent accuracy: every prediction in the sample below matches the actual result. The prediction column of the TADA result table shows the predicted home win, away win, or draw, and the probability column shows the probability TADA assigns to that prediction being correct.

prediction  probability         Div  Date     Time   HomeTeam     AwayTeam    FTHG  FTAG  FTR
A           0.9982329634804694  F1   9/8/19   19:45  Monaco       Lyon        0     3
A           0.9771932104599848  F1   10/8/19  16:30  Marseille    Reims       0     2
H           0.972417723256133   F1   10/8/19  19:00  Angers       Bordeaux    3     1
D           0.5996559880288891  F1   10/8/19  19:00  Brest        Toulouse    1     1
A           0.838617438492858   F1   10/8/19  19:00  Dijon        St Etienne  1     2
A           0.7908113276401231  F1   10/8/19  19:00  Montpellier  Rennes      0     1
H           0.7812949951878015  F1   10/8/19  19:00  Nice         Amiens      2     1
H           0.7876152984758582  F1   11/8/19  14:00  Lille        Nantes      2     1
D           0.607046584376302   F1   11/8/19  16:00  Strasbourg   Metz        1     1
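
To make the idea of a prediction with a probability more concrete, here is a minimal sketch of a classifier in plain Python. It is not TADA's algorithm. It swaps in a simple, well-known technique, k-nearest neighbours: to classify a match, it looks at the k most similar matches in the learning data and lets them vote. The training rows are the nine sample matches above (FTHG, FTAG, FTR); the function name predict, the choice k=3, and the use of the vote share as a rough ‘probability’ are assumptions made for this illustration.

```python
from collections import Counter

# Learning data from the sample table: ((FTHG, FTAG), FTR).
TRAINING = [
    ((0, 3), "A"),  # Monaco - Lyon
    ((0, 2), "A"),  # Marseille - Reims
    ((3, 1), "H"),  # Angers - Bordeaux
    ((1, 1), "D"),  # Brest - Toulouse
    ((1, 2), "A"),  # Dijon - St Etienne
    ((0, 1), "A"),  # Montpellier - Rennes
    ((2, 1), "H"),  # Nice - Amiens
    ((2, 1), "H"),  # Lille - Nantes
    ((1, 1), "D"),  # Strasbourg - Metz
]


def predict(features, k=3):
    """Return (predicted class, share of the k nearest neighbours voting for it)."""
    # Sort the learning data by squared Euclidean distance to the new match.
    by_distance = sorted(
        TRAINING,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], features)),
    )
    # The k closest matches vote with their known result.
    votes = Counter(label for _, label in by_distance[:k])
    label, count = votes.most_common(1)[0]
    return label, count / k


# Three hypothetical score lines that are not in the learning data.
for score in [(2, 0), (1, 1), (0, 4)]:
    print(score, predict(score))
```

The classifier is never told that more home goals means a home win. It only compares a new match with the labeled examples it has seen, which is the essence of supervised classification.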

Supervised learning also covers a second category of predictions: continuous values.

What is a Continuous Value? Regression.

A continuous value is a variable that can take any value within a range. For continuous values, the models generated belong to the ‘regression’ category. Examples of such regression models are predicting the weight of a person based on height, age, and gender, or predicting tomorrow’s midday temperature.
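
As a rough illustration of regression, the sketch below fits a straight line, weight = slope * height + intercept, using the standard least-squares formulas. The five height/weight pairs are invented for this example and are not real measurements; the function name fit_line is also our own. A real model would use many more examples and more inputs, such as age and gender.

```python
def fit_line(xs, ys):
    """Simple linear regression by least squares. Returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented toy data (NOT real measurements): heights in cm, weights in kg.
heights = [150, 160, 170, 180, 190]
weights = [45, 54, 63, 72, 81]

slope, intercept = fit_line(heights, weights)

# Predict a continuous value for a height that is not in the learning data.
print("predicted weight at 175 cm:", slope * 175 + intercept)
```

Unlike the football example, the prediction here is not one of a few classes but a number that can fall anywhere on a scale.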

Unsupervised Machine Learning

It finds patterns based only on input data. Learning data is provided, but it is unlabeled. There is no need for an operator to ‘tag’ the data. This technique is useful when you’re not quite sure what to look for. The algorithm discovers groupings on its own, based on similarities in the data.

Here is an example that contrasts the two approaches. It is the last day of school, and all the schools have taken all their classes to the zoo. The children belonging to each class wear the same tee-shirt. A child gets lost, so you ask a machine-learning algorithm to find the child’s class.

What kind of algorithm is going to be used?

The answer is supervised machine learning, because the output, i.e., the result (the class), is indicated by the label, i.e., the tee-shirt.

Another example: same situation, a school day out at the zoo. This time, the students are dressed as usual. They all get lost. We ask a machine-learning algorithm to create ‘classes.’

It will cluster together students with similarities according to its own logic, possibly according to height, weight, or gender. It will not recover the initial classes because it had no information whatsoever about them. But it will provide new insights.
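
The clustering described above can be sketched with k-means, one common unsupervised technique, here in one dimension and in plain Python. The student heights are invented for this example, and the function name kmeans_1d and the way the starting centres are chosen are assumptions made for the illustration. No labels are given; the algorithm only sees the numbers.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group numbers into k clusters (k >= 2). Returns (clusters, centres)."""
    # Deterministic start: spread the initial centres evenly from min to max.
    lo, hi = min(values), max(values)
    centres = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins the cluster with the nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Update step: move each centre to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
    return clusters, centres


# Invented toy data (NOT real measurements): student heights in cm, unlabeled.
heights = [120, 150, 122, 155, 125, 152]

clusters, centres = kmeans_1d(heights, k=2)
print(clusters)
print(centres)
```

The two groups found here separate shorter from taller students. Whether those groups correspond to the original school classes is something the algorithm cannot know, exactly as in the zoo example.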

In a Nutshell

Machine learning provides a higher-level view of data. As a part of artificial intelligence, it can surface insights, inferences, and correlations in a few clicks. Whether implemented using deep neural networks or genetic algorithms (as TADA does), it is used in a wide range of applications. It expands our understanding of the underlying rationales hidden in our data.

