
Get Started with Machine Learning for Beginners: Your First Concrete Step Today

Machine Learning Beginners: The Essentials in One Article — Real Code, Diagrams, and Concrete Steps, Excerpts from a 44-Lesson Course.

REHOUMA Haythem

12 Jun 2026 • 12 min read

The best way to learn Machine Learning for Beginners is by doing. This article gives you a head start with practical excerpts from a 44-lesson course — enough to get your first result today.

tl;dr

Introduction and First Steps
Learning from Data
The Three Main ML Families
Classification vs Regression
First Model with Orange

~$ cat ./parcours.md # Machine Learning Beginners — 10 chapters

Introduction and First Steps

→ Course presentation and what is ML?
→ ML around you — 10 everyday examples
+ 1 more lesson

Learning from Data

→ Data, examples and labels
→ Finding patterns — visual intuition
+ 2 more lessons

The Three Major ML Families

→ Supervised learning — predicting with examples
→ Unsupervised learning — finding groups
+ 2 more lessons

Classification vs Regression

→ Classification — categorizing things
→ Regression — predicting a number
+ 2 more lessons

First Model with Orange

→ Install Orange and interface tour
→ Load a Titanic dataset and explore it
+ 2 more lessons

Evaluating a Model

→ Accuracy — useful but misleading
→ Confusion matrix — reading errors
+ 2 more lessons

Overfitting and Underfitting

→ Underfitting — the model that is too simple
→ Overfitting — the model that learns by heart
+ 2 more lessons

Business Use Cases

→ Marketing — segmentation and anti-churn
→ Finance — credit scoring and fraud
+ 1 more lesson

🏁

Final project (+ 2 chapters along the way)

→ You leave with a concrete and demonstrable project

Training vs test — why split?

NOTE: Objective: understand why you must always split your data into two sets (training and test), how this lets you measure a model's true ability to generalize, and how to avoid the major pitfall of testing on training data.

Learning objectives

TIP: By the end of this module, you will be able to:

Understand the difference between memorizing and generalizing
Know the classic split ratios (80/20, 70/30)
Distinguish training, validation and test sets
Understand cross-validation
Identify the 'data leakage' pitfall

The trap: testing on training data

Imagine a student preparing for an exam. The teacher gives them 50 exercises with solutions and says "study them well". On exam day, the teacher asks the exact same 50 exercises. The student can score 100% without understanding anything: they simply memorized.

This is exactly what happens if you test an ML model on the data it was trained with. An over-parameterized model can "memorize" the examples and reach 100% on the training set while being completely useless on new data.

WARNING: Absolute rule: data used to train a model must never be used to evaluate it. Without a split, your metrics are misleading.
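The student analogy can be reproduced in a few lines. The sketch below is an illustration under assumptions, not course material: the toy task (the label is 1 when the number is even), the `memorizer_predict` function and the data ranges are all hypothetical. The "model" is only a lookup table, so it is perfect on what it has seen and has learned nothing it can reuse.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical toy task: the true rule is "label is 1 when x is even".
def true_label(x):
    return 1 if x % 2 == 0 else 0

train_x = random.sample(range(1000), 50)        # examples the model sees
test_x = random.sample(range(1000, 2000), 50)   # examples it has never seen

# A "model" that memorizes: every training example goes into a lookup table.
memory = {x: true_label(x) for x in train_x}

def memorizer_predict(x):
    # Unseen inputs get a default answer, because no rule was learned.
    return memory.get(x, 0)

def accuracy(xs):
    correct = sum(memorizer_predict(x) == true_label(x) for x in xs)
    return correct / len(xs)

train_accuracy = accuracy(train_x)
test_accuracy = accuracy(test_x)
print(f"train accuracy: {train_accuracy:.2f}")
print(f"test accuracy:  {test_accuracy:.2f}")
```

The gap between the two printed numbers is exactly what an evaluation on training data would hide.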

The solution: the train/test split

The solution is simple: randomly split the dataset into 2 parts before training.

Training set (train)

70 to 80% of the data. Used to train the model. This is the "exercise book with solutions" the student studies.

Test set (test)

20 to 30% of the data. Used to evaluate the model after training. This is the final exam with unseen exercises.
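A random 80/20 split can be sketched with the standard library alone. The function name `split_train_test`, the seed and the 100-row dataset are assumptions made for the illustration.

```python
import random

def split_train_test(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows, then cut it into (train, test)."""
    shuffled = list(rows)                      # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)      # seeded shuffle: same split every run
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

dataset = list(range(100))  # hypothetical dataset: 100 row ids
train_rows, test_rows = split_train_test(dataset, test_ratio=0.2)
print(len(train_rows), len(test_rows))  # 80 20
```

Shuffling before cutting matters: if the file is sorted (by date, by class), a plain cut would put very different rows in each set. In practice, libraries such as scikit-learn provide this step as `train_test_split`.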

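In practice, a third set, the validation set, is often carved out as well, which gives the following split:

Set	Proportion	Role
Train	60–70%	Train the model's parameters
Validation	15–20%	Tune hyperparameters, compare multiple models
Test	15–20%	Final evaluation, performed only once, at the end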

Why three sets? Because if you tune your model by looking at the test results, you end up "over-optimizing" for that specific test: it becomes a form of indirect training.

TIP: Golden rule: the test set must be touched only once, at the very end of the project, to produce the official score. All intermediate experiments are done on the validation set.
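The three roles can be shown with a deliberately small model. Everything in this sketch is an assumption chosen for illustration: the synthetic data (label 1 when x > 6, with 10% label noise), the 60/20/20 split, and a hand-written k-nearest-neighbors classifier whose hyperparameter `k` is chosen on the validation set.

```python
import random

rng = random.Random(7)

# Hypothetical data: one feature x in [0, 10); label is 1 when x > 6, with 10% noise.
def make_row():
    x = rng.uniform(0, 10)
    y = int(x > 6)
    if rng.random() < 0.1:
        y = 1 - y
    return x, y

rows = [make_row() for _ in range(500)]
rng.shuffle(rows)

train = rows[:300]       # 60%: the examples the model relies on
val = rows[300:400]      # 20%: used to choose the hyperparameter k
test = rows[400:]        # 20%: used once, at the very end

def knn_predict(train_rows, x, k):
    """Majority vote among the k training rows closest to x."""
    neighbors = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(y for _, y in neighbors)
    return int(votes * 2 > k)

def accuracy(train_rows, data, k):
    return sum(knn_predict(train_rows, x, k) == y for x, y in data) / len(data)

candidate_ks = [1, 5, 15]
best_k = max(candidate_ks, key=lambda k: accuracy(train, val, k))  # validation only
final_score = accuracy(train, test, best_k)                        # test, once
print(best_k, final_score)
```

All the comparison work happens on `val`; `test` appears on a single line, which is the golden rule expressed in code.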

Cross-validation (k-fold cross-validation)

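The problem with a single train/test split is that the result depends on which rows end up in the test set. An unlucky draw yields a metric that is too pessimistic or too optimistic.

k-fold cross-validation reduces this dependence by averaging over multiple splits: the data is cut into k parts (folds), the model is trained k times, each time holding out a different fold for evaluation, and the k scores are averaged.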
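A minimal sketch of 5-fold cross-validation, using only the standard library. The helper `k_fold_indices`, the synthetic data (label 1 when x > 5) and the toy threshold model are assumptions made for the example; each row is used for evaluation exactly once.

```python
import random
from statistics import mean

def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each row lands in validation exactly once."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k roughly equal parts
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]

# Hypothetical data: one feature, label is 1 when x > 5.
rng = random.Random(1)
rows = [(x, int(x > 5)) for x in (rng.uniform(0, 10) for _ in range(100))]

def fit_threshold(train_rows):
    """Toy model: the threshold is the midpoint between the two class means."""
    mean0 = mean(x for x, y in train_rows if y == 0)
    mean1 = mean(x for x, y in train_rows if y == 1)
    return (mean0 + mean1) / 2

def accuracy(threshold, data):
    return sum(int(x > threshold) == y for x, y in data) / len(data)

scores = []
for train_idx, val_idx in k_fold_indices(len(rows), k=5):
    threshold = fit_threshold([rows[i] for i in train_idx])
    scores.append(accuracy(threshold, [rows[i] for i in val_idx]))

print("fold scores:", scores)
print("mean accuracy:", mean(scores))
```

The spread between fold scores gives a sense of how much a single split could have misled you; the mean is the figure to report.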

Data leakage: the invisible trap

Data leakage is one of the most subtle and most common errors. It occurs when information from the test set "leaks" into the training set, producing artificially good validation results but disastrous production performance.

Typical examples

How to avoid it

WARNING: Characteristic symptom: a model that scores 99% on validation but only 60% in production. It is almost always data leakage.
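One well-known form of leakage is preprocessing fitted on the whole dataset before the split. The sketch below is an assumed example (the feature values and the helper names `fit_scaler` and `transform` are hypothetical): it contrasts standardization statistics computed on train + test with statistics computed on the training set only.

```python
from statistics import mean, pstdev

train_values = [10.0, 12.0, 11.0, 13.0, 9.0]   # hypothetical feature values
test_values = [30.0, 28.0]                      # unseen rows, in a different range

def fit_scaler(values):
    """Learn the statistics needed for standardization (mean, standard deviation)."""
    return mean(values), pstdev(values)

def transform(values, scaler):
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]

# Leaky: the statistics are computed on train + test together,
# so the training pipeline already "knows" something about the test rows.
leaky_scaler = fit_scaler(train_values + test_values)

# Clean: the statistics come from the training set only,
# and the same transformation is then applied to the test set.
clean_scaler = fit_scaler(train_values)
test_scaled = transform(test_values, clean_scaler)

print("leaky mean:", leaky_scaler[0])
print("clean mean:", clean_scaler[0])
```

The rule of thumb this illustrates: anything that is "fitted" (scalers, encoders, feature selection) is fitted on the training set, then merely applied to validation and test data.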

Visualizing the model and its predictions

NOTE: Objective: visualize the trained decision tree and observe its predictions on new passengers, to understand concretely what the model has learned.

Learning objectives

TIP: By the end of this module, you will be able to:

Visualize a tree with the Tree Viewer widget
Read the rules learned by the model
Make predictions with the Predictions widget
Complete the first full workflow

Seeing the tree: the Tree Viewer widget

The great advantage of a decision tree is that you can see it. The Tree Viewer widget draws the tree branch by branch, with its questions and answers.

TIP: This transparency is a major asset. In a professional context, being able to explain why the model makes a given decision is often as important as its accuracy.

Making predictions: the Predictions widget

To apply the model to new cases, use the Predictions widget. It takes two inputs: the trained model and the data to predict.
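Outside Orange, the same two ideas (a tree you can read, and a prediction step that takes a model plus data) can be sketched in plain Python. The tree below is hypothetical and written by hand for illustration; it is not the tree Orange learns from the Titanic data, and the feature names `sex` and `age` are assumptions.

```python
# Hypothetical tree, written by hand; NOT the tree learned by Orange.
tree = {
    "question": ("sex", "==", "female"),
    "yes": {"leaf": "survived"},
    "no": {
        "question": ("age", "<", 10),
        "yes": {"leaf": "survived"},
        "no": {"leaf": "died"},
    },
}

def predict_one(node, passenger):
    """Walk from the root to a leaf, answering one question per node."""
    while "leaf" not in node:
        feature, op, value = node["question"]
        if op == "==":
            answer = passenger[feature] == value
        else:  # the only other operator used in this sketch is "<"
            answer = passenger[feature] < value
        node = node["yes"] if answer else node["no"]
    return node["leaf"]

def predict(model, passengers):
    """Two inputs, as with the Predictions widget: a model and the data to predict."""
    return [predict_one(model, p) for p in passengers]

new_passengers = [
    {"sex": "female", "age": 30},
    {"sex": "male", "age": 8},
    {"sex": "male", "age": 40},
]
predictions = predict(tree, new_passengers)
print(predictions)  # ['survived', 'survived', 'died']
```

Each prediction can be traced back to the sequence of questions that produced it, which is what makes a tree explainable.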

Finding patterns — visual intuition

NOTE: Objective: understand intuitively what a 'pattern' (a recurring regularity) is in data, how a machine can detect such patterns, and why this detection then enables predictions on new cases.

Learning objectives

TIP: By the end of this module, you will be able to:

Define what a pattern is in ML
Visualize a pattern in a scatter plot
Understand the notion of decision boundary
Distinguish a simple (linear) pattern from a complex (non-linear) pattern
Grasp the link between detected pattern and generalization

What is a pattern?

A pattern is a statistical regularity in the data. This is what the machine tries to detect in order to make predictions.

NOTE: What is at stake: if the model finds a real pattern (one that holds in reality), it can reuse it on new data. This is called generalization: applying what has been learned to unseen cases.

Visualization: a scatter plot and its boundary

The simplest way to visualize a pattern is a scatter plot of two features. Imagine a flower dataset with 2 features (petal length, petal width) and 2 species (A and B).

TIP: This is the essence of supervised ML: finding a boundary (or a function) that correctly separates or predicts the observed examples, in the hope that it will also work on future examples.
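A decision boundary for the flower example can be written as a single line equation. The measurements and the boundary (petal length + petal width = 4.0) in this sketch are hypothetical values picked for illustration; a real algorithm would learn the weights from the data instead of having them typed in.

```python
# Hypothetical measurements: (petal_length_cm, petal_width_cm, species)
flowers = [
    (1.4, 0.2, "A"), (1.5, 0.3, "A"), (1.3, 0.2, "A"), (1.7, 0.4, "A"),
    (4.5, 1.5, "B"), (4.7, 1.4, "B"), (5.1, 1.8, "B"), (4.0, 1.3, "B"),
]

# Assumed boundary: the straight line  petal_length + petal_width = 4.0
W_LENGTH, W_WIDTH, BIAS = 1.0, 1.0, -4.0

def classify(petal_length, petal_width):
    """Points on one side of the line are species B, the others species A."""
    score = W_LENGTH * petal_length + W_WIDTH * petal_width + BIAS
    return "B" if score > 0 else "A"

training_accuracy = sum(
    classify(length, width) == species for length, width, species in flowers
) / len(flowers)

new_flower = classify(4.8, 1.6)  # a flower the "model" has never seen
print(training_accuracy, new_flower)  # 1.0 B
```

Classifying `new_flower` is generalization in miniature: the boundary found on known examples is reused on an unseen one.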

Linear vs non-linear patterns

Not all patterns have the same complexity.

Linear pattern

The boundary is a straight line (or a plane in 3D, a hyperplane in N dimensions).

Example: "the higher the sugar intake, the higher the diabetes risk" (direct relationship).

Suitable algorithms: linear regression, logistic regression, linear SVM.

Non-linear pattern

The boundary is curved, spiral-shaped, or has complex forms.

Example: "cancer risk increases with age, but also depends on complex combinations (genetics, lifestyle)".

Suitable algorithms: decision trees, random forests, neural networks, XGBoost.

WARNING: Classic pitfall: using a linear model on a non-linear problem leads to underfitting (the model is too simple). Conversely, using a very complex model on a simple problem leads to overfitting (the model learns noise). We will cover this in detail in chapter 06.
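The smallest textbook case of a non-linear pattern is XOR: four points that no straight line separates. The sketch below, an illustration rather than a proof, tries every linear boundary on a small grid of weights (the grid itself is an arbitrary assumption) and compares the best one with a two-question rule of the kind a decision tree can express.

```python
from itertools import product

# XOR-style data: the label is 1 when exactly one of the two features is 1.
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def accuracy(predict):
    return sum(predict(x) == y for x, y in points) / len(points)

# Linear candidates: predict 1 when w1*x1 + w2*x2 + b > 0.
grid = [v / 2 for v in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
best_linear = max(
    accuracy(lambda x, w1=w1, w2=w2, b=b: int(w1 * x[0] + w2 * x[1] + b > 0))
    for w1, w2, b in product(grid, repeat=3)
)

# Non-linear rule: two nested questions, as in a small decision tree.
def tree_like(x):
    if x[0] == 0:
        return 1 if x[1] == 1 else 0
    return 1 if x[1] == 0 else 0

print(best_linear)          # 0.75
print(accuracy(tree_like))  # 1.0
```

However the weights are chosen, the linear model gets at least one point wrong: that is underfitting caused by the shape of the model, not by a lack of data.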

The pattern is not the ultimate rule: just an approximation

Important: an ML pattern is never an absolute rule. It is a statistical tendency. The model gives probabilities, not certainties.

Detected pattern	Cases where it works	Cases where it fails
"Email containing 'won 1M€' = spam"	95% of cases	Official lottery actually won
"Young + low balance = churn"	70% of cases	Student who stays a customer for 30 years
"Red round pixels = apple"	80% of cases	Tomato, strawberry, ball

This is why every ML model is evaluated on metrics (precision, recall, etc.). We do not seek perfection but the best possible performance — knowing there will always be errors.
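Precision and recall can be computed by hand from the true labels and the predictions. The spam-filter results below are made-up values for the example, and `precision_recall` is a helper written for this sketch.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical spam filter results: 1 = spam, 0 = legitimate email.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```

Here the filter flags 4 emails, 3 of them rightly (precision 0.75), and catches 3 of the 4 real spam messages (recall 0.75): two different kinds of error, measured separately.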

Why dimensionality changes everything: the curse of dimensionality

When you have 2 features, you can draw a 2D plot and see the patterns. With 3 features, it is still possible (3D). But in practice, datasets often have 10, 100, sometimes 1000 features, and direct visualization becomes impossible.
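Beyond the loss of visualization, one commonly cited effect of high dimensionality is that distances between random points become nearly identical, which weakens any notion of "close" and "far". The sketch below is an illustration under assumptions (uniform random points, 50 points, fixed seed, a hypothetical helper `distance_contrast`): it compares the nearest-to-farthest distance ratio in 2 and in 1000 dimensions.

```python
import math
import random

def distance_contrast(n_dims, n_points=50, seed=0):
    """Ratio nearest / farthest distance from the first point to all the others.

    A ratio near 0 means some points are much closer than others;
    a ratio near 1 means every point is at almost the same distance.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    reference, others = points[0], points[1:]
    distances = [math.dist(reference, p) for p in others]
    return min(distances) / max(distances)

contrast_2d = distance_contrast(2)
contrast_1000d = distance_contrast(1000)
print(f"2 features:    {contrast_2d:.2f}")
print(f"1000 features: {contrast_1000d:.2f}")
```

In 2D the nearest neighbor is far closer than the farthest point; with 1000 features the two distances are almost the same, so methods that rely on neighborhoods need many more examples or fewer, better-chosen features.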

go-further

This article covers the most useful excerpts — the complete Machine Learning for Beginners course (11 chapters, 44 lessons, corrected exercises and final project) takes you all the way.


FAQ

How long does it take to learn Machine Learning for Beginners?

With a structured progression (11 chapters, 44 short and practical lessons), you reach an operational level in a few weeks at 30 to 60 minutes per day. The key is to practice each concept immediately.

Are there any prerequisites?

No prerequisites: the course starts from zero; every concept is introduced before being used.

Where to start concretely?

Reproduce the commands in this article, then follow the full Machine Learning for Beginners course: it chains the 44 lessons in order, with exercises and a final project.

