Python Matplotlib Seaborn Explained Simply (with Diagrams and Real Code)

Python Matplotlib Seaborn: The Essentials in One Article — Real Code, Diagrams and Concrete Steps, Excerpts from a 37-Lesson Course.

REHOUMA Haythem

12 Jun 2026 • 14 min read

A no-nonsense guide: Python Matplotlib Seaborn broken down with diagrams, concrete examples and tested commands. Everything comes from a structured 11-chapter course — here are the highlights.

tl;dr

Introduction and Installation
Matplotlib Basics
Essential Matplotlib Charts
Customization and Styles
Subplots and Complex Figures

~$ cat ./parcours.md # Python Matplotlib Seaborn — 10 chapters

Introduction and Installation

→ Course presentation
→ Install Python, Anaconda, Jupyter and the libraries
+ 1 more lesson

Matplotlib Basics

→ Anatomy of a Matplotlib figure
→ Pyplot vs object-oriented API
+ 1 more lesson

Essential Matplotlib Charts

→ Line charts
→ Bar charts
+ 2 more lessons

Customization and Styles

→ Colors, markers and line styles
→ Titles, legends, annotations and text
+ 1 more lesson

Subplots and Complex Figures

→ Subplots with plt.subplots()
→ GridSpec for asymmetric layouts
+ 1 more lesson

Introduction to Seaborn

→ What is Seaborn and how does it differ from Matplotlib?
→ Install Seaborn and load the built-in datasets
+ 1 more lesson

Statistical Visualizations with Seaborn

→ Distributions with histplot, kdeplot and displot
→ Boxplot, violinplot and stripplot
+ 2 more lessons

Multivariate Visualizations with Seaborn

→ pairplot and correlation matrices
→ heatmap (heat maps)
+ 1 more lesson

🏁

Final project (+ 2 chapters along the way)

→ You leave with a concrete and demonstrable project

Exploration and Initial Visualizations (EDA)

NOTE (Objective): Perform exploratory data analysis (EDA) on the sales dataset to identify the key insights to highlight in the final dashboard.

Why perform EDA before building the dashboard?

TIP (Golden rule): Never design a dashboard without first understanding the data. EDA reveals the patterns, outliers and insights that the dashboard must emphasize.

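In this lesson we will create 5 exploratory charts to understand the basket distribution, the monthly revenue trend, the performance by category, the performance by store, and the correlations between numeric variables.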

Common setup

output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(
    style="whitegrid",
    context="notebook",
    palette="viridis",
    font_scale=1.05,
)

df = pd.read_csv("ventes_2024.csv", parse_dates=["date"])
df.info()  # info() prints its summary itself and returns None, so no print() is needed
print(df.describe())
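The charts that follow group by a "mois" column and correlate a "trimestre" column. If the CSV only provides the parsed "date" column, both can be derived from it. The sketch below is a minimal illustration under that assumption; the three-row DataFrame is made up and only stands in for ventes_2024.csv.

```python
import pandas as pd

# Hypothetical sample standing in for ventes_2024.csv (values are made up).
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-15", "2024-04-02", "2024-11-29"]),
    "vente_eur": [120.0, 80.5, 310.0],
})

# Derive the calendar month (1-12) and the quarter (1-4) from the dates.
df["mois"] = df["date"].dt.month
df["trimestre"] = df["date"].dt.quarter

print(df)
```

If the file already contains these columns, this step is unnecessary.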

Chart 1: Average basket distribution

output

fig, ax = plt.subplots(figsize=(10, 5))
sns.histplot(data=df, x="vente_eur", kde=True, bins=40, color="#7c3aed", ax=ax)

mean_vente = df["vente_eur"].mean()
median_vente = df["vente_eur"].median()
ax.axvline(mean_vente, color="red", linestyle="--", linewidth=2, label=f"Mean: {mean_vente:.0f} EUR")
ax.axvline(median_vente, color="orange", linestyle="--", linewidth=2, label=f"Median: {median_vente:.0f} EUR")

ax.set_title("Average basket distribution", fontweight="bold")
ax.set_xlabel("Order amount (EUR)")
ax.set_ylabel("Number of transactions")
ax.legend()
sns.despine()
plt.tight_layout()
plt.show()

TIP (EDA Insight #1): The distribution is right-skewed. The mean is pulled upward by a few very large orders, while the median better reflects typical behavior.
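The gap between mean and median can also be checked numerically rather than read off the chart. The following sketch uses ten made-up order amounts (not the course dataset) to show how two large orders move the mean but not the median, and how a positive skewness confirms the right skew.

```python
import pandas as pd

# Made-up order amounts: mostly small baskets plus two very large orders.
ventes = pd.Series([20, 25, 30, 30, 35, 40, 45, 50, 400, 900], dtype=float)

mean_vente = ventes.mean()      # 157.5, pulled upward by the two large orders
median_vente = ventes.median()  # 37.5, the middle of the sorted values
skewness = ventes.skew()        # positive for a right-skewed distribution

print(f"mean={mean_vente:.1f}  median={median_vente:.1f}  skew={skewness:.2f}")
```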

Chart 2: Monthly revenue trend

output

monthly = df.groupby("mois").agg(
    ca=("vente_eur", "sum"),
    nb_cmd=("vente_eur", "count"),
).reset_index()

fig, ax = plt.subplots(figsize=(11, 5))
sns.lineplot(data=monthly, x="mois", y="ca", marker="o", linewidth=2.5, color="#7c3aed", ax=ax)
ax.fill_between(monthly["mois"], monthly["ca"], alpha=0.15, color="#7c3aed")

best_month = monthly.loc[monthly["ca"].idxmax()]
ax.annotate(
    f"Peak: {best_month['ca']:,.0f} EUR",
    xy=(best_month["mois"], best_month["ca"]),
    xytext=(best_month["mois"], best_month["ca"] + 1500),
    ha="center", fontsize=11, fontweight="bold", color="red",
    arrowprops=dict(arrowstyle="->", color="red")
)

ax.set_title("Monthly revenue evolution 2024", fontweight="bold")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue (EUR)")
ax.set_xticks(range(1, 13))
sns.despine()
plt.tight_layout()
plt.show()

TIP (EDA Insight #2): Activity peaks during the festive season (November/December), with relative troughs in February and March.
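Peaks and troughs can be located programmatically with idxmax() and idxmin() instead of being read from the curve. This is a small sketch with made-up monthly revenue figures; the column names "mois" and "ca" follow the chart code, the values do not come from the course dataset.

```python
import pandas as pd

# Made-up monthly revenue (EUR), for illustration only.
monthly = pd.DataFrame({
    "mois": list(range(1, 13)),
    "ca": [42_000, 35_000, 36_500, 40_000, 43_000, 45_500,
           44_000, 41_000, 46_000, 48_000, 61_000, 67_500],
})

peak = monthly.loc[monthly["ca"].idxmax()]    # row with the highest revenue
trough = monthly.loc[monthly["ca"].idxmin()]  # row with the lowest revenue

print(f"Peak:   month {int(peak['mois'])} with {int(peak['ca']):,} EUR")
print(f"Trough: month {int(trough['mois'])} with {int(trough['ca']):,} EUR")
```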

Chart 3: Performance by category

output

cat_perf = df.groupby("categorie").agg(
    ca=("vente_eur", "sum"),
    marge=("marge_eur", "sum"),
).reset_index().sort_values("ca", ascending=True)

fig, ax = plt.subplots(figsize=(10, 5))
y_pos = range(len(cat_perf))
ax.barh(y_pos, cat_perf["ca"], color="#a78bfa", label="Revenue", alpha=0.8)
ax.barh(y_pos, cat_perf["marge"], color="#7c3aed", label="Margin", alpha=0.9)

# Label each bar with its revenue, slightly to the right of the bar end
for i, ca in enumerate(cat_perf["ca"]):
    ax.text(ca + 2000, i, f"{ca:,.0f} EUR", va="center", fontsize=10, fontweight="bold")

ax.set_yticks(y_pos)
ax.set_yticklabels(cat_perf["categorie"])
ax.set_title("Revenue and margins by product category", fontweight="bold")
ax.set_xlabel("Amount (EUR)")
ax.legend(loc="lower right")
sns.despine()
plt.tight_layout()
plt.show()

TIP (EDA Insight #3): Electronics and clothing generate the most revenue, but their margin-to-revenue ratios differ. Sport, despite lower revenue, can deliver a better relative margin.
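The margin-to-revenue ratio mentioned in this insight is a single division per category. Here is a minimal sketch; the category names and amounts are invented for the example, and "taux_marge" is a hypothetical column name, not one from the course dataset.

```python
import pandas as pd

# Invented totals per category (EUR), for illustration only.
cat_perf = pd.DataFrame({
    "categorie": ["Electronique", "Vetements", "Sport"],
    "ca": [100_000.0, 80_000.0, 40_000.0],
    "marge": [20_000.0, 24_000.0, 16_000.0],
})

# Relative margin: share of revenue that is kept as margin.
cat_perf["taux_marge"] = cat_perf["marge"] / cat_perf["ca"]

ranked = cat_perf.sort_values("taux_marge", ascending=False)
print(ranked[["categorie", "ca", "taux_marge"]])
```

In this invented example the category with the lowest revenue ranks first by relative margin, which is the kind of pattern the overlaid bars can hide.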

Chart 4: Store performance (boxplot)

output

fig, ax = plt.subplots(figsize=(11, 5))
order = df.groupby("magasin")["vente_eur"].median().sort_values(ascending=False).index

sns.boxplot(data=df, x="magasin", y="vente_eur", order=order,
            palette="viridis", hue="magasin", legend=False, ax=ax)
sns.stripplot(data=df, x="magasin", y="vente_eur", order=order,
              color="black", alpha=0.15, size=2, ax=ax)

ax.set_title("Basket distribution by store (sorted by median)", fontweight="bold")
ax.set_xlabel("Store")
ax.set_ylabel("Basket (EUR)")
sns.despine()
plt.tight_layout()
plt.show()

TIP (EDA Insight #4): Paris and Lyon stores show higher median baskets. Bordeaux exhibits more high-end outliers (occasional large orders).
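The points drawn beyond the whiskers of a boxplot follow, by default, the 1.5 × IQR rule, so the number of high-end outliers per store can be counted directly. The sketch below assumes that default rule; the helper name count_high_outliers and the twelve sample rows are hypothetical.

```python
import pandas as pd

# Invented baskets for two stores; one unusually large order in Bordeaux.
df = pd.DataFrame({
    "magasin": ["Paris"] * 6 + ["Bordeaux"] * 6,
    "vente_eur": [50, 55, 60, 62, 65, 70, 30, 32, 35, 36, 38, 400],
})

def count_high_outliers(s: pd.Series) -> int:
    """Count values above Q3 + 1.5 * IQR (the default upper whisker limit)."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    upper = q3 + 1.5 * (q3 - q1)
    return int((s > upper).sum())

outliers = df.groupby("magasin")["vente_eur"].apply(count_high_outliers)
print(outliers)
```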

Chart 5: Numeric correlations (heatmap)

output

numeric_cols = ["vente_eur", "marge_eur", "nb_articles", "mois", "trimestre"]
corr = df[numeric_cols].corr()

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="vlag", center=0,
            square=True, linewidths=0.5, cbar_kws={"shrink": 0.7}, ax=ax)
ax.set_title("Correlation matrix of numeric variables", fontweight="bold")
plt.tight_layout()
plt.show()
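A correlation matrix is symmetric, so every pair appears twice on the heatmap. To list each pair once, ranked by strength, one option is to read only the upper triangle. This is a sketch on a made-up five-row DataFrame; the "a / b" label format is an arbitrary choice for the example.

```python
import numpy as np
import pandas as pd

# Made-up data: marge_eur is exactly proportional to vente_eur.
df = pd.DataFrame({
    "vente_eur": [10.0, 20.0, 30.0, 40.0, 50.0],
    "marge_eur": [2.0, 4.0, 6.0, 8.0, 10.0],
    "nb_articles": [1, 3, 2, 5, 4],
})
corr = df.corr()

# Indices of the strictly upper triangle (k=1 skips the diagonal of 1.0s).
rows, cols = np.triu_indices(len(corr), k=1)
pairs = pd.Series(
    corr.to_numpy()[rows, cols],
    index=[f"{corr.index[r]} / {corr.columns[c]}" for r, c in zip(rows, cols)],
)

# Strongest relationships first, regardless of sign.
pairs = pairs.sort_values(key=abs, ascending=False)
print(pairs)
```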

First visualization script

NOTE (Objective): Create your very first Matplotlib chart by plotting the function y = sin(x) between 0 and 2π. This exercise validates your installation and gives you a concrete foundation for the rest of the course.

Learning objectives

TIP (By the end of this module): You will be able to write a complete Python script that generates data with NumPy, visualizes it with Matplotlib, and saves the result as an image. You will also understand why we always start with import numpy as np and import matplotlib.pyplot as plt.

The complete script: 10 lines for your first chart

Here is the script we will dissect together. Copy it into a new Jupyter notebook:

output

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

plt.plot(x, y)
plt.title("Ma premiere fonction sinus")
plt.xlabel("x (radians)")
plt.ylabel("sin(x)")
plt.grid(True)
plt.show()

Run the cell (Shift + Enter). You should see a beautiful sinusoidal curve oscillating between −1 and +1.

TIP: Well done! You have just created your first Python chart. Now let's understand what happens line by line.

Line-by-line breakdown

Lines 1–2: the imports

output

import numpy as np
import matplotlib.pyplot as plt

Two universal conventions: NumPy is imported as np, and matplotlib.pyplot is imported as plt.

WARNING (Golden rule): Always write import matplotlib.pyplot as plt, never import matplotlib as plt. The plotting functions such as plot() and title() live in the pyplot submodule, not in the top-level matplotlib package.
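A quick way to see why the wrong import fails is to check where plot() is actually defined. This short sketch only inspects the two modules and draws nothing.

```python
import matplotlib
import matplotlib.pyplot as plt

# The top-level package does not expose the plotting functions...
print(hasattr(matplotlib, "plot"))  # False

# ...they are defined in the pyplot submodule.
print(hasattr(plt, "plot"))         # True
```

With import matplotlib as plt, a later call to plt.plot(...) would therefore raise an AttributeError.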

Line 3: generate the x-axis

output

x = np.linspace(0, 2 * np.pi, 100)

np.linspace(start, stop, n) returns an array of n evenly spaced values from start to stop, both endpoints included. Here: 100 points between 0 and 2π (≈ 6.28).

Result: x = [0.0, 0.063, 0.127, 0.190, ..., 6.283]
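The spacing between points follows from the fact that 100 points delimit 99 intervals. A short check, using only the values from the script:

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

step = x[1] - x[0]                # distance between two consecutive points
expected_step = 2 * np.pi / 99    # 100 points delimit 99 intervals

print(len(x))                     # 100
print(x[0], x[-1])                # first and last value: 0.0 and 2*pi
print(np.isclose(step, expected_step))
```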

Line 4: compute y = sin(x)

output

y = np.sin(x)

NumPy applies sin() to every element of the array x in a single call. This is vectorization: there is no explicit Python loop, because the iteration runs in compiled code, which is much faster on large arrays.

Result: y = [0.0, 0.063, 0.127, ..., -0.0]
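To make the "no loop" point concrete, the sketch below computes the same values twice, once with an explicit Python loop over math.sin and once with the vectorized np.sin, then checks that both agree.

```python
import math
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# Explicit loop: one Python-level call to math.sin per element.
y_loop = np.array([math.sin(value) for value in x])

# Vectorized: a single call that processes the whole array.
y_vectorized = np.sin(x)

print(np.allclose(y_loop, y_vectorized))
```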

Line 5: plot the curve

output

plt.plot(x, y)

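plt.plot(x, y) draws straight segments connecting each point (x[i], y[i]) to the next one. With 100 points the segments are short enough for the curve to look smooth. It is one of the most frequently used functions in Matplotlib.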
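The same curve can also be drawn through the object-oriented interface (fig, ax = plt.subplots()) that the other charts in this article use. The following is a sketch of that equivalent form; the title text is an arbitrary choice for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

fig, ax = plt.subplots()
(line,) = ax.plot(x, y)      # ax.plot returns a list of Line2D objects
ax.set_title("sin(x)")       # equivalent of plt.title(...)
ax.set_xlabel("x (radians)") # equivalent of plt.xlabel(...)
ax.grid(True)

print(len(line.get_xdata())) # number of points stored in the line
```

In a notebook the figure renders at the end of the cell; in a plain script, add plt.show() as the last line.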

Lines 6–9: title, labels and grid

output

plt.title("Ma premiere fonction sinus")
plt.xlabel("x (radians)")
plt.ylabel("sin(x)")
plt.grid(True)
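The learning objectives mention saving the result as an image, a step the script itself does not include. A minimal sketch using plt.savefig() follows; the file name "sinus.png" and the dpi value are arbitrary choices for the example. When both are used, savefig() should be called before plt.show().

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

plt.plot(x, y)
plt.title("sin(x)")
plt.grid(True)

# Write the current figure to disk; the format is inferred from the extension.
plt.savefig("sinus.png", dpi=150, bbox_inches="tight")
plt.close()  # release the figure once it has been saved
```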

First line chart with real data

NOTE (Objective): Move from synthetic data (sine, cosine) to real data. Load a CSV file with Pandas, understand its structure, and create a professional line chart.

Learning objectives

TIP (By the end of this module): You will be able to load a CSV, manipulate it with Pandas, plot one or more time series, and export the result cleanly.

Our dataset: fictional monthly sales

For this exercise we create a small DataFrame with 12 months of sales for 3 different products:

output

import pandas as pd
import matplotlib.pyplot as plt

data = {
    "Mois": ["Jan", "Fev", "Mar", "Avr", "Mai", "Juin",
             "Juil", "Aout", "Sep", "Oct", "Nov", "Dec"],
    "Produit A": [120, 135, 148, 160, 175, 210,
                   250, 245, 200, 170, 140, 290],
    "Produit B": [80, 85, 90, 95, 100, 110,
                   130, 135, 120, 100, 90, 180],
    "Produit C": [50, 55, 60, 70, 85, 100,
                   120, 115, 95, 75, 60, 150],
}
df = pd.DataFrame(data)
print(df.head())

Displayed result:

output

  Mois  Produit A  Produit B  Produit C
0  Jan        120         80         50
1  Fev        135         85         55
2  Mar        148         90         60
3  Avr        160         95         70
4  Mai        175        100         85

TIP (Why a DataFrame?): A Pandas DataFrame is a structured table (named columns, indexed rows). Its columns can be passed directly to Matplotlib, and Seaborn functions accept the whole DataFrame through their data= parameter.

Plot a single column

output

fig, ax = plt.subplots(figsize=(10, 5))

ax.plot(df["Mois"], df["Produit A"], color="purple", linewidth=2)
ax.set_title("Ventes mensuelles du Produit A", fontsize=14)
ax.set_xlabel("Mois")
ax.set_ylabel("Unites vendues")
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

You obtain a clean chart showing sales seasonality (peak in July, surge in December).

Plot multiple series on the same chart

output

fig, ax = plt.subplots(figsize=(12, 6))

ax.plot(df["Mois"], df["Produit A"], label="Produit A", linewidth=2, marker="o")
ax.plot(df["Mois"], df["Produit B"], label="Produit B", linewidth=2, marker="s")
ax.plot(df["Mois"], df["Produit C"], label="Produit C", linewidth=2, marker="^")

ax.set_title("Ventes mensuelles par produit", fontsize=14, pad=15)
ax.set_xlabel("Mois", fontsize=12)
ax.set_ylabel("Unites vendues", fontsize=12)
ax.legend(loc="upper left", fontsize=11)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

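Three new techniques applied: several ax.plot() calls on the same Axes, a label= per series displayed by ax.legend(), and a distinct marker= per series.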

Available markers

Code	Marker	Code	Marker
`"o"`	Circle	`"s"`	Square
`"^"`	Triangle up	`"v"`	Triangle down
`"<"`	Triangle left	`">"`	Triangle right
`"D"`	Diamond	`"d"`	Thin diamond
`"*"`	Star	`"+"`	Plus
`"x"`	Cross	`"."`	Point
`"P"`	Filled plus	`"X"`	Filled cross
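When the number of series grows, the marker codes from this table can be assigned in a loop rather than with one ax.plot() call per line. The sketch below reuses the column names of the sales DataFrame with a shortened, three-month sample.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Shortened sample of the monthly sales table (first three months only).
df = pd.DataFrame({
    "Mois": ["Jan", "Fev", "Mar"],
    "Produit A": [120, 135, 148],
    "Produit B": [80, 85, 90],
    "Produit C": [50, 55, 60],
})

products = ["Produit A", "Produit B", "Produit C"]
markers = ["o", "s", "^"]  # circle, square, triangle up

fig, ax = plt.subplots(figsize=(8, 4))
for product, marker in zip(products, markers):
    ax.plot(df["Mois"], df[product], label=product, marker=marker, linewidth=2)

ax.legend(loc="upper left")
ax.grid(True, alpha=0.3)
```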

Load a real CSV file

In real life your data lives in a .csv file. Here is how to load it:

output

# If the file is in the same folder
df = pd.read_csv("ventes.csv")

# If the file is on the web
url = "https://raw.githubusercontent.com/exemple/data/main/ventes.csv"
df = pd.read_csv(url)

# With a different separator (semicolon)
df = pd.read_csv("ventes.csv", sep=";")

# With automatic date parsing
df = pd.read_csv("ventes.csv", parse_dates=["date"])

print(df.head())
print(df.dtypes)

NOTE (Debug tip): Always check df.dtypes after loading. If a column expected to be numeric appears as object, you probably have French decimal commas or empty cells to clean.
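For the decimal-comma case, read_csv() has a decimal parameter that avoids any manual cleaning. The sketch below loads the same two-row, semicolon-separated text twice, without and with decimal=","; the inline data is made up and read through io.StringIO so that no file is required.

```python
import io
import pandas as pd

# Made-up CSV content in French format: ';' separator and ',' decimal mark.
raw = "date;vente_eur\n2024-01-15;12,50\n2024-01-16;7,25\n"

# Without decimal=",", the amounts are read as text, not as numbers.
df_raw = pd.read_csv(io.StringIO(raw), sep=";")

# With decimal=",", the amounts are parsed as floats.
df_ok = pd.read_csv(io.StringIO(raw), sep=";", decimal=",", parse_dates=["date"])

print(df_raw.dtypes)
print(df_ok.dtypes)
print(df_ok["vente_eur"].sum())  # 19.75
```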

Pandas + Matplotlib shortcut: built-in wrapper

Pandas ships with its own Matplotlib wrapper. You can plot directly from a DataFrame:

output

df.set_index("Mois").plot(figsize=(10, 5), marker="o")
plt.title("Ventes par produit (style Pandas)")
plt.ylabel("Unites")
plt.grid(True, alpha=0.3)
plt.show()
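DataFrame.plot() returns a Matplotlib Axes object, so the shortcut can be combined with the object-oriented methods used in the other charts. A short sketch on a reduced sample of the sales table:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Reduced sample of the monthly sales table (two products, three months).
df = pd.DataFrame({
    "Mois": ["Jan", "Fev", "Mar"],
    "Produit A": [120, 135, 148],
    "Produit B": [80, 85, 90],
})

# The returned Axes can be customized like any other Axes.
ax = df.set_index("Mois").plot(figsize=(8, 4), marker="o")
ax.set_title("Ventes par produit (style Pandas)")
ax.set_ylabel("Unites")
ax.grid(True, alpha=0.3)
```

Each numeric column becomes one line, and the column names are used as legend labels.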

Go further

This article covers the most useful excerpts — the complete Python Matplotlib Seaborn course (11 chapters, 37 lessons, corrected exercises and final project) takes you all the way.

FAQ

How long does it take to learn Python Matplotlib Seaborn?

With a structured progression (11 chapters, 37 short practical lessons), you reach an operational level in a few weeks at 30–60 minutes per day. The key is to practice each concept immediately.

Are there any prerequisites?

Basic computer literacy is enough. If you can use a terminal and read simple code, you are ready.

Where should I start concretely?

Reproduce the commands in this article, then follow the complete Python Matplotlib Seaborn course: it sequences the 37 lessons in order, with exercises and a final project.
