Dive into ML Infrastructure with Kubernetes: Your First Concrete Step Today

ML Infrastructure Kubernetes: The Essentials in One Article — Real Code, Diagrams and Concrete Steps, Excerpts from a 41-Lesson Course.

REHOUMA Haythem

12 Jun 2026 • 12 min read

The best way to learn ML infrastructure on Kubernetes is by doing. This article gives you a head start with practical excerpts from a 41-lesson course, enough to get your first result today.

tl;dr

Install the Kubernetes environment
Discover Kubernetes
Essential Kubernetes Objects
YAML Files and Configuration
Deploy an ML API with Flask

~$ cat ./parcours.md # ML Infrastructure Kubernetes — 11 chapters

Install the Kubernetes environment

→ Download Docker Desktop and kubectl
→ Install Minikube and create a cluster
+ 1 more lesson

Discover Kubernetes

→ What is Kubernetes and why use it?
→ Kubernetes Architecture (Master and Workers)
+ 1 more lesson

Essential Kubernetes Objects

→ Pods, ReplicaSets and Deployments
→ Services and Networking
+ 1 more lesson

YAML Files and Configuration

→ Anatomy of a Kubernetes YAML file
→ ConfigMaps and Secrets
+ 1 more lesson

Deploy an ML API with Flask

→ Flask Recap and creating an ML API
→ Dockerfile and Kubernetes Deployment
+ 1 more lesson

Deploy an ML API with FastAPI

→ FastAPI Recap and creating a prediction API
→ FastAPI Deployment and HPA
+ 1 more lesson

Storage and Persistence

→ What are Kubernetes Volumes?
→ PersistentVolumes and PersistentVolumeClaims
+ 1 more lesson

Helm and Package Management

→ Why Helm? The Kubernetes package manager
→ Install Charts and customize
+ 2 more lessons

🏁

Final project (+ 3 chapters along the way)

→ You leave with a concrete and demonstrable project

Final Project – Complete Step-by-Step Guide

Guide • 5 parts • Backend • Frontend • Helm • Monitoring • CI/CD

NOTE (Overview): This guide walks you through the final project step by step. Follow each part in order to build a complete ML platform on Kubernetes.

Part 1: Backend API (FastAPI + ML Model)

1.1 Initialize the project

bash

mkdir -p ml-prediction-platform/{backend/{app,train,tests},frontend,helm,k8s/{security,monitoring},.github/workflows,docs}
cd ml-prediction-platform
git init

1.2 Train the model

python

# backend/train/train_model.py
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the Iris dataset and hold out 20% of the samples for evaluation
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy: {accuracy:.4f}")

# Bundle the model with its metadata so the API can load everything from one file.
# Names and accuracy are stored as plain Python types to keep them JSON-friendly.
model_info = {
    "model": model,
    "feature_names": list(iris.feature_names),
    "target_names": iris.target_names.tolist(),
    "accuracy": float(accuracy),
    "version": "1.0.0",
}
joblib.dump(model_info, "model.pkl")

bash

cd backend/train
pip install scikit-learn joblib numpy
python train_model.py

1.3 Create the Pydantic schemas

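python

# backend/app/schemas.py
from typing import Dict, List

from pydantic import BaseModel, ConfigDict, Field

class PredictionRequest(BaseModel):
    # Pydantic v2 style configuration (replaces the nested `class Config`)
    model_config = ConfigDict(
        json_schema_extra={"example": {"features": [5.1, 3.5, 1.4, 0.2]}}
    )
    # Exactly four values: sepal length, sepal width, petal length, petal width
    features: List[float] = Field(..., min_length=4, max_length=4)

class PredictionResponse(BaseModel):
    # Some Pydantic v2 releases warn about field names starting with "model_";
    # clearing the protected namespaces avoids that warning
    model_config = ConfigDict(protected_namespaces=())
    prediction: str
    prediction_id: int
    confidence: float
    probabilities: Dict[str, float]
    model_version: str

class HealthResponse(BaseModel):
    model_config = ConfigDict(protected_namespaces=())
    status: str
    model_loaded: bool
    version: str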

1.4 Create the model loading module

python

# backend/app/model.py
import logging
import os
from typing import Optional

import joblib
import numpy as np

logger = logging.getLogger(__name__)

class MLModel:
    def __init__(self):
        self.model = None
        self.feature_names = None
        self.target_names = None
        self.version = None
        self.loaded = False

    def load(self, path: Optional[str] = None):
        # Fall back to MODEL_PATH, then to the default location inside the container
        path = path or os.getenv("MODEL_PATH", "/models/model.pkl")
        info = joblib.load(path)
        self.model = info["model"]
        self.feature_names = info["feature_names"]
        self.target_names = info["target_names"]
        self.version = info["version"]
        self.loaded = True
        logger.info("Model v%s loaded from %s", self.version, path)

    def predict(self, features: list) -> dict:
        # scikit-learn expects a 2D array: one row per sample
        X = np.array(features).reshape(1, -1)
        pred = int(self.model.predict(X)[0])
        proba = self.model.predict_proba(X)[0]
        return {
            "prediction": str(self.target_names[pred]),
            "prediction_id": pred,
            "confidence": float(max(proba)),
            "probabilities": {
                str(n): float(p) for n, p in zip(self.target_names, proba)
            },
        }

# Single shared instance imported by the FastAPI app
ml_model = MLModel()
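
The `predict` method above turns raw class probabilities into the response payload. A minimal sketch of that step in plain Python, without scikit-learn, might look like the following. `format_prediction` is a hypothetical helper name, and the probability values are made up for illustration.

```python
from typing import Dict, Sequence


def format_prediction(target_names: Sequence[str], proba: Sequence[float]) -> Dict[str, object]:
    """Build a response payload from class names and their probabilities.

    Hypothetical helper: it mirrors the dictionary returned by MLModel.predict,
    but takes the probabilities directly instead of calling a model.
    """
    if len(target_names) != len(proba):
        raise ValueError("target_names and proba must have the same length")
    # Index of the most probable class
    best = max(range(len(proba)), key=lambda i: proba[i])
    return {
        "prediction": str(target_names[best]),
        "prediction_id": best,
        "confidence": float(proba[best]),
        "probabilities": {str(n): float(p) for n, p in zip(target_names, proba)},
    }


# Illustrative probabilities for the three Iris classes (not real model output)
result = format_prediction(["setosa", "versicolor", "virginica"], [0.9, 0.08, 0.02])
print(result["prediction"], result["confidence"])
```

Keeping this logic free of framework code makes it easy to unit test before the model and the API are wired together.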

1.5 Create the FastAPI application

python

# backend/app/main.py
import logging
import os
import time
from contextlib import asynccontextmanager

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest
from starlette.responses import Response

from .model import ml_model
from .schemas import PredictionRequest, PredictionResponse, HealthResponse

logging.basicConfig(level=os.getenv("LOG_LEVEL", "INFO"))

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the model once at startup (replaces the deprecated @app.on_event("startup"))
    ml_model.load()
    yield

app = FastAPI(title="ML Prediction API", version="1.0.0", lifespan=lifespan)
# Wide-open CORS is convenient for a demo; restrict allow_origins in production
app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"])

PREDICTIONS = Counter("predictions_total", "Total predictions", ["status"])
LATENCY = Histogram("prediction_latency_seconds", "Prediction latency")

@app.get("/health", response_model=HealthResponse)
async def health():
    return HealthResponse(
        status="healthy" if ml_model.loaded else "unhealthy",
        model_loaded=ml_model.loaded,
        version=ml_model.version or "unknown"
    )

@app.post("/predict", response_model=PredictionResponse)
async def predict(req: PredictionRequest):
    # perf_counter is a monotonic clock, better suited to measuring durations
    start = time.perf_counter()
    try:
        result = ml_model.predict(req.features)
        PREDICTIONS.labels(status="success").inc()
        LATENCY.observe(time.perf_counter() - start)
        return PredictionResponse(model_version=ml_model.version, **result)
    except Exception as e:
        PREDICTIONS.labels(status="error").inc()
        raise HTTPException(status_code=500, detail=str(e)) from e

@app.get("/metrics")
async def metrics():
    # Use the content type Prometheus expects for its text exposition format
    return Response(content=generate_latest(), media_type=CONTENT_TYPE_LATEST)
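
Once the API is running, a client only needs to POST a JSON body with four features to `/predict`. The sketch below uses the standard library only. The base URL `http://localhost:8000` is an assumption (adjust it to wherever the service is exposed), and `build_predict_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: list) -> urllib.request.Request:
    """Prepare a POST request for the /predict endpoint (hypothetical helper)."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local address; not taken from the course material
req = build_predict_request("http://localhost:8000", [5.1, 3.5, 1.4, 0.2])
print(req.get_method(), req.full_url)

# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(json.loads(resp.read()))
```

Separating request construction from sending keeps the example checkable without a live cluster.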

ConfigMaps and Secrets

NOTE (Objective): Learn how to externalize configuration and sensitive data for your ML applications using Kubernetes ConfigMaps and Secrets.

Learning objectives

TIP (At the end of this module): You will be able to master these essential skills.

1. Why externalize configuration?

TIP (Analogy): Imagine a chef who keeps recipes (code) separate from ingredients (configuration). Depending on the meal (environment), different ingredients are used with the same recipe. This is exactly the role of ConfigMaps.

In ML, your API needs parameters that vary by environment:

Development

Staging

Production

With ConfigMaps, you change configuration without rebuilding the Docker image.

2. ConfigMaps: storing configuration

2.1 Create a ConfigMap from literals

bash

# Create a ConfigMap with key-value pairs
kubectl create configmap ml-config \
  --from-literal=MODEL_NAME=iris_classifier \
  --from-literal=MODEL_VERSION=v2 \
  --from-literal=LOG_LEVEL=INFO \
  --from-literal=MAX_BATCH_SIZE=32

2.2 Create a ConfigMap from a file

First create a configuration file:

properties

# config.properties
model.name=iris_classifier
model.version=v2
model.threshold=0.85
api.port=5000
api.workers=4
log.level=INFO

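bash

# kubectl create fails if a ConfigMap named ml-config already exists:
# delete it first (kubectl delete configmap ml-config) or choose another name

# Create the ConfigMap from the file (the key is the file name, the value is the file content)
kubectl create configmap ml-config --from-file=config.properties

# Or create it from every file in a directory (one key per file)
kubectl create configmap ml-config --from-file=./config/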
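
With `--from-file`, the whole file ends up under a single key named after the file, so the application has to parse it. A minimal sketch of such a parser for the `config.properties` format shown above could look like this. `parse_properties` is a hypothetical helper, and it only handles simple `key=value` lines.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"Not a key=value line: {line!r}")
        props[key.strip()] = value.strip()
    return props


sample = """\
# config.properties
model.name=iris_classifier
model.version=v2
model.threshold=0.85
api.port=5000
api.workers=4
log.level=INFO
"""

config = parse_properties(sample)
# Values arrive as strings; convert them where a number is expected
threshold = float(config["model.threshold"])
port = int(config["api.port"])
print(config["model.name"], threshold, port)
```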

2.3 Declarative ConfigMap in YAML

yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: ml-config
  labels:
    app: ml-api
data:
  MODEL_NAME: "iris_classifier"
  MODEL_VERSION: "v2"
  LOG_LEVEL: "INFO"
  MAX_BATCH_SIZE: "32"
  FEATURE_COLUMNS: "sepal_length,sepal_width,petal_length,petal_width"
  config.yaml: |
    model:
      name: iris_classifier
      version: v2
      threshold: 0.85
    api:
      port: 5000
      workers: 4

NOTE (Multi-line key): The | symbol lets you include an entire file as a key value. Very useful for complete configuration files.
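
ConfigMap `data` values must be strings, which is why `"32"` is quoted in the manifest above. As a rough sketch, the same manifest can be generated from Python and emitted as JSON, which `kubectl apply -f` also accepts. `build_configmap` is a hypothetical helper name.

```python
import json


def build_configmap(name: str, data: dict, labels=None) -> dict:
    """Return a ConfigMap manifest as a dict, coercing every value to a string."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "labels": dict(labels or {})},
        "data": {key: str(value) for key, value in data.items()},
    }


manifest = build_configmap(
    "ml-config",
    {
        "MODEL_NAME": "iris_classifier",
        "MODEL_VERSION": "v2",
        "LOG_LEVEL": "INFO",
        "MAX_BATCH_SIZE": 32,  # an int here, converted to "32" in the manifest
    },
    labels={"app": "ml-api"},
)
print(json.dumps(manifest, indent=2))
```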

3. Using ConfigMaps in Pods

3.1 As environment variables

yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ml-api
  template:
    metadata:
      labels:
        app: ml-api
    spec:
      containers:
        - name: ml-api
          image: monregistry/ml-api:v1
          ports:
            - containerPort: 5000
          envFrom:
            - configMapRef:
                name: ml-config
          env:
            - name: SPECIFIC_KEY
              valueFrom:
                configMapKeyRef:
                  name: ml-config
                  key: MODEL_NAME

Method	Usage	Description
`envFrom`	All keys	Injects all keys from the ConfigMap as environment variables
`valueFrom`	Specific key	Injects a single key from the ConfigMap into a named variable
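
On the application side, keys injected with `envFrom` show up as ordinary environment variables. The sketch below shows one way to read them with fallback defaults. `Settings` and `load_settings` are hypothetical names, and the default values are assumptions chosen for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    model_name: str
    model_version: str
    log_level: str
    max_batch_size: int


def load_settings(env=None) -> Settings:
    """Read the ConfigMap keys from the environment, with fallback defaults."""
    env = os.environ if env is None else env
    return Settings(
        model_name=env.get("MODEL_NAME", "iris_classifier"),
        model_version=env.get("MODEL_VERSION", "v1"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        # Environment variables are always strings, so convert explicitly
        max_batch_size=int(env.get("MAX_BATCH_SIZE", "32")),
    )


# Simulate what the container would see with part of ml-config injected
settings = load_settings(
    {"MODEL_NAME": "iris_classifier", "MODEL_VERSION": "v2", "MAX_BATCH_SIZE": "32"}
)
print(settings)
```

Passing the environment as a parameter keeps the function testable without touching the real process environment.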

3.2 As a mounted volume

yaml

apiVersion: v1
kind: Pod
metadata:
  name: ml-pod-config
spec:
  containers:
    - name: ml-api
      image: monregistry/ml-api:v1
      volumeMounts:
        - name: config-volume
          mountPath: /app/config
          readOnly: true
  volumes:
    - name: config-volume
      configMap:
        name: ml-config
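
When a ConfigMap is mounted as a volume, each key becomes a file under the mount path, with the value as the file content. A minimal sketch of reading such a directory follows. `read_config_dir` is a hypothetical helper, and a temporary directory stands in for `/app/config` so the example runs outside a cluster.

```python
import tempfile
from pathlib import Path


def read_config_dir(mount_path) -> dict:
    """Map each regular file in the mounted directory to its text content."""
    root = Path(mount_path)
    return {
        entry.name: entry.read_text(encoding="utf-8")
        for entry in sorted(root.iterdir())
        # Skip hidden entries: Kubernetes keeps internal ones such as ..data there
        if entry.is_file() and not entry.name.startswith(".")
    }


# Simulate the mount locally; in the Pod the path would be /app/config
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "MODEL_NAME").write_text("iris_classifier", encoding="utf-8")
    Path(tmp, "LOG_LEVEL").write_text("INFO", encoding="utf-8")
    config = read_config_dir(tmp)

print(config)
```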

Anatomy of a Kubernetes YAML file

NOTE (Objective): Understand YAML syntax and the structure of Kubernetes manifests to describe your ML infrastructure resources declaratively.

Learning objectives

TIP (At the end of this module): You will be able to master these essential skills.

1. Introduction to YAML

YAML stands for “YAML Ain’t Markup Language”. It is a human-readable data serialization format widely used for configuration.

TIP (Analogy): Think of YAML as a recipe. Each recipe describes ingredients (key-value pairs), steps (lists) and sections (nested maps). Kubernetes reads this recipe to “cook” your infrastructure.

1.1 Basic YAML rules

Rule	Description	Example
Indentation	Only spaces (never tabs), usually 2 spaces	`key: value`
Key-value	Separated by `:` followed by a space	`name: my-pod`
Lists	Prefixed by a dash `-`	`- item1`
Comments	Start with `#`	`# This is a comment`
Strings	Quotes optional unless special characters	`name: "my:pod"`
Booleans	`true` / `false`	`enabled: true`

1.2 Key-value pairs

The simplest structure in YAML — a key associated with a value:

yaml

name: flask-ml-api
version: "1.0"
replicas: 3
debug: false

1.3 Lists (sequences)

Lists use a dash - for each item:

yaml

frameworks:
  - scikit-learn
  - tensorflow
  - pytorch
  - fastapi

1.4 Nested maps (dictionaries)

Maps let you create hierarchical structures:

yaml

server:
  host: 0.0.0.0
  port: 5000
  options:
    debug: true
    workers: 4

WARNING (Caution): Indentation is critical in YAML. A single-space error can break the entire file. Always use 2 spaces and never tabs.
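
Once parsed, YAML maps become dictionaries and sequences become lists. The sketch below writes the `server` map from section 1.4 as the Python structure a YAML parser would typically produce. It is written by hand, so no YAML library is needed, and `get_nested` is a hypothetical helper for reading nested values.

```python
from functools import reduce

# Hand-written equivalent of the YAML in section 1.4 (no parser involved)
config = {
    "server": {
        "host": "0.0.0.0",
        "port": 5000,
        "options": {"debug": True, "workers": 4},
    }
}


def get_nested(data: dict, dotted_key: str):
    """Follow a dotted path such as 'server.options.workers' through nested dicts."""
    return reduce(lambda node, key: node[key], dotted_key.split("."), data)


print(get_nested(config, "server.port"))
print(get_nested(config, "server.options.workers"))
```

Each level of indentation in the YAML corresponds to one level of nesting in the dictionary, which is why a misplaced space changes the structure and not just the layout.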

1.5 YAML data types

Strings

yaml

simple: hello
quotes: "world"
multi: |
  line 1
  line 2

Numbers

yaml

integer: 42
float: 3.14
scientific: 1e+6
octal: 0o14  # YAML 1.2 notation; YAML 1.1 parsers expect 014

Special

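yaml

# Keys are named explicitly: a bare key such as true or null would itself be parsed as a boolean or null
enabled: true
disabled: false
empty: null
date: 2026-03-05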

go-further

This article covers the most useful excerpts — the complete ML Infrastructure Kubernetes course (12 chapters, 41 lessons, corrected exercises and final project) takes you all the way.

FAQ

How long does it take to learn ML Infrastructure Kubernetes?

With a structured progression (12 chapters, 41 short and practical lessons), you reach an operational level in a few weeks at 30–60 minutes per day. The key is to practice each concept immediately.

Are there any prerequisites?

It helps to be comfortable with the fundamentals this article relies on (Python, the command line, and basic Docker usage), because the content goes in depth with real-world cases.

Where to start concretely?

Reproduce the commands in this article, then follow the full ML Infrastructure Kubernetes course: it sequences the 41 lessons in order, with exercises and a final project.
